// MCP TOKEN COST

The Smallest MCP server costs 13,489 tokens before the first call.

Connect Smallest and its 70 tool definitions are loaded into the model's context on every request — 6.7% of a 200k window spent before your agent does anything.

QUICK ANSWER The Smallest MCP server's tool definitions consume 13,489 tokens, 7.1× the median MCP server (1,905 tokens). A scoped grant exposing only the tools you use cuts that cost roughly in proportion to the number of tools dropped.

MEASURED FROM SCHEMAS 70 tools · 13,489 tokens · 6.7% of 200k · 1.3% of 1M

// CONTEXT WINDOW SHARE

What that buys before your agent starts working.

Tool definitions are overhead: they occupy context on every request and compete with your code, documents and conversation history for the same window.

200K WINDOW 6.7%

1M WINDOW 1.3%
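The window shares above are plain ratios of definition cost to window size. A minimal sketch of that arithmetic, using only figures from this page (the function name `window_share` is illustrative):

```python
DEFINITION_TOKENS = 13_489  # Smallest's measured definition cost, from this page


def window_share(tokens: int, window: int) -> float:
    """Percentage of a context window occupied by `tokens`."""
    return 100 * tokens / window


for label, window in [("200k", 200_000), ("1M", 1_000_000)]:
    share = window_share(DEFINITION_TOKENS, window)
    remaining = window - DEFINITION_TOKENS
    # Prints the share to one decimal place and the tokens left for real work.
    print(f"{label}: {share:.1f}% used, {remaining:,} tokens left")
```

On a 200k window this leaves 186,511 tokens for code, documents and conversation history.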

Corpus context: Smallest ranks #111 of 3,213 measured MCP servers by definition cost. The median is 1,905 tokens, p90 is 7,952, and the heaviest (FusionAuth) is 183,337, which is 92% of a 200k window on its own.
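To put the corpus figures side by side, the comparison can be sketched as a few ratios. All inputs are the numbers quoted above; the helper names are hypothetical.

```python
SMALLEST = 13_489
MEDIAN, P90, HEAVIEST = 1_905, 7_952, 183_337
RANK, CORPUS_SIZE = 111, 3_213


def multiple_of(tokens: int, baseline: int) -> float:
    """How many times larger `tokens` is than `baseline`."""
    return tokens / baseline


def top_percent(rank: int, total: int) -> float:
    """Position in the corpus as a percentage (smaller means heavier)."""
    return 100 * rank / total


print(f"{multiple_of(SMALLEST, MEDIAN):.1f}x the median")
print(f"{multiple_of(SMALLEST, P90):.1f}x the p90")
print(f"top {top_percent(RANK, CORPUS_SIZE):.1f}% of servers by definition cost")
print(f"heaviest server: {100 * HEAVIEST / 200_000:.0f}% of a 200k window")
```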

// PER-TOOL BREAKDOWN

Where the 13,489 tokens go.

Each row is one tool definition as a tools/list entry — name, description and input schema — counted with o200k_base. Average: 193 tokens per tool.

Tool	Category	Tokens	% of server
update_agent_config	Write	1,462	10.8%
create_agent	Write	1,022	7.6%
text_to_speech	Write	403	3.0%
transcribe_audio	Read	391	2.9%
list_calls	Read	369	2.7%
create_campaign	Write	307	2.3%
make_call	Read	283	2.1%
get_credit_ledger	Read	266	2.0%
get_agents	Read	244	1.8%
get_usage_stats	Read	238	1.8%
get_dashboard	Read	226	1.7%
get_attempt_cohorts	Read	225	1.7%
get_campaigns	Read	225	1.7%
get_hourly_performance	Read	224	1.7%
get_agent_performance	Read	221	1.6%
get_pickup_rates	Read	221	1.6%
get_duration_stats	Read	217	1.6%
get_weekly_trends	Read	217	1.6%
compare_version_metrics	Read	216	1.6%
get_call_volume	Read	216	1.6%
get_call_outcomes	Read	215	1.6%
get_phone_number_trends	Read	215	1.6%
publish_draft	Write	213	1.6%
get_call_counts_by_day	Read	212	1.6%
update_version	Write	203	1.5%
test_version	Read	195	1.4%
list_versions	Read	189	1.4%
test_draft	Read	189	1.4%
add_audience_members	Write	177	1.3%
update_billing_alerts	Write	177	1.3%
search_audience_members	Read	167	1.2%
debug_call	Read	166	1.2%
get_voices	Read	161	1.2%
update_agent_prompt	Write	159	1.2%
get_draft	Read	158	1.2%
invite_member	Write	151	1.1%
get_credit_usage	Read	149	1.1%
get_draft_diff	Read	143	1.1%
diff_versions	Read	141	1.0%
get_audience_members	Read	140	1.0%
duplicate_agent	Read	134	1.0%
rename_draft	Write	134	1.0%
delete_audience_members	Destructive	132	1.0%
export_campaign_logs	Write	126	0.9%
get_call_start_distribution	Read	123	0.9%
get_daily_summary	Read	123	0.9%
activate_version	Write	123	0.9%
get_concurrency	Read	119	0.9%
get_version	Read	116	0.9%
list_drafts	Read	102	0.8%
delete_agent	Destructive	100	0.7%
start_campaign	Execute	99	0.7%
get_plans	Read	88	0.7%
get_agent	Read	87	0.6%
validate_coupon	Read	87	0.6%
delete_audience	Destructive	86	0.6%
get_campaign	Read	86	0.6%
pause_campaign	Read	85	0.6%
get_agent_prompt	Read	83	0.6%
redeem_coupon	Write	83	0.6%
get_audience	Read	80	0.6%
delete_campaign	Destructive	73	0.5%
get_audiences	Read	73	0.5%
get_usage_breakdown	Read	71	0.5%
get_phone_numbers	Read	70	0.5%
get_payment_methods	Read	65	0.5%
get_auto_reload	Read	61	0.5%
get_billing_alerts	Read	59	0.4%
get_invoices	Read	56	0.4%
get_credit_balance	Read	52	0.4%
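The "% of server" column is each tool's token count divided by the 13,489-token total. A short sketch using the five heaviest rows from the table shows how concentrated the cost is (the helper names are illustrative):

```python
SERVER_TOTAL = 13_489

# The five heaviest rows, copied from the table above.
top_rows = [
    ("update_agent_config", 1_462),
    ("create_agent", 1_022),
    ("text_to_speech", 403),
    ("transcribe_audio", 391),
    ("list_calls", 369),
]


def share_of_server(tokens: int, total: int = SERVER_TOTAL) -> float:
    """Percentage of the server's definition cost taken by `tokens`."""
    return 100 * tokens / total


def cumulative_share(rows: list[tuple[str, int]], k: int) -> float:
    """Combined share of the first `k` rows."""
    return share_of_server(sum(tokens for _, tokens in rows[:k]))


print(f"top 2 tools: {cumulative_share(top_rows, 2):.1f}% of the server")
print(f"top 5 tools: {cumulative_share(top_rows, 5):.1f}% of the server")
```

The two agent-configuration tools alone account for about 18.4% of the total, and the top five for about 27.0%.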

// SCOPED GRANT

Most agents use a handful of these tools. They pay for all 70.

A PolicyLayer grant exposes only the tools you allow: ungranted definitions are filtered out of the tool list, so they never enter the context window. The estimates below assume each granted tool carries the server-average weight (13,489 ÷ 70 ≈ 192.7 tokens, quoted elsewhere as 193).

Grant scope	Definition cost	Reduction
All 70 tools (no gateway)	13,489 tokens	—
3 granted tools	~578 tokens	−96%
5 granted tools	~964 tokens	−93%
10 granted tools	~1,927 tokens	−86%
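The rows above follow from the server average. A minimal sketch that reproduces them, assuming every granted tool weighs the average (the function name `grant_estimate` is hypothetical):

```python
TOTAL_TOKENS = 13_489
TOOL_COUNT = 70


def grant_estimate(granted: int) -> tuple[int, int]:
    """Estimated definition cost and percentage reduction for `granted` average-weight tools."""
    cost = round(granted * TOTAL_TOKENS / TOOL_COUNT)
    reduction = round(100 * (1 - cost / TOTAL_TOKENS))
    return cost, reduction


for n in (3, 5, 10):
    cost, reduction = grant_estimate(n)
    print(f"{n} granted tools: ~{cost:,} tokens, -{reduction}%")
```

Under this assumption the reduction depends only on the fraction of tools kept (1 − n/70). A real grant will land above or below the estimate depending on which tools it includes.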

Model your own stack in the token-cost calculator, or see the Smallest policy for what a sensible grant looks like.

// FAQ

Smallest token-cost questions.

How many tokens does the Smallest MCP server use?

Its 70 tool definitions total 13,489 tokens — 6.7% of a 200k context window — measured with tiktoken o200k_base over the serialised tools/list payload. Exact counts vary slightly by client and model.

Why does Smallest consume tokens before I send a message?

MCP clients load every connected server's tool definitions — name, description, and input schema — into the model's context so it knows what it can call. That payload is charged against your context window on every request, whether or not a tool is used.
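Because the payload rides along with every request, the overhead accumulates over a session. A rough sketch, assuming a client that sends the full definitions each time with no deferred loading (the 20-request session is a made-up example):

```python
DEFINITION_TOKENS = 13_489  # Smallest's definition cost, from this page


def session_overhead(requests: int, definition_tokens: int = DEFINITION_TOKENS) -> int:
    """Total definition tokens placed in context across `requests` requests."""
    return requests * definition_tokens


# Hypothetical session length; substitute your own agent's request count.
print(f"{session_overhead(20):,} definition tokens over 20 requests")
```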

How do I reduce Smallest's token usage?

Expose fewer tools. A PolicyLayer grant scopes Smallest to only the tools you allow — ungranted definitions are filtered out of the tool list, so they never enter the context window. A grant of 3 typical tools costs roughly 578 tokens, a 96% reduction.
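Conceptually, scoping is an allowlist filter over the tool list. The sketch below is not PolicyLayer's implementation. It is a hypothetical `apply_grant` helper over a simplified record shape, using token counts taken from the per-tool table:

```python
# Simplified records: tool name plus its measured definition cost from the table.
tools = [
    {"name": "update_agent_config", "tokens": 1_462},
    {"name": "list_calls", "tokens": 369},
    {"name": "debug_call", "tokens": 166},
    {"name": "delete_agent", "tokens": 100},
    {"name": "get_agent", "tokens": 87},
]


def apply_grant(tools: list[dict], granted: set[str]) -> list[dict]:
    """Keep only tools whose names appear in the grant; everything else is dropped."""
    return [tool for tool in tools if tool["name"] in granted]


granted = {"list_calls", "get_agent", "debug_call"}
exposed = apply_grant(tools, granted)
cost = sum(tool["tokens"] for tool in exposed)
print([tool["name"] for tool in exposed], cost)
```

This particular three-tool grant costs 622 tokens rather than the average-based 578, which shows that the actual saving depends on which tools are granted.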

Does deferred tool loading fix this?

Partially, in some clients. Claude Code defers MCP tool schemas behind a tool-search step by default, and VS Code has experimental grouping — but you still pay tokens per search and reload, and Cursor, Windsurf and Gemini CLI load definitions upfront. Reducing the exposed tool set cuts the cost in every client.

// METHOD & REVIEW

How these numbers were measured.

01 Serialisation

Each tool is serialised as a tools/list entry — name, description, input schema — from the schemas in the PolicyLayer scan database. Clients differ slightly in framing, so treat counts as close estimates.

02 Tokeniser

tiktoken o200k_base (GPT-4o/o-series). Anthropic's current tokeniser isn't published, so exact counts for Claude will differ. For English text and JSON schemas the totals are close enough to serve as estimates.
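A rough sketch of how a per-tool count could be reproduced from the two steps above. It assumes compact JSON framing and the MCP field name `inputSchema`; the exact framing behind the published numbers is not specified here. The helper names and the example tool are hypothetical.

```python
import json
from typing import Callable


def serialise_tool(tool: dict) -> str:
    """Serialise one tool as a tools/list-style entry: name, description, input schema."""
    entry = {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "inputSchema": tool.get("inputSchema", {}),
    }
    # Compact separators are an assumption; clients may frame the payload differently.
    return json.dumps(entry, separators=(",", ":"))


def o200k_counter() -> Callable[[str], int]:
    """Token counter backed by tiktoken's o200k_base (needs `pip install tiktoken`)."""
    import tiktoken  # imported lazily so the rest of the sketch runs without it

    encoding = tiktoken.get_encoding("o200k_base")
    return lambda text: len(encoding.encode(text))


def per_tool_costs(tools: list[dict], count_tokens: Callable[[str], int]) -> dict[str, int]:
    """Map each tool name to the cost of its serialised definition."""
    return {tool["name"]: count_tokens(serialise_tool(tool)) for tool in tools}


# Hypothetical tool definition, not taken from the Smallest server.
example_tools = [
    {
        "name": "example_tool",
        "description": "Return a greeting for the given name.",
        "inputSchema": {"type": "object", "properties": {"name": {"type": "string"}}},
    }
]

# `len` counts characters and is only a stand-in so the sketch runs anywhere;
# pass o200k_counter() instead to get o200k_base token counts.
print(per_tool_costs(example_tools, count_tokens=len))
```

Passing the counter as a parameter keeps the serialisation step independent of the tokeniser, so another tokeniser can be swapped in to compare counts.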

03 Deferred loading

Some clients now defer schema loading (Claude Code's tool search; VS Code experimental grouping). You still pay per search and reload — and Cursor, Windsurf and Gemini CLI load everything upfront.

Computed 07-06-2026 from the PolicyLayer scan database over all 70 catalogued Smallest tools. Counts refresh with every site build.