The Switch MCP server costs 6,075 tokens before the first call.

Connect Switch and its 29 tool definitions are loaded into the model's context on every request — 3.0% of a 200k window spent before your agent does anything.

QUICK ANSWER: The Switch MCP server's tool definitions consume 6,075 tokens, 3.2× the median MCP server (1,905 tokens). A scoped grant exposing only the tools you use cuts that cost roughly in proportion to the number of tools dropped.

MEASURED FROM SCHEMAS: 29 tools · 6,075 tokens · 3.0% of 200k · 0.6% of 1M

What that buys before your agent starts working.

Tool definitions are overhead: they occupy context on every request and compete with your code, documents and conversation history for the same window.

Corpus context: Switch ranks #960 of 3,213 measured MCP servers by definition cost. The median is 1,905 tokens, p90 is 7,952, and the heaviest (FusionAuth) is 183,337, which alone fills 92% of a 200k window.
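The percentages and the median multiple quoted above are simple ratios. As a minimal sketch, using only the figures stated on this page (the function and constant names are illustrative, not part of any PolicyLayer tooling):

```python
# Figures taken from this page; names are illustrative.
DEFINITION_TOKENS = 6_075    # Switch, all 29 tool definitions
MEDIAN_TOKENS = 1_905        # median MCP server in the corpus
HEAVIEST_TOKENS = 183_337    # heaviest server in the corpus


def window_share(tokens: int, window: int) -> float:
    """Return the percentage of a context window occupied by `tokens`."""
    return 100 * tokens / window


print(f"{window_share(DEFINITION_TOKENS, 200_000):.1f}% of a 200k window")    # 3.0%
print(f"{window_share(DEFINITION_TOKENS, 1_000_000):.1f}% of a 1M window")    # 0.6%
print(f"{DEFINITION_TOKENS / MEDIAN_TOKENS:.1f}x the median server")          # 3.2x
print(f"{window_share(HEAVIEST_TOKENS, 200_000):.0f}% for the heaviest")      # 92%
```

The same function works for any other window size you want to budget against.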

Where the 6,075 tokens go.

Each row is one tool definition as a tools/list entry — name, description and input schema — counted with o200k_base. Average: 209 tokens per tool.

| Tool | Category | Tokens | % of server |
| --- | --- | --- | --- |
| generate_video | Write | 1,378 | 22.7% |
| generate_image | Write | 815 | 13.4% |
| lip_sync_video | Write | 479 | 7.9% |
| talking_avatar_video | Destructive | 255 | 4.2% |
| apply_movie_scene | Write | 204 | 3.4% |
| apply_travel | Write | 198 | 3.3% |
| apply_high_fashion_editorial | Write | 197 | 3.2% |
| voice | Destructive | 190 | 3.1% |
| apply_iphone_realism | Write | 185 | 3.0% |
| upload_media | Write | 168 | 2.8% |
| get_my_active_references | Read | 159 | 2.6% |
| apply_ugc | Write | 158 | 2.6% |
| apply_wellness | Write | 157 | 2.6% |
| apply_magic_hour_portrait | Write | 144 | 2.4% |
| apply_cinematic_anamorphic | Write | 143 | 2.4% |
| apply_product | Write | 140 | 2.3% |
| apply_graphic_editorial_portrait | Write | 123 | 2.0% |
| list_my_videos | Read | 110 | 1.8% |
| list_generations | Read | 108 | 1.8% |
| get_video_status | Read | 103 | 1.7% |
| show_media | Read | 100 | 1.6% |
| check_balance | Read | 79 | 1.3% |
| check_job_status | Read | 78 | 1.3% |
| list_video_models | Read | 75 | 1.2% |
| search_my_library | Read | 75 | 1.2% |
| show_generation | Read | 75 | 1.2% |
| list_my_assets | Read | 74 | 1.2% |
| explore_models | Read | 55 | 0.9% |
| list_my_folders | Read | 50 | 0.8% |
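Because the per-tool weights are so uneven, it can help to total them by category or for a specific set of tools rather than relying on the average. The sketch below copies the rows from the table above; the helper names and the example grant are illustrative choices, not anything the source prescribes.

```python
from collections import defaultdict

# (tool, category, tokens) copied from the table above.
TOOLS = [
    ("generate_video", "Write", 1378),
    ("generate_image", "Write", 815),
    ("lip_sync_video", "Write", 479),
    ("talking_avatar_video", "Destructive", 255),
    ("apply_movie_scene", "Write", 204),
    ("apply_travel", "Write", 198),
    ("apply_high_fashion_editorial", "Write", 197),
    ("voice", "Destructive", 190),
    ("apply_iphone_realism", "Write", 185),
    ("upload_media", "Write", 168),
    ("get_my_active_references", "Read", 159),
    ("apply_ugc", "Write", 158),
    ("apply_wellness", "Write", 157),
    ("apply_magic_hour_portrait", "Write", 144),
    ("apply_cinematic_anamorphic", "Write", 143),
    ("apply_product", "Write", 140),
    ("apply_graphic_editorial_portrait", "Write", 123),
    ("list_my_videos", "Read", 110),
    ("list_generations", "Read", 108),
    ("get_video_status", "Read", 103),
    ("show_media", "Read", 100),
    ("check_balance", "Read", 79),
    ("check_job_status", "Read", 78),
    ("list_video_models", "Read", 75),
    ("search_my_library", "Read", 75),
    ("show_generation", "Read", 75),
    ("list_my_assets", "Read", 74),
    ("explore_models", "Read", 55),
    ("list_my_folders", "Read", 50),
]


def tokens_by_category(tools):
    """Sum definition tokens per category."""
    totals = defaultdict(int)
    for _name, category, tokens in tools:
        totals[category] += tokens
    return dict(totals)


def grant_cost(tools, granted):
    """Sum the measured tokens of only the granted tool names."""
    return sum(tokens for name, _category, tokens in tools if name in granted)


print(sum(tokens for _, _, tokens in TOOLS))  # 6075
print(tokens_by_category(TOOLS))  # {'Write': 4489, 'Destructive': 445, 'Read': 1141}
# Hypothetical grant: one heavy write tool plus two light read tools.
print(grant_cost(TOOLS, {"generate_video", "get_video_status", "check_balance"}))  # 1560
```

The example grant shows why the mix matters: three tools that include generate_video cost 1,560 tokens, well above what three average-weight tools would.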

Most agents use a handful of these tools. They pay for all 29.

A PolicyLayer grant exposes only the tools you allow — ungranted definitions are filtered out of the tool list, so they never enter the context window. Estimates below assume typical-weight tools (209 tokens each).

| Grant scope | Definition cost | Reduction |
| --- | --- | --- |
| All 29 tools (no gateway) | 6,075 tokens | |
| 3 granted tools | ~628 tokens | −90% |
| 5 granted tools | ~1,047 tokens | −83% |
| 10 granted tools | ~2,095 tokens | −66% |
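The estimates in this table can be reproduced with a few lines of arithmetic. This is a sketch of the apparent calculation, assuming each granted tool costs the unrounded server average (6,075 / 29, about 209.5 tokens); the function name is illustrative.

```python
TOTAL_TOKENS = 6_075   # all Switch tool definitions
TOOL_COUNT = 29


def estimate_grant(granted_tools: int) -> tuple[int, int]:
    """Estimate (token cost, percent reduction) for a grant of average-weight tools."""
    average = TOTAL_TOKENS / TOOL_COUNT          # about 209.5 tokens per tool
    cost = granted_tools * average
    reduction = 100 * (1 - cost / TOTAL_TOKENS)
    return round(cost), round(reduction)


for n in (3, 5, 10):
    cost, reduction = estimate_grant(n)
    print(f"{n} granted tools: ~{cost:,} tokens, -{reduction}%")
# 3 granted tools: ~628 tokens, -90%
# 5 granted tools: ~1,047 tokens, -83%
# 10 granted tools: ~2,095 tokens, -66%
```

Real grants will land above or below these figures depending on which tools are included, since individual definitions range from 50 to 1,378 tokens.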

Switch token-cost questions.

How many tokens does the Switch MCP server use?

Its 29 tool definitions total 6,075 tokens — 3.0% of a 200k context window — measured with tiktoken o200k_base over the serialised tools/list payload. Exact counts vary slightly by client and model.

Why does Switch consume tokens before I send a message?

MCP clients load every connected server's tool definitions — name, description, and input schema — into the model's context so it knows what it can call. That payload is charged against your context window on every request, whether or not a tool is used.

How do I reduce Switch's token usage?

Expose fewer tools. A PolicyLayer grant scopes Switch to only the tools you allow — ungranted definitions are filtered out of the tool list, so they never enter the context window. A grant of 3 typical tools costs roughly 628 tokens, a 90% reduction.
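To make the filtering idea concrete, here is a minimal sketch of an allowlist applied to a tools/list result before it reaches the client. It is not PolicyLayer's implementation: the function name is hypothetical, and while the tool names come from the Switch table, the descriptions and schemas are placeholders.

```python
def filter_tools(tools_list_result: dict, granted: set) -> dict:
    """Return a copy of a tools/list result that keeps only granted tools."""
    kept = [
        tool
        for tool in tools_list_result.get("tools", [])
        if tool["name"] in granted
    ]
    # Copy the other top-level fields unchanged; the input is not mutated.
    return {**tools_list_result, "tools": kept}


# Placeholder definitions; only the names are taken from the Switch table.
upstream = {
    "tools": [
        {"name": "generate_video", "description": "placeholder", "inputSchema": {"type": "object"}},
        {"name": "get_video_status", "description": "placeholder", "inputSchema": {"type": "object"}},
        {"name": "voice", "description": "placeholder", "inputSchema": {"type": "object"}},
    ]
}

visible = filter_tools(upstream, {"generate_video", "get_video_status"})
print([tool["name"] for tool in visible["tools"]])
# ['generate_video', 'get_video_status']
```

Definitions dropped at this point are never serialised into the model's context, which is where the saving comes from.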

Does deferred tool loading fix this?

Partially, in some clients. Claude Code defers MCP tool schemas behind a tool-search step by default, and VS Code has experimental grouping — but you still pay tokens per search and reload, and Cursor, Windsurf and Gemini CLI load definitions upfront. Reducing the exposed tool set cuts the cost in every client.

How these numbers were measured.

01 Serialisation

Each tool is serialised as a tools/list entry — name, description, input schema — from the schemas in the PolicyLayer scan database. Clients differ slightly in framing, so treat counts as close estimates.

02 Tokeniser

tiktoken o200k_base (GPT-4o/o-series). Anthropic's current tokeniser isn't published, so Claude's exact counts will differ; for English text and JSON schemas the totals are close enough to treat these as estimates.

03 Deferred loading

Some clients now defer schema loading (Claude Code's tool search; VS Code experimental grouping). You still pay per search and reload — and Cursor, Windsurf and Gemini CLI load everything upfront.
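The serialisation and tokeniser steps above can be approximated as follows. This is a sketch under stated assumptions, not the exact PolicyLayer pipeline: the compact JSON framing and the helper names are illustrative, and only the `tiktoken.get_encoding("o200k_base")` call reflects the tokeniser named on this page. The encoder is passed in as a parameter so the counting logic does not depend on tiktoken being installed.

```python
import json
from typing import Callable, Dict, Iterable, List


def serialise_tool(tool: dict) -> str:
    """Serialise one tool as a tools/list entry: name, description, input schema.

    Assumption: plain JSON with these three keys. Real clients frame
    definitions slightly differently, so counts are estimates.
    """
    entry = {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "inputSchema": tool.get("inputSchema", {}),
    }
    return json.dumps(entry, ensure_ascii=False)


def count_tokens(
    tools: Iterable[dict], encode: Callable[[str], List]
) -> Dict[str, int]:
    """Map each tool name to the token count of its serialised definition."""
    return {tool["name"]: len(encode(serialise_tool(tool))) for tool in tools}


def o200k_encoder() -> Callable[[str], List]:
    """Return the o200k_base encode function (requires `pip install tiktoken`)."""
    import tiktoken  # imported lazily so the helpers above work without it

    return tiktoken.get_encoding("o200k_base").encode
```

With tiktoken installed, `count_tokens(tools, o200k_encoder())` returns per-tool counts comparable to the table above, and summing the values gives the server total. Results will only match this page's figures if the input schemas and framing match those in the scan database.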

Computed 07-06-2026 from the PolicyLayer scan database over all 29 catalogued Switch tools. Counts refresh with every site build.

Expose only the tools you use — the rest never enter your context.

A PolicyLayer grant scopes Switch to the tools you actually allow. Ungranted definitions never load, and every call that does run is checked against policy first.

Free to start. No card required.

4,600+ MCP servers and 31,000+ tools scanned and risk-classified.
