What is MCP Token Cost?
MCP token cost is the context-window overhead incurred by connecting MCP servers: every connected server's tool definitions — names, descriptions, and input schemas — are injected into the model's prompt, consuming tokens on every request whether or not the tools are used.
WHY IT MATTERS
For the model to know what it can call, the client serialises each server's tools/list output into the prompt. A single verbose tool — long description, large JSON schema — can cost hundreds of tokens; a server exposing dozens of tools can cost thousands; a typical multi-server setup can consume a five-figure token count before the user types anything. That overhead recurs on every request in the session.
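As a rough illustration of where those tokens come from, the sketch below estimates the size of a `tools/list` result. The `name`, `description`, and `inputSchema` fields follow the MCP tool shape; the four-characters-per-token ratio and the sample `create_issue` tool are assumptions for illustration, and an accurate count needs the target model's own tokenizer.

```python
import json

# Assumption: roughly 4 characters per token. Real tokenizers vary by model.
CHARS_PER_TOKEN = 4


def estimate_tool_tokens(tool: dict) -> int:
    """Estimate the prompt tokens taken by one serialised tool definition."""
    serialised = json.dumps(
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        },
        separators=(",", ":"),  # compact form; clients may serialise differently
    )
    return -(-len(serialised) // CHARS_PER_TOKEN)  # ceiling division


def estimate_server_tokens(tools_list_result: dict) -> int:
    """Sum the estimates for every tool in a tools/list result."""
    return sum(estimate_tool_tokens(t) for t in tools_list_result.get("tools", []))


# Hypothetical tools/list result with a single tool.
sample_result = {
    "tools": [
        {
            "name": "create_issue",
            "description": "Create an issue in a repository.",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "body": {"type": "string"},
                },
                "required": ["title"],
            },
        }
    ]
}

print(estimate_server_tokens(sample_result))
```

Under this heuristic, every 400 characters of description or schema add about 100 tokens to each request, which is how a handful of verbose tools reaches the thousands.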
The costs compound in three ways. Money: those tokens are billed as input on every call. Latency: larger prompts take longer to process. Quality: tool definitions compete with actual work for context window space, and models pick tools less reliably as the catalogue grows — the tool sprawl problem measured in tokens.
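The money component is simple arithmetic, sketched below. The 14,406-token figure is the GitHub row from the catalogue table; the request count and the price per million input tokens are hypothetical values chosen only to make the calculation concrete.

```python
def session_overhead_cost(
    definition_tokens: int,
    requests: int,
    usd_per_million_input_tokens: float,
) -> float:
    """Cost in USD of re-sending tool definitions on every request."""
    total_tokens = definition_tokens * requests
    return total_tokens * usd_per_million_input_tokens / 1_000_000


# 14,406 tokens per request is the GitHub figure from the catalogue table.
# The request count (50) and the price ($3 per million) are hypothetical.
cost = session_overhead_cost(14_406, requests=50, usd_per_million_input_tokens=3.0)
print(f"${cost:.2f}")
```

With these assumed inputs, 50 requests carry 720,300 tokens of tool definitions, or about $2.16, before any actual work is billed.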
Mitigation options, roughly in order of effort:
- Connect fewer servers — disconnect servers a project does not use; the cheapest token is one never injected.
- Tool filtering — expose only the subset of a server's tools you actually call, where the client or an intermediary supports it.
- Virtual servers — aggregate many upstreams behind one curated facade exposing a minimal tool set (see MCP virtual server).
- Measurement — count the serialised definition size per server so trimming decisions are data-driven rather than guesswork.
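Tool filtering and measurement can be combined in a few lines. The sketch below is not PolicyLayer's implementation or any client's API: the server names, tool definitions, and allowlist are hypothetical, and size is counted in serialised characters as a stand-in for tokens.

```python
import json


def definition_size(tools: list[dict]) -> int:
    """Characters in the compact JSON serialisation of a list of tools."""
    return len(json.dumps(tools, separators=(",", ":")))


def filter_tools(tools: list[dict], allowed: set[str]) -> list[dict]:
    """Keep only the tools whose names appear in the allowlist."""
    return [t for t in tools if t["name"] in allowed]


# Hypothetical tool definitions keyed by server name.
servers = {
    "issues": [
        {"name": "create_issue", "description": "Create an issue.", "inputSchema": {"type": "object"}},
        {"name": "delete_issue", "description": "Delete an issue.", "inputSchema": {"type": "object"}},
    ],
    "files": [
        {"name": "read_file", "description": "Read a file.", "inputSchema": {"type": "object"}},
    ],
}
allowed = {"create_issue", "read_file"}

# Report the per-server size before and after filtering.
for server, tools in servers.items():
    kept = filter_tools(tools, allowed)
    before, after = definition_size(tools), definition_size(kept)
    print(f"{server}: {len(tools)} -> {len(kept)} tools, {before} -> {after} chars")
```

Printing the before and after sizes per server shows which trimming decisions are worth making.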
HOW POLICYLAYER USES THIS
PolicyLayer publishes per-server token-cost data at policylayer.com/token-cost, measured from each scanned server's actual tool definitions, so you can see what a server costs your context window before connecting it. The gateway's virtual server support lets teams expose a filtered tool subset from registered upstreams, so the injected definitions shrink to only the tools that policy permits.
IN THE CATALOGUE
The figures below come from measurements across 3,105 MCP servers (56,764 tools). Connecting a server loads its full tool definitions into the context window on every request.
| Server | Tool definitions | Tokens per request |
|---|---|---|
| GitHub | 86 | 14,406 |
| Linear | 66 | 7,149 |
| Supabase | 29 | 2,561 |
| Filesystem | 14 | 1,642 |
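
The rows above can be reduced to an average cost per tool and a combined total. The short sketch below only does arithmetic on the table's own figures; it adds no new measurements.

```python
# (server, tool definitions, tokens per request), copied from the table above.
rows = [
    ("GitHub", 86, 14_406),
    ("Linear", 66, 7_149),
    ("Supabase", 29, 2_561),
    ("Filesystem", 14, 1_642),
]

for server, tools, tokens in rows:
    print(f"{server}: {tokens / tools:.1f} tokens per tool")

total_tokens = sum(tokens for _, _, tokens in rows)
print(f"All four connected: {total_tokens:,} tokens per request")
```

The averages range from about 88 tokens per tool (Supabase) to about 168 (GitHub), and connecting all four servers adds 25,758 tokens to every request.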