Every connected server's tool definitions — names, descriptions, JSON schemas — are loaded into the model's context on every request. GitHub + Slack + Linear + Supabase together: 65,304 tokens, 33% of a 200k window, before the first message.
Pick servers from the catalogue or paste your mcpServers config. Untick the tools you don't use to see what a scoped grant saves.
Every measured server has its own breakdown — headline cost, per-tool table, and what a scoped grant saves. The popular names and the heaviest of the catalogue:
1,659 servers measured in total — search the full set in the calculator above, or browse the tool catalogue.
An MCP client sends each connected server's tools/list — name, description, full JSON input schema per tool — to the model so it knows what it can call. Connecting a server means paying for all of it.
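To make the payload concrete, here is a minimal sketch of one tools/list entry and how it serialises. The tool itself (`create_issue` and its fields) is hypothetical, invented for illustration; only the `name` / `description` / `inputSchema` shape comes from the text above.

```python
import json

# Hypothetical tool entry, shaped like one item in a tools/list result.
tool = {
    "name": "create_issue",
    "description": "Create an issue in a repository.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "repo": {"type": "string", "description": "owner/name"},
            "title": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["repo", "title"],
    },
}

# A tools/list result carries a list of such entries, one per tool.
tools_list_result = {"tools": [tool]}

# Everything in this string is text the model has to read.
serialised = json.dumps(tools_list_result, separators=(",", ":"))
print(len(serialised), "characters of definition text for one tool")
```

A server with dozens of tools repeats this structure dozens of times, which is where the token cost comes from.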
Stacks accumulate: a few popular servers reach tens of thousands of tokens of definitions. That space competes directly with your code, documents and conversation history.
Agents typically call a handful of tools per session, but pay for every definition on every request. The fix is structural: expose only the tools you actually grant.
Deferral trades upfront cost for per-use cost: tool searches and schema reloads are themselves charged against the window. And several major clients don't defer at all.
| Client | Tool definition loading |
|---|---|
| Claude Code | Defers MCP schemas behind tool search by default — you pay per search and per reload |
| VS Code (Copilot) | Experimental tool grouping — partial deferral, off by default |
| Cursor | Loads all tool definitions upfront, every request |
| Windsurf | Loads all tool definitions upfront, every request |
| Gemini CLI | Loads all tool definitions upfront, every request |
Client behaviour verified 04-06-2026. A scoped grant cuts the cost either way: fewer definitions to load upfront, fewer to search and reload when deferred.
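The trade-off between upfront loading, deferral and a scoped grant can be sketched as a toy cost model. Every number below is hypothetical, and the model assumes one search per tool used with a fixed overhead; real clients may charge differently, so treat it as an illustration of the shape of the trade-off rather than a description of any client.

```python
def upfront_cost(definition_tokens):
    """Tokens charged per request when every definition is loaded upfront."""
    return sum(definition_tokens.values())


def deferred_cost(definition_tokens, tools_used, search_overhead):
    """Tokens charged when only searched-for tools are loaded.

    Assumes one search per tool used, each costing `search_overhead` tokens,
    plus the definition of each tool that ends up loaded.
    """
    loaded = sum(definition_tokens[name] for name in tools_used)
    return len(tools_used) * search_overhead + loaded


# Hypothetical per-tool definition sizes, in tokens.
definitions = {
    "create_issue": 400,
    "list_issues": 350,
    "merge_pr": 500,
    "delete_repo": 300,
}

full = upfront_cost(definitions)
deferred = deferred_cost(definitions, ["create_issue"], search_overhead=200)

# A scoped grant shrinks the set before either strategy applies.
granted = {name: definitions[name] for name in ("create_issue", "list_issues")}
scoped = upfront_cost(granted)

print(full, deferred, scoped)
```

In this toy setup, deferral wins when few tools are used and loses ground as searches multiply, while scoping lowers the ceiling in both cases.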
Across 1,659 servers with (near-)complete schema coverage in the PolicyLayer scan database, definition cost is heavily long-tailed.
The heaviest measured server (FusionAuth) consumes 92% of a 200k context window with tool definitions alone. Popular productivity servers cluster well above the median: GitHub + Slack + Linear + Supabase average 16,326 tokens each.
It depends entirely on the servers. Across 1,659 measured servers the median is 1,075 tokens of tool definitions, but the heavy hitters dominate: GitHub + Slack + Linear + Supabase together consume 65,304 tokens — 33% of a 200k window — before the first message.
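The headline percentage and the per-server average follow directly from the quoted total. A short check, using only figures stated on this page:

```python
CONTEXT_WINDOW = 200_000  # tokens in a 200k window
stack_tokens = 65_304     # GitHub + Slack + Linear + Supabase combined
servers = 4

share = stack_tokens / CONTEXT_WINDOW  # fraction of the window consumed
average = stack_tokens / servers       # mean definition cost per server

print(f"{share:.0%} of the window, {average:,.0f} tokens per server")
```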
MCP clients send every connected server's tool definitions — name, description, and JSON input schema — to the model so it knows what it can call. That payload counts against the context window on every request, whether or not any tool is used.
It helps in clients that support it, but it is not free: each tool search and schema reload costs tokens, and definitions still enter context once loaded. Cursor, Windsurf and Gemini CLI load all definitions upfront. Cutting the exposed tool set reduces cost in every client.
Expose fewer tools. Routing servers through a PolicyLayer grant means only the tools you explicitly allow are visible to the client — ungranted definitions never enter the context window, and every call that does run is policy-checked.
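The idea of a scoped tool set can be illustrated with a plain allow-list filter. This is a sketch of the concept only, not PolicyLayer's implementation; the function name `scope_tools`, the tool names and the allow-list are all hypothetical.

```python
def scope_tools(tools, allowed):
    """Keep only the tool definitions whose names are explicitly allowed."""
    return [tool for tool in tools if tool["name"] in allowed]


# Hypothetical server exposing four tools; schemas omitted for brevity.
server_tools = [
    {"name": "create_issue", "description": "Create an issue."},
    {"name": "list_issues", "description": "List issues."},
    {"name": "merge_pr", "description": "Merge a pull request."},
    {"name": "delete_repo", "description": "Delete a repository."},
]

granted = {"create_issue", "list_issues"}  # hypothetical allow-list
visible = scope_tools(server_tools, granted)

print([tool["name"] for tool in visible])
```

Only the definitions in `visible` would reach the client, so the other two never cost any tokens.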
Each tool is serialised the way a tools/list response carries it ({name, description, inputSchema}) using schemas from the PolicyLayer scan database, then counted with tiktoken o200k_base. Clients vary slightly in serialisation, so treat counts as close estimates rather than exact invoices.
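A minimal sketch of that measurement, assuming each tool is a dict with the three fields named above. The tokeniser is passed in as a function so the example runs without extra packages; `whitespace_encode` is a crude stand-in and does not reproduce o200k_base counts.

```python
import json


def count_definition_tokens(tools, encode):
    """Sum token counts over tools serialised as {name, description, inputSchema}."""
    total = 0
    for tool in tools:
        entry = {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        }
        total += len(encode(json.dumps(entry)))
    return total


def whitespace_encode(text):
    # Stand-in tokeniser for illustration only; NOT o200k_base.
    return text.split()


# Hypothetical single-tool server.
tools = [{"name": "ping", "description": "Check the server is up.", "inputSchema": {}}]
print(count_definition_tokens(tools, whitespace_encode))
```

With tiktoken installed, passing `tiktoken.get_encoding("o200k_base").encode` as `encode` gives counts in the encoding named above. The exact JSON framing used here is an assumption, so results may differ slightly from the published figures.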
Each tool is serialised as a tools/list entry — name, description, input schema — from the schemas in the PolicyLayer scan database. Clients differ slightly in framing, so treat counts as close estimates.
Counts use tiktoken's o200k_base encoding (GPT-4o/o-series). Anthropic's current tokeniser isn't published, so Claude's exact counts will differ; for English text and JSON schemas the totals are close enough to serve as estimates.
Only servers with (near-)complete schema coverage are measured — 1,659 of the catalogue. Partial coverage is disclosed per page rather than estimated away.
Computed 05-06-2026 from the PolicyLayer scan database. Counts refresh with every site build. Sources: the MCP specification (tools/list), tiktoken, and our State of MCP research.
A PolicyLayer grant exposes only the tools you allow. Ungranted definitions never enter your context window, and every call that does run is checked against policy first.
Free to start. No card required.
4,600+ MCP servers and 31,000+ tools scanned and risk-classified.