
The Alexandria MCP server costs 3,573 tokens before the first call.

Connect Alexandria and its 20 tool definitions are loaded into the model's context on every request — 1.8% of a 200k window spent before your agent does anything.

QUICK ANSWER The Alexandria MCP server's tool definitions consume 3,573 tokens, nearly twice the median MCP server (1,905 tokens) but well below the 90th percentile (7,952). A scoped grant exposing only the tools you use cuts that roughly in proportion.

MEASURED FROM SCHEMAS 20 tools · 3,573 tokens · 1.8% of 200k · 0.4% of 1M

What that buys before your agent starts working.

Tool definitions are overhead: they occupy context on every request and compete with your code, documents and conversation history for the same window.

200K WINDOW 1.8%
1M WINDOW 0.4%

Corpus context: Alexandria ranks #1215 of 3,213 measured MCP servers by definition cost. The median is 1,905 tokens, p90 is 7,952, and the heaviest (Fusionauth) is 183,337 — 92% of a 200k window on its own.
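The window percentages quoted above are plain ratios of definition tokens to window size. A minimal sketch of that arithmetic, using only figures from this page (the function name `window_share` is illustrative, not part of any tool):

```python
def window_share(definition_tokens: int, window_tokens: int) -> float:
    """Percentage of a context window occupied by tool definitions."""
    return 100 * definition_tokens / window_tokens


ALEXANDRIA_TOKENS = 3_573   # total from this page
HEAVIEST_TOKENS = 183_337   # Fusionauth, the heaviest measured server

print(f"{window_share(ALEXANDRIA_TOKENS, 200_000):.1f}% of a 200k window")   # 1.8%
print(f"{window_share(ALEXANDRIA_TOKENS, 1_000_000):.1f}% of a 1M window")    # 0.4%
print(f"{window_share(HEAVIEST_TOKENS, 200_000):.0f}% of a 200k window")      # 92%
```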

Where the 3,573 tokens go.

Each row is one tool definition as a tools/list entry — name, description and input schema — counted with o200k_base. Average: 179 tokens per tool.

Tool Category Tokens % of server
alexandria_ask Read 331 9.3%
alexandria_debate Execute 294 8.2%
alexandria_run_agent Execute 286 8.0%
alexandria_outcome Read 268 7.5%
alexandria_generate_image Write 259 7.2%
alexandria_council Execute 246 6.9%
alexandria_grade Read 196 5.5%
alexandria_operator Execute 190 5.3%
alexandria_premortem Execute 189 5.3%
alexandria_spend Read 182 5.1%
alexandria_status Read 180 5.0%
alexandria_recommend_learnings Read 167 4.7%
alexandria_audit Read 129 3.6%
alexandria_capabilities Read 128 3.6%
alexandria_learn Read 109 3.1%
alexandria_tournament Read 107 3.0%
alexandria_doctor Execute 96 2.7%
alexandria_list_agents Read 94 2.6%
alexandria_get_tools Read 92 2.6%
alexandria_health Read 30 0.8%
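A rough sketch of how a per-tool table like this one could be produced. It assumes each tool is serialised as compact JSON with the `name`, `description` and `inputSchema` fields of a tools/list entry; the exact framing used for the figures above is not published here, so counts from this sketch may differ slightly. The tokeniser is passed in as a function so the example runs without extra dependencies; the demo tool and the byte-level stand-in encoder are hypothetical.

```python
import json
from typing import Callable, Dict, List, Sequence


def definition_costs(
    tools: List[dict], encode: Callable[[str], Sequence]
) -> Dict[str, int]:
    """Token count per tool, given any encode(text) -> sequence of tokens."""
    costs = {}
    for tool in tools:
        # Assumed serialisation: compact JSON of the three tools/list fields.
        entry = {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        }
        payload = json.dumps(entry, separators=(",", ":"))
        costs[tool["name"]] = len(encode(payload))
    return costs


def percent_of_server(costs: Dict[str, int]) -> Dict[str, float]:
    """Each tool's share of the server total, rounded to one decimal."""
    total = sum(costs.values())
    return {name: round(100 * n / total, 1) for name, n in costs.items()}


# Hypothetical tool definition, for illustration only.
demo_tools = [
    {
        "name": "example_health",
        "description": "Report service health.",
        "inputSchema": {"type": "object", "properties": {}},
    }
]

# Stand-in encoder (one "token" per UTF-8 byte) so the sketch runs anywhere.
# With tiktoken installed, pass tiktoken.get_encoding("o200k_base").encode instead.
print(definition_costs(demo_tools, lambda text: text.encode("utf-8")))
```

Applied to the table's own figures, `percent_of_server({"alexandria_ask": 331, "others": 3242})` gives 9.3 for `alexandria_ask`, matching the first row.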

Most agents use a handful of these tools. They pay for all 20.

A PolicyLayer grant exposes only the tools you allow — ungranted definitions are filtered out of the tool list, so they never enter the context window. Estimates below assume typical-weight tools (179 tokens each).

Grant scope Definition cost Reduction
All 20 tools (no gateway) 3,573 tokens
3 granted tools ~536 tokens −85%
5 granted tools ~893 tokens −75%
10 granted tools ~1,787 tokens −50%
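The estimates in the table follow from the average weight per tool. A short sketch of that calculation, alongside an exact cost for one hypothetical grant built from the per-tool figures listed earlier (the choice of three tools is an assumption for illustration; a real grant's cost depends on which tools it includes):

```python
import math

TOTAL_TOKENS = 3_573
TOOL_COUNT = 20


def estimate_grant(granted: int) -> tuple:
    """Estimated (cost, reduction %) for a grant of `granted` average-weight tools."""
    # Round half up, which reproduces the figures in the table.
    cost = math.floor(granted * TOTAL_TOKENS / TOOL_COUNT + 0.5)
    reduction = round(100 * (1 - cost / TOTAL_TOKENS))
    return cost, reduction


# Per-tool counts copied from the table earlier on this page.
TABLE_TOKENS = {
    "alexandria_ask": 331,
    "alexandria_status": 180,
    "alexandria_health": 30,
}


def exact_grant(names: list) -> tuple:
    """Exact (cost, reduction %) for a grant of specific tools from TABLE_TOKENS."""
    cost = sum(TABLE_TOKENS[name] for name in names)
    reduction = round(100 * (1 - cost / TOTAL_TOKENS))
    return cost, reduction


for n in (3, 5, 10):
    print(n, estimate_grant(n))   # (536, 85), (893, 75), (1787, 50)

print(exact_grant(["alexandria_ask", "alexandria_status", "alexandria_health"]))  # (541, 85)
```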

Alexandria token-cost questions.

How many tokens does the Alexandria MCP server use?

Its 20 tool definitions total 3,573 tokens — 1.8% of a 200k context window — measured with tiktoken o200k_base over the serialised tools/list payload. Exact counts vary slightly by client and model.

Why does Alexandria consume tokens before I send a message?

MCP clients load every connected server's tool definitions — name, description, and input schema — into the model's context so it knows what it can call. That payload is charged against your context window on every request, whether or not a tool is used.

How do I reduce Alexandria's token usage?

Expose fewer tools. A PolicyLayer grant scopes Alexandria to only the tools you allow: ungranted definitions are filtered out of the tool list, so they never enter the context window. A grant of 3 typical tools costs roughly 536 tokens, an 85% reduction.

Does deferred tool loading fix this?

Partially, in some clients. Claude Code defers MCP tool schemas behind a tool-search step by default, and VS Code has experimental grouping — but you still pay tokens per search and reload, and Cursor, Windsurf and Gemini CLI load definitions upfront. Reducing the exposed tool set cuts the cost in every client.
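Reducing the exposed tool set, as the answers above describe, amounts to filtering the tools/list result before it reaches the model. The following is a hypothetical sketch of that idea, not PolicyLayer's implementation; `scope_tools`, `is_call_allowed` and the grant set are illustrative names.

```python
def scope_tools(tools: list, granted: set) -> list:
    """Keep only the tool definitions whose names appear in the grant."""
    return [tool for tool in tools if tool["name"] in granted]


def is_call_allowed(tool_name: str, granted: set) -> bool:
    """Reject calls to tools outside the grant, even if a client requests them."""
    return tool_name in granted


# Abbreviated tools/list entries (descriptions and schemas omitted for brevity).
all_tools = [
    {"name": "alexandria_ask"},
    {"name": "alexandria_debate"},
    {"name": "alexandria_health"},
]
grant = {"alexandria_ask", "alexandria_health"}

visible = scope_tools(all_tools, grant)
print([tool["name"] for tool in visible])            # ['alexandria_ask', 'alexandria_health']
print(is_call_allowed("alexandria_debate", grant))   # False
```

Because the filtered definitions are never sent to the model, the saving applies regardless of whether the client defers schema loading.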

How these numbers were measured.

01
Serialisation

Each tool is serialised as a tools/list entry — name, description, input schema — from the schemas in the PolicyLayer scan database. Clients differ slightly in framing, so treat counts as close estimates.

02
Tokeniser

tiktoken o200k_base (GPT-4o/o-series). Anthropic's current tokeniser isn't published, so Claude's exact counts will differ; for English text and JSON schemas the totals are close enough to treat these as estimates.

03
Deferred loading

Some clients now defer schema loading (Claude Code's tool search; VS Code experimental grouping). You still pay per search and reload — and Cursor, Windsurf and Gemini CLI load everything upfront.

Computed 07-06-2026 from the PolicyLayer scan database over all 20 catalogued Alexandria tools. Counts refresh with every site build.

Expose only the tools you use — the rest never enter your context.

A PolicyLayer grant scopes Alexandria to the tools you actually allow. Ungranted definitions never load, and every call that does run is checked against policy first.

Free to start. No card required.

4,600+ MCP servers and 31,000+ tools scanned and risk-classified.
