What is Context Engineering?

2 min read Updated

Context engineering is the discipline of deciding what enters an AI agent's context window at each step — tool definitions, retrieved documents, memory, message history, and system instructions — and how that content is selected, ordered, and compacted. It is the successor framing to prompt engineering, treating context as a finite budget to be managed rather than a single prompt to be written.

WHY IT MATTERS

Prompt engineering optimised one string. Context engineering manages an entire pipeline: which tool definitions are loaded, what gets retrieved, what an agent remembers between turns, and when older history is summarised or dropped. As agents run longer loops with more tools attached, this curation determines both quality and cost.

The context window is a finite resource, and everything competes for it:

  • Tool schemas — every connected MCP server injects its tool definitions before the conversation starts, which is where tool sprawl bites.
  • Retrieval — documents and search results pulled in just-in-time rather than pre-loaded.
  • Memory and compaction — persisting facts outside the window and summarising stale history to reclaim space.
  • Delegation — handing self-contained work to a subagent so its intermediate output never pollutes the parent's context.

Poor context engineering shows up as degraded instruction-following, ignored tools, and inflated per-request token bills. It also has a security dimension: everything that enters the window can influence behaviour, so curating context overlaps with defending against context poisoning.

Running agents against MCP servers? Route them through PolicyLayer and every tool call is checked against policy first.

PUT POLICY ON YOUR TOOL CALLS →

Enforced before the call runs. Nothing to install.

HOW POLICYLAYER USES THIS

Tool definitions are one of the largest fixed costs in an agent's context, and they arrive from MCP servers the moment a client connects. PolicyLayer's token-cost catalogue measures what each MCP server's tool schemas consume, and the gateway lets teams expose only the servers and tools a given person actually needs — trimming the context every session pays for before work begins.

IN THE CATALOGUE

Measured across 3,105 MCP servers (56,764 tools): connecting a server loads its full tool definitions into the context window on every request.

1,860 tokens — median server
7,924 tokens — 90th percentile
183,337 tokens — largest measured (Fusionauth)
ServerTool definitionsTokens per request
GitHub8614,406
Linear667,149
Supabase292,561
Filesystem141,642

FREQUENTLY ASKED QUESTIONS

How is context engineering different from prompt engineering?
Prompt engineering crafts a single instruction string. Context engineering manages everything the model sees across a long-running agent loop — tools, retrieval, memory, history, and compaction — as a budget to be allocated.
Why do MCP servers matter for context engineering?
Each connected MCP server injects its tool definitions into the context window before any work happens. Connecting many servers can consume tens of thousands of tokens per session on schemas alone.
What is compaction?
Compaction summarises or prunes older conversation history so an agent can continue a long task without exceeding the context window, keeping the salient facts and discarding the rest.

FURTHER READING

Let agents act without letting them run wild.

Route your MCP servers through PolicyLayer and every tool call is checked against your policy before it runs — allow, deny, or require approval. Per-identity grants. Full audit log. Live in minutes.

Free to start. No card required.

43,000+ MCP servers and 220,000+ tools scanned and risk-classified.

// GET IN TOUCH

Have a question or want to learn more? Send us a message.

Message sent.

We'll get back to you soon.