What is Context Engineering?

2 min read Updated Jun 11, 2026

Context engineering is the discipline of deciding what enters an AI agent's context window at each step — tool definitions, retrieved documents, memory, message history, and system instructions — and how that content is selected, ordered, and compacted. It is the successor framing to prompt engineering, treating context as a finite budget to be managed rather than a single prompt to be written.

WHY IT MATTERS

Prompt engineering optimised one string. Context engineering manages an entire pipeline: which tool definitions are loaded, what gets retrieved, what an agent remembers between turns, and when older history is summarised or dropped. As agents run longer loops with more tools attached, this curation determines both quality and cost.

The context window is a finite resource, and everything competes for it:

Tool schemas — every connected MCP server injects its tool definitions before the conversation starts, which is where tool sprawl bites.
Retrieval — documents and search results pulled in just-in-time rather than pre-loaded.
Memory and compaction — persisting facts outside the window and summarising stale history to reclaim space.
Delegation — handing self-contained work to a subagent so its intermediate output never pollutes the parent's context.
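
Treating the window as a budget can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the `ContextItem` type, the priority scheme, and all token figures other than the GitHub schema cost are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    kind: str      # e.g. "system", "tools", "retrieval", "memory", "history"
    tokens: int    # estimated token cost of including this item
    priority: int  # lower number = more important to keep


def fit_to_budget(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Greedily keep the highest-priority items that still fit in the budget."""
    kept: list[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        if used + item.tokens <= budget:
            kept.append(item)
            used += item.tokens
    return kept


# Hypothetical candidates competing for a 24,000-token budget.
candidates = [
    ContextItem("system", 500, 0),
    ContextItem("tools", 14_406, 1),
    ContextItem("retrieval", 6_000, 2),
    ContextItem("history", 12_000, 3),
    ContextItem("memory", 800, 2),
]
selected = fit_to_budget(candidates, budget=24_000)
print([item.kind for item in selected])
```

In this run the uncompacted history is the item that fails to fit, which is the situation compaction and delegation are meant to address.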

Poor context engineering shows up as degraded instruction-following, ignored tools, and inflated per-request token bills. It also has a security dimension: everything that enters the window can influence behaviour, so curating context overlaps with defending against context poisoning.

HOW POLICYLAYER USES THIS

Tool definitions are one of the largest fixed costs in an agent's context, and they arrive from MCP servers the moment a client connects. PolicyLayer's token-cost catalogue measures what each MCP server's tool schemas consume, and the gateway lets teams expose only the servers and tools a given person actually needs — trimming the context every session pays for before work begins.
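
The idea of exposing only the tools a person needs can be illustrated with a simple allowlist filter. This is a hypothetical sketch, not PolicyLayer's gateway API: `filter_tools`, the allowlist shape, and the tool names are all assumptions, and each tool is represented only by a `name` field.

```python
def filter_tools(tools_by_server: dict, allowlist: dict) -> list:
    """Keep only tools whose server and name both appear in the allowlist."""
    exposed = []
    for server, tools in tools_by_server.items():
        allowed = allowlist.get(server, set())  # unknown servers expose nothing
        exposed.extend(tool for tool in tools if tool["name"] in allowed)
    return exposed


# Hypothetical tool definitions, reduced to their names for brevity.
tools_by_server = {
    "github": [{"name": "create_issue"}, {"name": "delete_repo"}],
    "filesystem": [{"name": "read_file"}, {"name": "write_file"}],
}
allowlist = {"github": {"create_issue"}, "filesystem": {"read_file"}}

exposed = filter_tools(tools_by_server, allowlist)
print([tool["name"] for tool in exposed])
```

Only the definitions that survive the filter would be sent to the model, so the schema cost scales with what is allowed rather than with everything that is connected.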

IN THE CATALOGUE

Measured across 5,494 MCP servers (117,877 tools): connecting a server loads its full tool definitions into the context window on every request.

2,149 tokens — median server

11,752 tokens — 90th percentile

183,480 tokens — largest measured (Ainumbers Mcp Apps)

Server	Tool definitions	Tokens per request
GitHub	86	14,406
Linear	66	7,149
Supabase	29	2,561
Filesystem	14	1,666
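
The per-server figures add up when several servers are connected at once. The sketch below sums the four rows of the table and expresses the total as a share of the context window; the 200,000-token window size is an assumption chosen for illustration, not a figure from the catalogue.

```python
# Token counts copied from the table above.
TOOL_TOKENS = {
    "GitHub": 14_406,
    "Linear": 7_149,
    "Supabase": 2_561,
    "Filesystem": 1_666,
}


def schema_overhead(servers, window_tokens: int) -> tuple[int, float]:
    """Return total schema tokens and their share of the context window."""
    total = sum(TOOL_TOKENS[name] for name in servers)
    return total, total / window_tokens


# Assumed window size of 200,000 tokens.
total, share = schema_overhead(TOOL_TOKENS, window_tokens=200_000)
print(f"{total:,} tokens ({share:.1%} of the window)")

# Connecting only the two smallest servers instead.
trimmed, _ = schema_overhead(["Supabase", "Filesystem"], window_tokens=200_000)
print(f"{trimmed:,} tokens")
```

With all four servers attached, tool definitions alone come to 25,782 tokens per request, against 4,227 when only Supabase and Filesystem are connected.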

FREQUENTLY ASKED QUESTIONS

How is context engineering different from prompt engineering?

Prompt engineering crafts a single instruction string. Context engineering manages everything the model sees across a long-running agent loop — tools, retrieval, memory, history, and compaction — as a budget to be allocated.

Why do MCP servers matter for context engineering?

Each connected MCP server injects its tool definitions into the context window before any work happens. Connecting many servers can consume tens of thousands of tokens per session on schemas alone.

What is compaction?

Compaction summarises or prunes older conversation history so an agent can continue a long task without exceeding the context window, keeping the salient facts and discarding the rest.
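
A minimal sketch of compaction, under stated assumptions: token counts are estimated at roughly four characters per token, and `summarise` is a caller-supplied function (in practice usually a model call) that is stubbed out here. Neither detail comes from a specific library.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: assume about 4 characters per token."""
    return max(1, len(text) // 4)


def compact(history: list[str], budget: int, summarise) -> list[str]:
    """Fold the oldest messages into one summary until the rest fit the budget."""
    recent = list(history)
    folded: list[str] = []
    # Always keep at least the most recent message verbatim.
    while len(recent) > 1 and sum(map(estimate_tokens, recent)) > budget:
        folded.append(recent.pop(0))
    if not folded:
        return recent
    return [summarise(folded)] + recent


def naive_summary(messages: list[str]) -> str:
    # Placeholder: a real summariser would preserve the salient facts.
    return f"[summary of {len(messages)} earlier messages]"


history = ["a" * 400, "b" * 400, "c" * 400, "d" * 40]
compacted = compact(history, budget=120, summarise=naive_summary)
print(compacted[0])
```

The summary itself also occupies tokens, so a fuller version would count it against the budget and might re-summarise when the summary grows too large.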
