What Is a Context Window?


A context window is the maximum number of tokens an LLM can process in a single interaction, encompassing the system prompt, conversation history, retrieved documents, and the generated output.

WHY IT MATTERS

The context window is the LLM's working memory: everything the model considers must fit within it. Modern windows range from around 8K tokens to 200K+ (e.g., Claude), with some models such as Gemini supporting over 1M tokens.
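A minimal sketch of budgeting against a window. The 4-characters-per-token heuristic is a rough English-text approximation (real systems should use the model's own tokenizer), and the function names and 8K default are illustrative, not any provider's API:

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_window(system: str, history: list[str], output_budget: int,
                window: int = 8192) -> bool:
    """True if system prompt + history + reserved output space fit the window."""
    used = estimate_tokens(system) + sum(estimate_tokens(m) for m in history)
    return used + output_budget <= window

print(fits_window("You are a helpful assistant.", ["Hello!"], output_budget=512))
```

Note that the output budget is reserved up front: the window bounds input *and* generated tokens together, so a prompt that "fits" with no room left to generate is still unusable.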

For agents, context window management is critical. A long-running agent accumulates history that gradually fills the window. When it overflows, the oldest context is dropped — potentially losing critical instructions.
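One common mitigation is a sliding window over the conversation: drop the oldest turns first, but pin the system prompt so its instructions are never among the content that gets truncated. A sketch, using the same rough token heuristic as above (names are illustrative):

```python
def trim_history(system: str, history: list[str], window: int,
                 tokens=lambda t: max(1, len(t) // 4)) -> list[str]:
    """Keep the newest messages that fit after reserving room for the
    system prompt; the oldest messages overflow and are dropped."""
    budget = window - tokens(system)
    kept: list[str] = []
    for msg in reversed(history):       # walk newest -> oldest
        cost = tokens(msg)
        if cost > budget:
            break                       # everything older is dropped
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))         # restore chronological order

history = ["turn one " * 50, "turn two " * 50, "turn three"]
print(trim_history("system", history, window=120))
```

Pinning the system prompt is what distinguishes this from naive truncation, which is exactly the failure mode where safety constraints fall out of the window.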

Larger windows help but don't fully solve the problem. Research shows LLMs struggle with information in the middle of long contexts ('lost in the middle'), so the placement of critical instructions matters.

FREQUENTLY ASKED QUESTIONS

What happens when the context window fills up?
Older content is truncated or summarized. This can cause the agent to lose earlier instructions or safety constraints — a real risk for long-running financial agents.
Does a larger context window mean better performance?
Not necessarily. Models can struggle to attend to all information equally. The 'lost in the middle' effect means information placement matters as much as window size.
How does context window affect costs?
LLM pricing is per-token. Larger contexts mean more input tokens per call, directly increasing costs. Efficient context management reduces expenses.
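A back-of-envelope cost model makes the FAQ's point concrete: because pricing is per-token, context size drives cost linearly. The rates below are illustrative placeholders, not any provider's actual prices:

```python
def call_cost(input_tokens: int, output_tokens: int,
              in_price_per_m: float = 3.0,
              out_price_per_m: float = 15.0) -> float:
    """Dollar cost of one call at hypothetical $/1M-token rates."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000

# At the same rates, a 100K-token context costs 25x more in input
# tokens per call than a trimmed 4K-token context.
print(call_cost(100_000, 500))   # large untrimmed context
print(call_cost(4_000, 500))     # trimmed context
```

For a long-running agent making thousands of calls, trimming context is often the single largest cost lever.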

FURTHER READING

Enforce policies on every tool call

Intercept is the open-source MCP proxy that enforces YAML policies on AI agent tool calls. No code changes needed.

npx -y @policylayer/intercept
github.com/policylayer/intercept →