What is an LLM Router?
An LLM router is a system that intelligently directs AI requests to different models based on task complexity, cost, latency requirements, or domain — optimizing the quality-cost tradeoff across a portfolio of models.
WHY IT MATTERS
Not every request needs GPT-4. An LLM router classifies incoming requests and routes simple tasks to cheaper, faster models (Haiku, GPT-3.5) while sending complex tasks to frontier models (Opus, GPT-4). This can reduce costs by 50-80% with minimal quality impact.
Routing strategies include: classifier-based (a small model predicts difficulty), cascade (try cheap model first, escalate if quality is low), and rule-based (route by task type or keyword).
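The cascade strategy can be sketched in a few lines. This is a minimal illustration, not a real API: `call_cheap_model` and `call_frontier_model` are hypothetical stand-ins for actual model calls, and the confidence score stands in for whatever quality signal (self-reported confidence, a verifier model, log-probabilities) a production system would use.

```python
def call_cheap_model(prompt: str) -> dict:
    # Hypothetical stand-in for a small, fast model. Here we fake a
    # quality signal: short prompts get a high confidence score.
    confidence = 0.9 if len(prompt) < 50 else 0.4
    return {"text": f"cheap-model answer to: {prompt}", "confidence": confidence}

def call_frontier_model(prompt: str) -> dict:
    # Hypothetical stand-in for an expensive frontier model.
    return {"text": f"frontier-model answer to: {prompt}", "confidence": 0.95}

def cascade_route(prompt: str, threshold: float = 0.7) -> str:
    """Try the cheap model first; escalate only when quality looks low."""
    result = call_cheap_model(prompt)
    if result["confidence"] >= threshold:
        return result["text"]
    # Escalate: pay for the frontier model only on hard requests.
    return call_frontier_model(prompt)["text"]
```

The design tradeoff is that escalated requests pay for two model calls, so the cascade only saves money when most traffic is handled by the cheap tier.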
For agent systems, routing is particularly valuable. Tool calls, simple classifications, and formatting tasks don't need expensive models. Planning, complex reasoning, and financial decisions do.