Security Reference

OWASP LLM Top 10, mapped to MCP tools

Updated 23 June 2026 · By PolicyLayer Research

The OWASP Top 10 for LLM Applications is the reference taxonomy for how AI applications fail. Most of it is written for the model and the prompt. This page maps it onto the part operators actually wire up: the MCP tools an agent can call.

Three of the ten classes are detectable directly from a tool's surface, so we measure them across the live catalogue: 254,056 tools across 12,020 servers. A fourth, Supply Chain, is assessed at the server level rather than per tool. Each of the four terminates in a deterministic policy you can enforce at the gateway. The remaining six are bounded at call time rather than detected from a static attribute.

Detected on the tool surface

These classes are visible in a tool's name, description, and argument schema. The classifier flags them on every catalogued tool, and each maps to a suggested rule — scoring is the easy half; the policy is what bounds the risk.

LLM02: Sensitive Information Disclosure

1,725 tools · 272 servers

A tool accepts a credential, token, or connection string as an argument, so secrets pass through the agent context and into a third-party server.

In the catalogue

Tool · Server · Suggested rule
email_delete · UnClick · Require approval
abn_lookup · UnClick · Restrict
abn_search · UnClick · Restrict
abuseipdb_blacklist · UnClick · Restrict
claim_display · agentView · Restrict
clear_display · agentView · Require approval

Deterministic guardrail

Redact credential parameters before they are forwarded; never let a tool receive an unscoped secret.
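
A minimal sketch of the redaction step, assuming the gateway sees each call's arguments as a dictionary before forwarding. The name pattern and the `redact_credentials` helper are illustrative assumptions, not the catalogue's classifier or any gateway's actual API.

```python
import re

# Hypothetical pattern for parameter names that usually carry secrets.
CREDENTIAL_NAME = re.compile(
    r"(api[_-]?key|token|secret|password|passwd|connection[_-]?string|credential)",
    re.IGNORECASE,
)
REDACTED = "[REDACTED]"


def redact_credentials(arguments):
    """Return a copy of the arguments with credential-like values replaced."""
    cleaned = {}
    for name, value in arguments.items():
        if CREDENTIAL_NAME.search(name):
            # The parameter name looks like a secret: never forward its value.
            cleaned[name] = REDACTED
        elif isinstance(value, dict):
            # Recurse so nested objects are covered as well.
            cleaned[name] = redact_credentials(value)
        else:
            cleaned[name] = value
    return cleaned
```

Matching on parameter names is deliberately blunt: it will also catch harmless names such as `max_tokens`, so a real rule set would likely list parameters per tool rather than rely on one pattern.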

LLM05: Improper Output Handling

5,754 tools · 1,956 servers

A tool takes freeform code, SQL, or raw markup as an argument, so the agent can hand arbitrary payloads straight to an interpreter or renderer.

In the catalogue

Tool · Server · Suggested rule
agent-account-intel-pack · Apiosk · Require approval
agent-competitor-grid · Apiosk · Require approval
agent-email-verifier · Apiosk · Require approval
agent-file-converter · Apiosk · Restrict
ops · Yaver · Require approval
short_delete · Yaver · Require approval

Deterministic guardrail

Reject freeform input on these parameters; allowlist the permitted commands or templates rather than accepting arbitrary strings.
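
One way to read this guardrail is that the agent picks a named template and supplies parameters, and never supplies the statement itself. The sketch below assumes that shape; the template names and the `validate_template_call` helper are hypothetical.

```python
# Hypothetical allowlist: template name -> the parameter names it accepts.
ALLOWED_TEMPLATES = {
    "orders_by_customer": {"customer_id"},
    "order_count_since": {"since_date"},
}


def validate_template_call(template, params):
    """Accept only a known template with exactly its declared parameters."""
    if template not in ALLOWED_TEMPLATES:
        raise ValueError(f"template not allowlisted: {template!r}")
    expected = ALLOWED_TEMPLATES[template]
    if set(params) != expected:
        raise ValueError(f"unexpected parameters for {template!r}: {sorted(params)}")
    return template, dict(params)
```

A raw SQL string passed as the template name fails the first check, so arbitrary payloads never reach the interpreter.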

LLM06: Excessive Agency

44,410 tools · 7,167 servers

Tools that execute code, delete data, move money, or act in bulk are exposed to the agent with no per-call bound — the single largest class on the MCP surface.

In the catalogue

Tool · Server · Suggested rule
dnscat_client · Pentester-MCP · Require approval
execute_weevely_module · Pentester-MCP · Require approval
identify_hash · Pentester-MCP · Require approval
john_prepare · Pentester-MCP · Require approval
delete_appointment · GoHighLevel MCP Server · Require approval
delete_contact · GoHighLevel MCP Server · Require approval

Deterministic guardrail

Require approval on high-consequence tools, cap the value of the call (amount, scope, target), and restrict bulk operations to single targets.
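
The three bounds above can be sketched as a single per-call check. This is an illustration under assumptions: the `ToolPolicy` fields and the `amount` and `targets` argument names are hypothetical, and a real tool would need its own mapping from arguments to limits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolPolicy:
    require_approval: bool = False
    max_amount: Optional[float] = None  # cap on a hypothetical "amount" argument
    max_targets: Optional[int] = None   # cap on a hypothetical "targets" list


def evaluate(policy, arguments, approved=False):
    """Return "allow", "deny", or "needs_approval" for one call."""
    amount = arguments.get("amount")
    if policy.max_amount is not None and amount is not None and amount > policy.max_amount:
        return "deny"
    targets = arguments.get("targets", [])
    if policy.max_targets is not None and len(targets) > policy.max_targets:
        return "deny"
    if policy.require_approval and not approved:
        return "needs_approval"
    return "allow"
```

The hard caps are checked before approval, so a human approver cannot wave through a call that exceeds them.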

LLM03: Supply Chain

Server level · identity & drift

A server is typosquatted, rug-pulled, or quietly changes its tool definitions after earning approval — so the tools an agent calls are no longer the tools that were vetted.

Deterministic guardrail

Pin each server’s identity to its distribution artifact, not its name, and track tool definitions for drift so a rug-pull can’t change what a tool does after approval.
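
The drift half of this guardrail can be sketched as a fingerprint over the tool definitions recorded at approval time. The sketch assumes each definition is a JSON-serialisable dictionary with a `name` key; the function names are hypothetical. Pinning identity would apply the same comparison to a digest of the distribution artifact.

```python
import hashlib
import json


def fingerprint(tool_definitions):
    """Hash tool definitions in a canonical form so key order does not matter."""
    canonical = json.dumps(
        sorted(tool_definitions, key=lambda tool: tool["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_drifted(pinned_fingerprint, current_definitions):
    """True when the server's current tools no longer match what was approved."""
    return fingerprint(current_definitions) != pinned_fingerprint
```

Any change to a name, description, or argument schema changes the hash, so a quietly edited description is caught even when the tool name stays the same.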

Bounded at runtime

The remaining classes are not a static property of any one tool — they emerge at call time, from injected input, composition, or volume. There is nothing to catalogue; there is something to enforce. The gateway is where each is contained.

LLM01: Prompt Injection

Injected instructions arrive through tool results — an issue body, a web page, a database row — and the agent acts on them as if they were a user request.

Prompt injection cannot be prevented in the prompt; it is contained at the gateway. Default-deny high-consequence tools, validate arguments before forwarding, and pin tool schemas so an injected call still hits a deterministic rule.
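
A minimal sketch of a default-deny decision with argument validation, assuming a rule table keyed by tool name. The `RULES` entries, the argument names, and the `decide` helper are hypothetical; `delete_contact` is borrowed from the catalogue only as a familiar name.

```python
# Hypothetical rule table; any tool not listed here is denied.
RULES = {
    "search_issues": {"action": "allow", "arguments": {"query": str}},
    "delete_contact": {"action": "require_approval", "arguments": {"contact_id": str}},
}


def decide(tool_name, arguments):
    """Return the rule's action, or "deny" for anything unexpected."""
    rule = RULES.get(tool_name)
    if rule is None:
        return "deny"  # default-deny: unknown tools never reach the server
    expected = rule["arguments"]
    if set(arguments) != set(expected):
        return "deny"  # missing or extra arguments
    for name, expected_type in expected.items():
        if not isinstance(arguments[name], expected_type):
            return "deny"  # wrong argument type
    return rule["action"]
```

The check does not try to tell whether a call was injected. It only ensures that whatever the agent was persuaded to send still meets the same deterministic rule.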

LLM04: Data and Model Poisoning

Poisoned data re-enters the agent through a tool result and shapes later tool calls.

Sanitise tool output before it re-enters agent context, and bound what the agent can do with it through default-deny policy.

LLM07: System Prompt Leakage

Rules placed in the prompt are recoverable by a capable agent; they are not an enforcement boundary.

Keep authorisation out of the prompt entirely. Enforce at the transport layer where the agent cannot read or reason around the rules.

LLM08: Vector and Embedding Weaknesses

Relevant where an MCP server fronts a vector store; the exposure is the set of read/write tools it offers over that store.

Classify and bound the store’s read/write tools like any other; scope retrieval to the calling principal.
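
Scoping retrieval can be sketched as the gateway overwriting the owner filter on every query, whatever the agent supplied. This assumes a store whose search tool accepts a `filter` object with an `owner` field; both names and the helper are hypothetical.

```python
def scope_to_principal(arguments, principal):
    """Force the owner filter to the caller, ignoring any agent-supplied value."""
    scoped = dict(arguments)
    filters = dict(scoped.get("filter", {}))
    filters["owner"] = principal
    scoped["filter"] = filters
    return scoped
```

Because the filter is set after the agent has produced its arguments, an agent that asks for another principal's records still only queries its caller's.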

LLM09: Misinformation

An agent acts on a fabricated tool argument or result without a deterministic check.

Make consequential calls verifiable: deterministic argument constraints and human approval, not probabilistic confidence.

LLM10: Unbounded Consumption

A loop calls a tool thousands of times, or a single call fans out across every target.

Rate-limit per tool and per session, and block unbounded bulk operations at the gateway.
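
A minimal sketch of a per-tool, per-session limit using a sliding window. The `CallLimiter` class is hypothetical and keeps its state in memory; a shared gateway would need a shared store.

```python
import time
from collections import defaultdict, deque


class CallLimiter:
    """Sliding-window limit on calls per (session, tool) pair."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.calls = defaultdict(deque)

    def allow(self, session_id, tool_name):
        now = self.clock()
        recent = self.calls[(session_id, tool_name)]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True
```

A call that returns `False` is blocked before it reaches the server, so a runaway loop stops at the configured count instead of running thousands of times.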

Scoring is the easy half

Rating a tool against a framework is table stakes. What changes the outcome is the next step: a deterministic rule that bounds the call — cap the amount, allowlist the path, require approval, block the bulk operation. A classifier can tell you a tool moves money; only the gateway can guarantee it never moves more than you allowed, on every call, with an audit trail an assessor can sign off. That is the difference between a score and enforcement.

Score the tool. Then bound it.