// PolicyLayer Research · July 2026 edition

The State of MCP Security

Name: PolicyLayer MCP Server Catalogue
Creator: PolicyLayer
Published: 2026-07-01

What 32,820 MCP servers can actually do to your systems.

We classified every tool on every Model Context Protocol server we could enumerate from the public registries — 517,973 tools across 32,820 working servers. The data shows an ecosystem that hands AI agents wide, dangerous, and almost entirely unannounced control over the systems they touch.

43%

of MCP servers expose a tool that destroys data or executes commands

94%

probability that a five-server stack exposes such a tool. 99.7% at ten.

96.4%

of MCP tools don't warn the agent about destructive behaviour

59%

of MCP servers that touch money also expose tools that destroy data

Dataset snapshot 1 July 2026 · updated monthly. Methodology in the appendix.

// WHAT CHANGED THIS MONTH

Ten things moved meaningfully since last month.

Dataset size

+1516%

Servers with a parseable tool list grew from 2,031 to 32,820.

New entrants to the ten riskiest servers

Server	Risk score
io.github.abl030/pfsense-mcp	227.2
Yaver	218.19
UnClick	186.38
Pentester-MCP	164.8
Binance MCP Server	161.56
io.github.HomenShum/nodebench	152.55
GoHighLevel MCP Server	146.44
CloudStack MCP Server	142.09
io.github.dearlordylord/huly-mcp	126.41

Risk score is a server's tool count multiplied by the average risk weight of its tools — it climbs with both breadth and danger, so a server ranks highly only when it exposes many tools and those tools skew destructive. A change is flagged here when destructive share shifts ±3 points, the dataset size shifts ±10%, any named server's risk score shifts ±10%, or a server newly enters the ten riskiest.
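The scoring and flagging rules above can be sketched in a few lines of Python. This is a minimal illustration, not the report's pipeline: the function names, the example weights, and the handling of missing history are assumptions, and only the score formula and the ±10% and new-entrant rules come from the text.

```python
from statistics import mean

def risk_score(weights):
    """Tool count multiplied by the average per-tool risk weight."""
    if not weights:
        return 0.0
    return len(weights) * mean(weights)

def is_flagged(prev_score, curr_score, was_top10, is_top10, threshold=0.10):
    """Flag a server that newly enters the top ten or whose score moved by 10% or more."""
    if is_top10 and not was_top10:
        return True
    if prev_score is None or prev_score == 0:
        # Assumption: without a usable previous score there is no relative change to report.
        return False
    return abs(curr_score - prev_score) / prev_score >= threshold

# Hypothetical weights: 0.0 for a read-only tool, up to 1.0 for the most dangerous.
example_weights = [0.0, 0.0, 0.5, 1.0]
example_score = risk_score(example_weights)  # 4 tools * 0.375 average = 1.5
```

Because tool count times the mean weight equals the sum of the weights, every additional risky tool raises the score, which is why very large servers dominate the ranking.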

1. What MCP servers actually do

Every tool we found was classified into one of six risk categories: read-only, write, execute, destructive, financial, or other. The chart below shows how many of the 32,820 servers in our dataset expose at least one tool in each category.

Category	Servers (share of 32,820)
Read	29,986 (91.4%)
Write	18,997 (57.9%)
Destructive	9,212 (28.1%)
Execute	8,588 (26.2%)
Other	1,576 (4.8%)
Financial	783 (2.4%)

Servers usually expose tools in multiple categories — an integration that lists, creates, and deletes records lands in Read, Write and Destructive simultaneously. Percentages are of the 32,820 servers in the dataset.
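Counting a server once per category, however many tools it has in that category, is what lets the shares add up to more than 100%. A small sketch of that tally follows; the two servers and their tools are invented for illustration, and the lowercase category labels are an assumed encoding.

```python
from collections import Counter

def servers_per_category(servers):
    """Count servers exposing at least one tool in each category."""
    counts = Counter()
    for tools in servers.values():
        # A set makes each category count once per server.
        for category in {tool["category"] for tool in tools}:
            counts[category] += 1
    return counts

# Hypothetical miniature dataset.
servers = {
    "crm-server": [
        {"name": "list_contacts", "category": "read"},
        {"name": "create_contact", "category": "write"},
        {"name": "delete_contact", "category": "destructive"},
    ],
    "docs-server": [
        {"name": "get_page", "category": "read"},
        {"name": "search_pages", "category": "read"},
    ],
}

counts = servers_per_category(servers)
shares = {category: 100 * n / len(servers) for category, n in counts.items()}
```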

2. One in four MCP servers can permanently destroy data

9,212 servers (28.1%) expose at least one destructive tool — deleting records, dropping tables, wiping indexes, force-pushing branches, removing cloud resources. These are operations that a human operator would normally guard with a confirmation dialog or a four-eyes review. When invoked through MCP, they fire on the model's first decision.

Another 8,588 servers (26.2%) can execute arbitrary commands — shell, scripts, container exec, SQL with no read-only enforcement. Combine the two: roughly four in ten MCP servers give an agent a way to do something it cannot easily undo.

9,212

MCP servers ship a delete-first tool. Most don't ask twice.

Exposure compounds with every server you add

43.28% of MCP servers expose a destructive or execute tool on their own. Stacking servers is the common case: an agent rarely connects to just one. If the per-server rate holds independently, the probability that a stack of N servers exposes at least one such tool is 1 − (1 − 0.4328)^N. It reaches 94.1% at five servers and 99.7% at ten.

Servers in stack (N)	Chance of at least one destructive or execute tool
1	43.3%
2	67.8%
3	81.8%
4	89.6%
5	94.1%
6	96.7%
7	98.1%
8	98.9%
9	99.4%
10	99.7%
11	99.8%
12	99.9%
13	99.9%
14	100%

Independence is an approximation — tool overlap between servers makes the true figure slightly lower — but the direction holds: multi-server exposure is the default, not the tail.
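The compounding figures can be reproduced directly from the per-server rate. The sketch below assumes the same independence simplification the report states; the function name is illustrative.

```python
def stack_exposure(per_server_rate, n):
    """Probability that at least one of n independent servers exposes such a tool."""
    return 1 - (1 - per_server_rate) ** n

RATE = 0.4328  # share of servers with a destructive or execute tool

for n in (1, 2, 5, 10):
    print(f"N = {n}: {stack_exposure(RATE, n):.1%}")
# N = 1: 43.3%
# N = 2: 67.8%
# N = 5: 94.1%
# N = 10: 99.7%
```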

The single most common destructive verb across the dataset is delete: it appears as the first token of 9,542 tool names. Together, create, update, and delete form the write side of CRUD that virtually every "integration" MCP server ships. The protocol provides no separation between them.

3. The average MCP install gives an AI agent 15.8 tools

The median MCP server exposes 7 tools. The mean is 15.8. A server at the 99th percentile exposes 138. The fattest single server is io.github.HomenShum/nodebench, which exposes 824 tools to any agent that connects. The full classifier output for every server we scanned is in our public MCP tool catalogue.

Server	Tools
io.github.HomenShum/nodebench	824
UnClick	813
Yaver	775
Binance MCP Server	734
io.github.abl030/pfsense-mcp	680
AdButler	622
fortimanager-mcp	584
Crow	573
Mcp	571
GoHighLevel MCP Server	566

A server exposing 200+ tools is unauditable in practice. No human reads 200 tool descriptions before installing. The model sees them all by default, and a context-window's worth of tool schemas competes with whatever task the user actually asked for.

52×

The fattest MCP server crowds the agent's context with more than 52× as many tool schemas as the average install — before the user has typed a single character of their actual task.

4. Destructive MCP surface is concentrated in a few servers

Most MCP servers are not dangerous. The 28.1% headline number obscures a much sharper truth: destructive surface is heavily concentrated in a small minority of servers, while the long tail of CRUD-shaped integrations sits in the middle. Roughly seven in ten MCP servers (71.9%) expose no destructive tools at all.

32,820

MCP servers

Destructive tools per server

0 destructive 23,608 (71.9%)

1–2 destructive 5,684 (17.3%)

3–5 destructive 2,429 (7.4%)

6–10 destructive 671 (2%)

11+ destructive 428 (1.3%)

11.3%

of all destructive surface in the MCP ecosystem is concentrated in just 92 servers, about 0.3% of the 32,820-server dataset. The top 10% of destructive-bearing servers (921 servers, 2.8% of the dataset) hold 40.3%.
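A concentration figure of this kind is a top-N share: sort servers by destructive tool count and divide the top slice by the total. The sketch below shows the calculation on invented counts; it assumes "destructive surface" means the number of destructive tools, which the report does not spell out.

```python
def top_share(destructive_counts, top_n):
    """Share of all destructive tools held by the top_n servers."""
    ranked = sorted(destructive_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:top_n]) / total

# Hypothetical per-server destructive tool counts.
example_counts = [10, 5, 3, 2, 0, 0, 0, 0]
example_share = top_share(example_counts, 1)  # 10 of 20 destructive tools = 0.5
```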

A handful of servers carry the lion's share of the protocol's risk. They tend to be the same shape: large integrations — identity providers, project-management platforms, all-in-one cloud SDKs — that ship hundreds of CRUD endpoints with equal policy weight, and one or two of those endpoints turn out to be the kill switch.

The classifier's top 5% on its own accounts for an outsized share of the destructive calls an MCP-connected agent could make. The protocol does not surface this asymmetry to the model. The model sees a flat list of tools, with no hint that some are load-bearing for the host system and some are not.

The named riskiest ten

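Ranked by risk score: tool count weighted by average per-tool risk. These are the servers carrying the largest risk-weighted tool surface in the dataset. The score weights every tool category, not only destructive ones, which is why Pentester-MCP ranks fifth with no destructive tools.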

#	Server	Tools	Destructive	Risk score
1	io.github.abl030/pfsense-mcp	680	177	227.2
2	Yaver	775	37	218.19
3	UnClick	813	10	186.38
4	AdButler	622	105	172.54
5	Pentester-MCP	337	0	164.8
6	Binance MCP Server	734	48	161.56
7	io.github.HomenShum/nodebench	824	12	152.55
8	GoHighLevel MCP Server	566	82	146.44
9	CloudStack MCP Server	442	82	142.09
10	io.github.dearlordylord/huly-mcp	470	73	126.41

5. When MCP servers touch money, most can also destroy data

Only 783 MCP servers in our dataset expose financial tools — payments, transfers, wallet operations. They are rare. They are also the cohort with the highest combined risk in the entire ecosystem.

58.5%

of MCP servers that touch money also expose destructive tools.
458 of 783 servers

75.1%

also expose destructive or arbitrary-execute tools.
588 of 783 servers

Across the 458 servers that combine financial and destructive surface, an agent connecting to a single one of them gets, on average, 6.8 ways to destroy data and 2.4 ways to move money. Among the servers named below, the highest individual counts are 82 destructive tools (GoHighLevel MCP Server and CloudStack MCP Server) and 47 financial tools (Binance MCP Server). Binance MCP Server alone pairs 48 destructive tools with 47 financial tools: 95 ways to either break things irreversibly or move money in a single MCP install.

75.1%

of MCP servers that touch money also let an agent either destroy data or run a command on the host. The model has to pick the right tool, every time, from descriptions that 96.4% of the time don't warn it about consequences.

The named dual-risk servers

Every server below exposes both financial and destructive tools in a single install. An agent connecting to one of them can move money and delete records without changing context.

Server	Tools	Destructive	Financial
Yaver	775	37	1
Binance MCP Server	734	48	47
GoHighLevel MCP Server	566	82	3
GoCreative Agent API	546	7	1
Ruflo	487	25	8
Claude Flow	455	23	8
ServiceTitan MCP Server	454	28	4
CloudStack MCP Server	442	82	1
Serac	432	6	1
Wpm Mcp Server	420	13	4
Integrations MCP	420	9	1
GoHighLevel MCP Server	406	61	4
TheProtocol — Sovereign AI Agent Platform	393	23	6
Tencent Ad MCP Server	357	24	4
io.github.aibtcdev/mcp-server	353	16	17
ebay-mcp	332	37	4
Bybit MCP Server	326	14	1
MCP Dynamics 365 Commerce Server	326	12	13
Unity MCP Server	324	31	1
shopify-graphql-mcp	319	45	5
20i MCP Server	302	14	3
AIquila — Nextcloud MCP Server	295	50	1
Sanka MCP Server	284	36	2
Ncp	276	8	2
Tenzro Ledger MCP	275	6	7

5.5 Deep dive: the Stripe MCP

Stripe's MCP server exposes 39 tools to any agent that connects. 13% of them are classified destructive and 13% touch money directly. Ranked by risk weight, the three highest are create_payment_intent, create_refund, and full_refund. One MCP install hands all of them to the model as a flat list.

What it can move

5 of Stripe's tools are financial — the calls that move balances, charges, refunds, payouts, and transfers. An agent with the server connected can invoke any of them directly, with whatever arguments it infers from the request. In policy terms these are the operations that take money out of the account.

create_payment_intent, create_refund, full_refund, high_refund_ratio, pay_invoice

What it can destroy

5 tools are classified destructive — deletes, cancellations, and voids that the same API cannot reverse. A further 1 can execute. None of them carry warning language the model reads before calling; the category is inferred from the verb in the tool name, not declared by the server.

archive_customer, cancel_payment_intent, cancel_subscription, delete_customer, purge_expired_customers

What a deny-by-default policy looks like

A deny-by-default posture starts every Stripe tool denied and allows back only the read paths an agent needs — listing charges, retrieving a customer, reading a balance. The 10 destructive and financial tools stay denied unless a policy grants them explicitly, and the ones that are granted route through an approval gate rather than firing on the model's first decision. A worked example is published at policylayer.com/policies/stripe.
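As a rough illustration of that posture, the sketch below denies every tool unless it appears on an explicit list, and routes the one granted money-moving tool through an approval decision. The read-tool names and the choice to grant create_refund are hypothetical; this is not the policy published by PolicyLayer.

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"

# Hypothetical read-only tool names; the report does not list Stripe's read tools.
ALLOWED_READS = {"list_charges", "retrieve_customer", "retrieve_balance"}

# Explicitly granted, but only through a human approval gate (illustrative choice).
APPROVAL_GATED = {"create_refund"}

def evaluate(tool_name):
    """Return the policy decision for a tool call; anything unlisted is denied."""
    if tool_name in ALLOWED_READS:
        return Decision.ALLOW
    if tool_name in APPROVAL_GATED:
        return Decision.REQUIRE_APPROVAL
    return Decision.DENY
```

The important property is the final line: a tool the policy author never heard of, including one added in a later server release, falls through to deny rather than allow.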

Stripe's MCP is well-built; the point is not that it is unusually dangerous. The point is that the server cannot know which agent should be allowed to issue a refund. That decision belongs to the control plane in front of it.

5.6 Some MCP servers expose no read-only tools

670 servers (2.5% of the 27,105 servers with three or more tools) expose no read-only tool: every tool they ship is classified as something other than read. You cannot connect such a server in observe-only mode; installing it grants write access or worse from the first call.

Server	Tools	Destructive	Execute
Turf-MCP	113	1	110
Mulmocast Vision	83	0	0
crypto-indicators-mcp	78	2	0
io.github.daedalus/mcp-numpy	72	0	0
DaVinci Resolve MCP Server	53	0	0
Next Finance	51	1	0
FontLab MCP Server	38	7	1
Isparta Uni OBS MCP Server	34	0	3
io.github.MusaddiqueHussainLabs/mhlabs_mcp_tools	31	10	1
io.github.antvis/mcp-server-chart	27	0	0
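The check behind this cohort is simple to express. The sketch below assumes each server is represented as a list of per-tool category labels, with "read" as the read-only label; both the representation and the example servers are illustrative.

```python
def has_no_read_tool(tool_categories, min_tools=3):
    """True when a server with at least min_tools tools exposes no read-only tool."""
    return len(tool_categories) >= min_tools and "read" not in tool_categories

# Hypothetical servers, each a list of per-tool categories.
example_servers = {
    "render-server": ["write", "write", "execute"],
    "crm-server": ["read", "write", "destructive"],
    "tiny-server": ["write", "write"],  # below the three-tool floor, so excluded
}

no_read = sorted(name for name, cats in example_servers.items() if has_no_read_tool(cats))
```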

6. Official MCP registries are not noticeably safer

A common assumption is that "official" MCP listings are curated and therefore safer. The data does not support it. Average risk weight per tool barely moves between sources, and seed-listed servers (those originally added by hand to bootstrap the ecosystem) have the highest average risk weight of any source (0.239).

Source	Servers	Tools	Avg risk	% destructive	% execute
`glama`	29,953	314,332	0.205	5.8%	4.5%
`registry`	8,924	108,165	0.214	6.8%	3.5%
`crawler`	4,278	64,384	0.21	5.8%	5%
`discovery`	2,415	9,262	0.213	5.5%	4.9%
`smithery`	966	12,694	0.194	3.7%	3.7%
`seed`	336	7,853	0.239	6%	5.9%
`user_scan`	82	190	0.224	8.9%	3.2%

Every registry leaves risk evaluation to the developer installing the server. None of them gate on tool category, parameter danger, or the presence of unconfirmed write paths. Listing is curation in name only.

7. Two of the six most common MCP verbs are destructive

The MCP ecosystem speaks one language: CRUD. Across 517,973 tools, the four most common verbs after get and list are create, search, update, and delete. Two of the top six are mutations the model cannot undo. The protocol provides no separation between any of them.

get_* 84,622

list_* 30,423

create_* 16,764

search_* 14,140

update_* 9,644

delete_* 9,542

add_* 5,654

generate_* 5,157

set_* 4,763

polymarket_* 3,963

check_* 3,826

find_* 2,943

delete_* appears 9,542 times. That is roughly one delete-named tool for every three to four servers in the 32,820-server dataset, before counting tools that are destructive without using the word ("drop", "remove", "wipe", "purge"). Verb shape is the cheapest signal a client could act on; nothing in MCP requires clients to use it, so they don't.
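A verb tally like the one above only needs the first token of each tool name. The sketch below is one plausible way to extract it, assuming snake_case or kebab-case names; the report does not describe its tokenizer, and camelCase names are deliberately left unsplit here.

```python
from collections import Counter

def first_token(tool_name):
    """Return the leading token of a snake_case or kebab-case tool name."""
    return tool_name.lower().replace("-", "_").split("_", 1)[0]

def verb_counts(tool_names):
    """Count tools by their leading token."""
    return Counter(first_token(name) for name in tool_names)

# Hypothetical tool names.
example_names = ["get_user", "list_rows", "delete_rows", "delete-customer", "get_invoice"]
example_counts = verb_counts(example_names)
```

A client could use the same signal at connect time, for example by hiding or gating every tool whose first token is delete until a user opts in.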

8. MCP tools don't brief the agent. 96.4% give no warning at all.

MCP tool descriptions go directly into the model's context as the only briefing it gets. We searched all 517,973 classified tool descriptions for warning language — "irreversible", "permanent", "cannot be undone", "destroys", "wipes", "deletes", "drops", "purges". Only 18,790 tools (3.6%) contain any of those phrases.
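A search of this kind can be approximated with a case-insensitive substring match over each description. The phrase list below is the one quoted above; the matching rule itself is an assumption, since the report does not say whether it matched substrings, whole words, or stems.

```python
WARNING_PHRASES = (
    "irreversible", "permanent", "cannot be undone", "destroys",
    "wipes", "deletes", "drops", "purges",
)

def has_warning(description):
    """True when a tool description contains any of the warning phrases."""
    text = (description or "").lower()
    return any(phrase in text for phrase in WARNING_PHRASES)

def warning_rate(descriptions):
    """Fraction of descriptions that contain warning language."""
    if not descriptions:
        return 0.0
    return sum(has_warning(d) for d in descriptions) / len(descriptions)

# Hypothetical descriptions.
example_descriptions = [
    "Permanently deletes the record. This cannot be undone.",
    "Delete rows matching a filter",
    "List rows in a table",
    None,
]
example_rate = warning_rate(example_descriptions)  # 1 of 4 descriptions warns
```

Note that substring matching counts "permanently" as a hit for "permanent" but misses the bare imperative "Delete rows", which is exactly the kind of description the 96.4% figure is about.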

The other 96.4% rely on the model inferring danger from the verb in the tool name. For a request like "clean up duplicate rows", an agent given fifty CRUD tools and no warnings will pick the one whose name matches the verb. delete_rows is the obvious match. There is no semantic signal that distinguishes it from list_rows.

A further 1.7% of servers (543) accept parameters whose names imply filesystem paths or shell command strings — path, filename, command, script, exec, stdin. These tools provide direct write or execution surfaces against the host the server runs on, regardless of whether they are classified as destructive. The wider catalogue of documented MCP attack patterns shows how prompt injection, tool poisoning, and supply-chain compromise convert these surfaces into incidents.
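The parameter heuristic can be sketched against the inputSchema that each MCP tool definition carries. The example assumes an exact, case-insensitive match on property names, which is stricter than "names imply"; the report's own matching rule is not specified, and the sample tools are invented.

```python
RISKY_PARAMS = {"path", "filename", "command", "script", "exec", "stdin"}

def risky_parameters(tool):
    """Return the tool's parameter names that suggest filesystem or shell access."""
    schema = tool.get("inputSchema") or {}
    properties = schema.get("properties") or {}
    return sorted(name for name in properties if name.lower() in RISKY_PARAMS)

def server_is_flagged(tools):
    """True when any tool on the server accepts a risky parameter."""
    return any(risky_parameters(tool) for tool in tools)

# Hypothetical tool definitions in the shape returned by tools/list.
write_file = {
    "name": "write_file",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}, "content": {"type": "string"}},
    },
}
get_weather = {
    "name": "get_weather",
    "inputSchema": {"type": "object", "properties": {"city": {"type": "string"}}},
}
```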

9. The trust boundary is the developer's restraint

The MCP specification ships with no built-in per-tool authorisation, no rate limits, no spend caps, and no audit trail. Servers expose whatever their authors decided to expose, in whatever shape, with whatever description. Clients pass tool lists to models with no enforced filter. Models call tools with whatever arguments they think appropriate. The trust boundary is the developer's restraint when they write the server.

This dataset puts numbers on the consequences:

One in four MCP servers can delete or destroy data.
One in four can execute arbitrary commands on its host.
The average install hands the agent 15.8 tools, often more than 30.
3.6% of tools warn the model about what they do. The other 96.4% don't.
Official, semi-official and community registries show no meaningful risk gap.

Most teams would not ship an internal platform where every endpoint is unauthenticated and uncategorised, where one service in four ships an endpoint that can delete production data, and where 96.4% of endpoints have no documentation about side effects. That is the MCP ecosystem today. Whether your agent runs on it is a control-plane decision, not a server-author decision.

The fix is not to ban destructive tools. The fix is enforcement at the transport layer: every tool call evaluated against a deterministic policy before it reaches the server. For the broader picture of how the protocol breaks under production conditions, see the canonical MCP security overview.
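To make "evaluated before it reaches the server" concrete, the sketch below inspects a JSON-RPC message, looks up the category of the tool named in a tools/call request, and returns a decision. The category table and policy mapping are hypothetical, and letting non-tool messages pass is an illustrative design choice, not a description of PolicyLayer's gateway.

```python
import json

# Hypothetical category lookup, for example built from an earlier tools/list scan.
TOOL_CATEGORIES = {
    "list_rows": "read",
    "delete_rows": "destructive",
    "run_query": "execute",
}

# Hypothetical policy: category -> decision. Categories not listed are denied.
POLICY = {"read": "allow", "destructive": "require_approval"}

def gate(raw_message):
    """Decide what to do with one JSON-RPC message before forwarding it."""
    message = json.loads(raw_message)
    if message.get("method") != "tools/call":
        return "allow"  # only tool invocations are policy-checked in this sketch
    tool_name = (message.get("params") or {}).get("name")
    category = TOOL_CATEGORIES.get(tool_name)
    # Unknown tools and unlisted categories fall through to deny.
    return POLICY.get(category, "deny")

example_call = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "delete_rows", "arguments": {"table": "users"}},
})
example_decision = gate(example_call)
```

The decision is deterministic: the same message and the same policy always produce the same outcome, independent of what the model intended.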

PolicyLayer is the MCP control plane:

A gateway in front of every MCP server in your fleet, with managed OAuth that holds and refreshes upstream tokens transparently.
A policy editor that discovers each server's tools so you can gate by category — destructive, financial, execute — instead of by tool name.
Scoped per-agent grants, decoupled from user identity, so revoking one agent doesn't break the rest of your stack.
A per-call audit log keyed to the grant that made the call, with full arguments, outcome, and latency.

One install. Every server. Scan your config to get the same picture this report shows, but for your stack — in 30 seconds.

Methodology

PolicyLayer maintains a continuously-updated catalogue of MCP servers harvested from the official Model Context Protocol registry, npm, Smithery, and Glama. For each server we attempt to extract its tool list through one of three paths:

Static analysis — grep the published npm tarball for tool definitions.
README extraction — parse README for tool tables and code blocks.
Live execution — spawn the server via npx in a sandboxed container and read its tools/list response.

The 32,820 servers in this report are those for which at least one path produced a parseable tool list. Tools are classified into six risk categories (Read, Write, Execute, Destructive, Financial, Other) using a verb-based classifier with input-schema heuristics. 76.1% of tool classifications are marked high-confidence, 0.3% verified.
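For readers who want a feel for what a verb-based classifier does, here is a deliberately small sketch. The verb-to-category table is invented for illustration and is not the classifier used in this report, which also applies input-schema heuristics and confidence levels that are omitted here.

```python
# Hypothetical verb table; the report's actual mapping is not published in this section.
VERB_CATEGORIES = {
    "get": "read", "list": "read", "search": "read", "find": "read", "check": "read",
    "create": "write", "update": "write", "add": "write", "set": "write",
    "delete": "destructive", "drop": "destructive", "remove": "destructive",
    "wipe": "destructive", "purge": "destructive",
    "exec": "execute", "run": "execute",
    "pay": "financial", "transfer": "financial",
}

def classify(tool_name):
    """Classify a tool by the leading verb of its name; unknown verbs map to 'other'."""
    verb = tool_name.lower().split("_", 1)[0]
    return VERB_CATEGORIES.get(verb, "other")
```

A name-only rule like this labels create_refund as write rather than financial, which is one reason a verb table on its own is not enough and schema heuristics are needed on top.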

Risk weights are floats from 0.0 (read-only) to 1.0 (destructive financial). A server's risk score is its tool count multiplied by the average risk weight of its tools, so a server scores highly only when it exposes many tools and those tools skew dangerous. It is a measure of total exposed surface, not per-tool severity. The full classified catalogue (one row per server, one row per tool) is published as an open dataset on Hugging Face under CC-BY-4.0: huggingface.co/datasets/PolicyLayer/mcp-server-catalogue. It can be loaded with load_dataset("PolicyLayer/mcp-server-catalogue"). Methodology questions or custom cuts: research@policylayer.com.

Limitations. The dataset only covers servers reachable through public registries; private and self-hosted servers are not included. Tool-level classification can mislabel ambiguous verbs ("update" can be safe or destructive depending on parameters); the confidence breakdown above surfaces these. Some registry-listed servers were unreachable through our scan pipeline and are excluded from the figures here; the dataset is therefore a lower bound on the real ecosystem.

Take your agents live. Without losing control.

Route your MCP traffic through PolicyLayer. Every tool call is checked against your policy before it runs: allow, deny, or require approval. Per-identity grants. Full audit log. Live in minutes.


46,500+ MCP servers and 515,000+ tools scanned and risk-classified.