← Attack Database

Backdoored community MCP server

Supply chain verified

Backdoored community MCP server

Summary

A backdoored community MCP server is one that is intentionally malicious from the outset — published to npm, PyPI, a third-party MCP registry (mcp.so, Smithery, Glama), or a skills/extensions marketplace, specifically to be installed by AI agent users. Unlike a compromised package, where a legitimate project is hijacked, a backdoored community server is author-malicious from day one. The unofficial MCP ecosystem indexes 16,000+ servers across registries with no identity verification, little review, and no required signing — making this the cheapest supply-chain attack in the stack: publish, wait, exploit.

How it works

  1. Attacker publishes an MCP server with an attractive README — “fast Postgres helper”, “AI-native Slack bot”, “Notion-to-Obsidian sync”. Optional: clone the code of a legitimate server and add the backdoor.
  2. The package is listed on one or more MCP registries. Registries typically accept submissions with no code audit and no publisher identity binding.
  3. A developer finds the server via registry search, README keyword, or LLM recommendation (agents suggest MCP servers by name from their training data) and installs it.
  4. The backdoor fires — typically on one of three triggers:
    • Install-time: postinstall/preinstall scripts exfiltrate environment variables, cloud credential files, SSH keys, and npm/GitHub tokens before any MCP handshake. TruffleHog-style secret scanning is common.
    • First call: The MCP server caches ~/.aws/credentials, .env, shell history on first tool invocation and POSTs to an attacker endpoint.
    • Tool invocation: Tools perform their advertised function but also silently siphon arguments/results (emails BCC’d, DB queries logged, files uploaded).
  5. Stolen credentials are sold or used for further campaigns; tool arguments provide continuous exfiltration.

Kaspersky’s Securelist team published a full PoC following this exact chain: a slick-looking PyPI MCP server whose README advertises useful features, registered in a client config by the victim developer, which on first call caches credential files and environment variables, then POSTs them to the attacker’s API.

Real-world example

postmark-mcp (September 2025)

The clearest named in-the-wild case. The npm package postmark-mcp, published by phanpak, was backdoored from the point of publication — it was never the legitimate Postmark project. See the compromised package page for full details; it also qualifies as a backdoored community server under this taxonomy because the author was malicious from day one, the package was hosted on a public registry (npm), and it reached victims through normal community discovery. Version 1.0.16 (17 September 2025) BCC’d every outgoing email to phan@giftshop.club.

Kaspersky Securelist PoC (2025)

Kaspersky’s threat-research team published a reproducible proof-of-concept demonstrating how a backdoored community MCP server on PyPI can weaponise the normal install-and-register flow. They confirmed the vector as live and abusable; they did not name a specific victim beyond the PoC. Source: “Malicious MCP servers used in supply chain attacks”, Securelist.

VirusTotal audit (June 2025) — 17,845 MCP GitHub repos

VirusTotal harvested 17,845 likely MCP server repositories from GitHub and ran Code Insight analysis on each. Roughly 8% (~1,400 projects) were flagged as potentially malicious or containing serious vulnerabilities. VirusTotal caveat that many flags were sloppy “hello world” examples or intentional research PoCs rather than confirmed in-the-wild attacks — but the scale of the flagged set (in the low thousands) establishes that low-reputation MCP servers on public registries should be assumed hostile by default.

AgentSeal scan (2025) — 1,808 MCP servers

AgentSeal scanned 1,808 MCP servers from public sources and found 66% had security findings. Astrix’s 2025 State of MCP Server Security report separately found that 88% of MCP servers require credentials and 53% rely on long-lived static secrets (API keys, PATs), versus just 8.5% using OAuth — meaning a backdoored server is typically handed a long-lived credential on install.

OpenClaw ClawHub skills marketplace (January–February 2026) — adjacent ecosystem

In January 2026 the AI agent platform OpenClaw (180,000+ GitHub stars) had ~1,000 publicly exposed instances grow to 21,000+ within a week (Censys, 25–31 January 2026), many running without authentication and leaking Anthropic API keys, Telegram tokens, Slack sessions, and months of chat history. On 13 February 2026 researchers disclosed (GitHub issue #16052) that 341 skills on ClawHub — the community skills marketplace — were compromised in a coordinated supply-chain attack primarily delivering Atomic macOS Stealer (AMOS); later scans reported 800+ malicious skills (~20% of the registry). This is not an MCP server compromise, but it is the clearest 2026 datapoint for how a community “tools/skills” registry with no identity verification degrades when it reaches scale — exactly the failure mode the MCP community registries are heading into. See The Register, Kaspersky, Conscia, and CVE-2026-25253 (NVD).

Impact

  • Credential theft at scale. Any secret reachable from the MCP server process — which is everything the user can reach.
  • Recurring exfiltration. Unlike a one-shot steal, a backdoored MCP server stays running as long as the developer uses it, providing continuous access to every tool call.
  • Model manipulation. Tool outputs can be poisoned to steer the agent into further attacker-controlled actions.
  • Reputation laundering. Legitimate-looking READMEs and GitHub stars (often bought) make backdoored servers hard to distinguish from safe ones.
  • Registry-wide risk. With 16,000+ servers across unofficial registries and no identity binding, the attack surface is expanding faster than any review process.

Detection

  • Prefer first-party servers. Only install MCP servers published by the vendor whose API they wrap (official Stripe, GitHub, Postgres, etc.). Treat third-party wrappers of those APIs as hostile by default.
  • Verify publisher identity. Cross-check the npm/PyPI publisher against the vendor’s own documentation. phanpak did not match Postmark.
  • Read the code. MCP servers are small (typically a few hundred lines). Diff against the upstream repo if one exists.
  • Disable postinstall scripts (npm install --ignore-scripts) and run servers in a sandbox.
  • Egress monitoring. A backdoored server’s first tell is usually an outbound connection to a domain unrelated to the service it wraps.
  • Scan with MCP-specific tools. mcp-scan (Invariant Labs), mcpshield, and Socket.dev can flag known-malicious packages and suspicious patterns.
  • Audit tools/list before approval. If a server advertises a tool unrelated to its stated purpose (a “Postmark” server exposing read_file), reject it.

Prevention

Transport-layer policy enforcement at the MCP boundary cannot stop install-time backdoors (those execute before the proxy runs), but it dramatically reduces what a backdoored server can do once running. Combine with install-time sandboxing for full coverage.

Intercept’s primary levers against a backdoored community server:

  1. default: deny allowlist — only the tools you listed at policy-write time are callable. New tools the backdoor adds cannot be invoked.
  2. hide sensitive tools — strip tools with destructive or credential-reaching capability from tools/list so the model cannot be prompt-injected into calling them.
  3. Argument conditions — lock tool arguments to safe shapes (paths under a prefix, recipients on a domain, amounts under a cap), so a malicious implementation cannot coerce the tool into exfiltration inputs.
  4. require_approval on anything that writes, deletes, sends, or spends.
  5. Rate limits and spend caps — bound the damage of any single session.

Example — a defensive policy for a low-trust third-party Postgres MCP server:

version: "1"
description: "Third-party Postgres MCP — minimal privilege"
default: deny

hide:
  - execute_sql          # too broad; removed from tools/list entirely
  - drop_table
  - truncate_table

tools:
  list_tables:
    rules:
      - name: "hourly list limit"
        rate_limit: 30/hour

  describe_table:
    rules:
      - name: "approved schemas only"
        conditions:
          - path: "args.schema"
            op: "in"
            value: ["public", "reporting"]
        on_deny: "Schema not on approved list"

  select_rows:
    rules:
      - name: "row cap"
        conditions:
          - path: "args.limit"
            op: "lte"
            value: 100
        on_deny: "Queries limited to 100 rows"

      - name: "no PII tables"
        conditions:
          - path: "args.table"
            op: "not_in"
            value: ["users", "payments", "pii_audit"]
        on_deny: "That table is not accessible via AI agents"

  "*":
    rules:
      - name: "global rate cap"
        rate_limit: 60/minute

Defence-in-depth beyond Intercept:

  • Pin exact versions; never @latest.
  • Disable install scripts (npm install --ignore-scripts, pip install --no-build-isolation where appropriate).
  • Run each MCP server in a dedicated container with a minimum-privilege service account and no access to the developer’s real credentials.
  • Use a secret-scoped wrapper (short-lived tokens issued per session) rather than handing the MCP server your root API key.

Drift detection (roadmap, speculative). A future Intercept feature would hash tool manifests on first connection and alert on any change, mirroring the ETDI proposal (arXiv:2506.01333). Not yet implemented — use allowlisting in the meantime.

Sources

Protect your agent in 30 seconds

Scans your MCP config and generates enforcement policies for every server.

npx -y @policylayer/intercept init
github.com/policylayer/intercept →
// GET IN TOUCH

Have a question or want to learn more? Send us a message.

Message sent.

We'll get back to you soon.