clean_text

Remove HTML tags, fix encoding issues, normalize whitespace, and extract clean text from messy input. Perfect for agents processing scraped web content or user-submitted text.

Part of the Structured Data Validator MCP server. Enforce policies on this tool with Intercept, the open-source MCP proxy.

Without a policy, an AI agent could call clean_text in a loop and permanently destroy resources in Structured Data Validator, with no way to undo the damage. Intercept blocks destructive tools by default and only allows them when a human explicitly approves the action.

Destructive tools permanently remove data. Block by default. Only enable with explicit approval workflows.

io-github-agenson-horrowitz-structured-data-validator.yaml
tools:
  clean_text:
    rules:
      - action: deny
        reason: "Blocked by default — enable with approval"

See the full Structured Data Validator policy for all 5 tools.

Tool Name: clean_text
Category: Destructive
Risk Level: Critical

clean_text is one of the critical-risk operations in Structured Data Validator. For a severity-focused view that lists only the critical-risk tools with their recommended policies, see the breakdown for this server, or browse all critical-risk tools across every MCP server.

What does the clean_text tool do?

clean_text removes HTML tags, fixes encoding issues, normalizes whitespace, and extracts clean text from messy input, which suits agents processing scraped web content or user-submitted text. It is categorised as a Destructive tool in the Structured Data Validator MCP Server, which means it can permanently delete or destroy data. Block it by default and require explicit approval.

How do I enforce a policy on clean_text?

Add a rule in your Intercept YAML policy under the tools section for clean_text. You can allow, deny, rate-limit, or validate arguments. Then run Intercept as a proxy in front of the Structured Data Validator MCP server.

What risk level is clean_text?

clean_text is a Destructive tool with critical risk. Critical-risk tools should be blocked by default and only enabled with explicit human approval.

Can I rate-limit clean_text?

Yes. Add a rate_limit block to the clean_text rule in your Intercept policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
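
A minimal sketch of the rate limit described above. The keys rate_limit, max, and window come from the answer; their exact nesting under the rule, the use of action: allow alongside them, and the interpretation of window as seconds are assumptions, so check the Intercept policy reference before relying on this layout.

```yaml
# Hypothetical layout: only rate_limit, max, and window are named in the text above.
tools:
  clean_text:
    rules:
      - action: allow        # assumed; the rule must permit calls for a limit to matter
        rate_limit:
          max: 10            # at most 10 calls ...
          window: 60         # ... per 60-second window (unit assumed to be seconds)
```

Because clean_text is classed as critical risk, a rule like this would replace the default deny rule, so it only makes sense once a human has approved enabling the tool.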

How do I block clean_text completely?

Set action: deny in the Intercept policy for clean_text. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.

What MCP server provides clean_text?

clean_text is provided by the Structured Data Validator MCP server (@agenson-horrowitz/structured-data-validator-mcp). Intercept sits as a proxy in front of this server to enforce policies before tool calls reach the server.

Enforce policies on Structured Data Validator

Open source. One binary. Zero dependencies.

npx -y @policylayer/intercept
github.com/policylayer/intercept