Filter sensitive words from text by replacing them with a replacement string
AI agents use filter_sensitive_words to create or update resources in Sensitive Lexicon — usually the action step of a workflow, after the agent has gathered context. Every call changes real data in your Sensitive Lexicon environment.
This tool modifies text content by replacing sensitive words with a substitute string. It creates a transformed version of the input text — a reversible, non-destructive modification — which fits the Write category. It does not delete data permanently, execute code, or involve financial transactions.
From the tool's definition: "Filter sensitive words from text by replacing them with a replacement string."
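To make the tool's behaviour concrete, here is a minimal Python sketch of word filtering by replacement. It is not the Sensitive Lexicon server's code: the function signature, the case-insensitive matching, and the default replacement of *** are all assumptions made for illustration.

```python
import re


def filter_sensitive_words(text: str, words: list[str], replacement: str = "***") -> str:
    """Return a copy of `text` with every listed word replaced (hypothetical sketch)."""
    # Drop empty entries, and try longer words first so "foobar" wins over "foo".
    ordered = sorted({w for w in words if w}, key=len, reverse=True)
    if not ordered:
        return text
    # Escape each word so it is matched literally, not as a regex.
    pattern = re.compile("|".join(re.escape(w) for w in ordered), re.IGNORECASE)
    # A function replacement avoids backslash interpretation in `replacement`.
    return pattern.sub(lambda _match: replacement, text)


original = "a secret plan"
filtered = filter_sensitive_words(original, ["secret"])
print(filtered)
```

The input string is left untouched and a new string is returned, which matches the description of the tool as producing a transformed version of the text rather than deleting anything.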
Documented attack patterns abuse exactly the kind of access filter_sensitive_words gives an agent.
PolicyLayer is an MCP gateway — it sits between your AI agents and Sensitive Lexicon, and nothing reaches the server without passing your rules. This is the rule we recommend for filter_sensitive_words:
{
  "version": "1",
  "default": "deny",
  "tools": {
    "filter_sensitive_words": {
      "limits": [
        {
          "counter": "filter_sensitive_words_rate",
          "window": "minute",
          "max": 30,
          "scope": "grant"
        }
      ]
    }
  }
}
filter_sensitive_words stays usable but capped, so an agent stuck in a loop can't make hundreds of changes a minute. Everything else on the server is denied unless you say otherwise.
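The effect of a cap of 30 calls per minute can be sketched as a fixed-window counter. This is an illustration only: the source does not say whether PolicyLayer uses fixed or sliding windows, and treating "scope": "grant" as one counter per grant key is an assumption. The class and parameter names are hypothetical.

```python
import time


class FixedWindowLimiter:
    """Allow at most `max_calls` per key within each fixed time window (sketch)."""

    def __init__(self, max_calls: int = 30, window_seconds: float = 60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._state = {}    # key -> (window index, calls counted in that window)

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._state.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so the counter resets
        if count >= self.max_calls:
            self._state[key] = (window, count)
            return False
        self._state[key] = (window, count + 1)
        return True


now = [0.0]
limiter = FixedWindowLimiter(max_calls=30, window_seconds=60, clock=lambda: now[0])
results = [limiter.allow("grant-1") for _ in range(31)]
print(results.count(True), results[-1])
```

With the clock held still, the 31st call for the same key is refused while a different key is unaffected; once the clock moves into the next window, the first key is allowed again.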
Free to start. No card required.
filter_sensitive_words filters sensitive words from text by replacing them with a replacement string. It is categorised as a Write tool in the Sensitive Lexicon MCP Server, which means it can create or modify data. Consider rate limits to prevent runaway writes.
Register the Sensitive Lexicon MCP server in PolicyLayer and add a rule for filter_sensitive_words: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Sensitive Lexicon. Nothing to install.
filter_sensitive_words is a Write tool with medium risk. Write tools should be rate-limited to prevent accidental bulk modifications.
You can rate-limit filter_sensitive_words. Add a limit to its rule in your PolicyLayer policy, in the same shape as the limits entry in the recommended policy above: for example, max: 10 with window: minute limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
To block filter_sensitive_words entirely, set action: deny in its rule in the PolicyLayer policy. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
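The deny behaviour, together with the default-deny setting in the recommended policy, can be sketched as a small decision function. This is a hedged illustration and not PolicyLayer's implementation: the Decision type, the decide function, and the exact placement of the action and reason keys inside a tool rule are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    reason: str


def decide(policy: dict, tool_name: str) -> Decision:
    """Decide whether a tool call may proceed under a policy dict (hypothetical schema)."""
    rule = policy.get("tools", {}).get(tool_name)
    if rule is None:
        # No rule for this tool: fall back to the policy-wide default.
        allowed = policy.get("default", "deny") == "allow"
        return Decision(allowed, f"no rule for {tool_name}; default applies")
    if rule.get("action") == "deny":
        # An explicit deny wins, carrying the optional human-readable reason.
        return Decision(False, rule.get("reason", "blocked by policy"))
    return Decision(True, "rule present and not denied")


policy = {
    "version": "1",
    "default": "deny",
    "tools": {
        "filter_sensitive_words": {"action": "deny", "reason": "filtering is paused"},
    },
}
print(decide(policy, "filter_sensitive_words"))
```

A real gateway would also evaluate rate limits and approval requirements before allowing a call; the sketch covers only the deny and default paths.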
filter_sensitive_words is provided by the Sensitive Lexicon MCP server (zephyrpersonal/sensitive-lexicon-mcp). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.
Start from Sensitive Lexicon, add the rest of your stack, and see everything your agents can call. Then put policy on all of it.
4 Sensitive Lexicon tools catalogued and risk-classified — across an index of 43,000+ MCP servers.