Low Risk

speech_to_text

Convert speech to text using microphone recording or existing audio file

What speech_to_text does on A-Modular-Kingdom

AI agents call speech_to_text to transcribe audio on A-Modular-Kingdom, from either a microphone recording or an existing audio file, without modifying anything on the server. It is typically the context-gathering step in research, monitoring, and reporting workflows, before the agent takes action elsewhere.

Why speech_to_text needs a policy

Speech-to-text transcription is a read operation: it captures or processes audio input and outputs text. It does not create, modify, or delete data, and it does not execute external operations; it only transforms one data format into another. The presence of more severe tools on the same server (code_execute, delete_memory) does not elevate this tool's classification. Confidence in the classification is high because the tool description is clear and unambiguous.

From the tool's definition: the tool name and description indicate that it 'Convert[s] speech to text' from 'microphone recording or existing audio file', a data retrieval/transcription operation with no side effects on system state.

Documented attack patterns abuse exactly the kind of access speech_to_text gives an agent:

How to control speech_to_text

PolicyLayer is an MCP gateway — it sits between your AI agents and A-Modular-Kingdom, and nothing reaches the server without passing your rules. This is the rule we recommend for speech_to_text:

policy.json
{
  "version": "1",
  "default": "deny",
  "tools": {
    "speech_to_text": {}
  }
}

speech_to_text is read-only, so it stays allowed — but everything else on the server is denied unless you say otherwise.

  1. Create a free account and register A-Modular-Kingdom — nothing to install.
  2. Add this policy — paste it, or build it visually.
  3. Point your MCP client (Claude, Cursor, anything) at your gateway URL.

Free to start. No card required.

Questions about speech_to_text

What does the speech_to_text tool do?

It converts speech to text using a microphone recording or an existing audio file. It is categorised as a Read tool in the A-Modular-Kingdom MCP Server, which means it retrieves data without modifying state.

How do I enforce a policy on speech_to_text?

Register the A-Modular-Kingdom MCP server in PolicyLayer and add a rule for speech_to_text: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches A-Modular-Kingdom. Nothing to install.

What risk level is speech_to_text?

speech_to_text is a Read tool with low risk. Read-only tools are generally safe to allow by default.

Can I rate-limit speech_to_text?

Yes. Add a rate_limit block to the speech_to_text rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
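As a rough sketch of the rate limit described above, the rule might look like the following. It assumes the rate_limit block nests directly under the tool entry in the same policy.json layout shown earlier, and that window is measured in seconds; neither detail is confirmed on this page, so check the policy builder before relying on it.

```json
{
  "version": "1",
  "default": "deny",
  "tools": {
    "speech_to_text": {
      "rate_limit": { "max": 10, "window": 60 }
    }
  }
}
```

With these values, an eleventh call within the same 60-second window would be expected to be rejected until the window resets.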

How do I block speech_to_text completely?

Set action: deny in the PolicyLayer policy for speech_to_text. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
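A minimal sketch of such a deny rule, assuming action and reason sit directly under the tool entry in the policy.json layout shown earlier (the placement is an assumption, and the reason text is purely illustrative):

```json
{
  "version": "1",
  "default": "deny",
  "tools": {
    "speech_to_text": {
      "action": "deny",
      "reason": "Audio capture is not permitted in this workspace"
    }
  }
}
```

Because this policy already sets default to deny, removing the speech_to_text entry altogether would presumably block the tool as well; the explicit rule mainly exists to attach a reason to the error.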

What MCP server provides speech_to_text?

speech_to_text is provided by the A-Modular-Kingdom MCP server (masihmoafi/a-modular-kingdom). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.

Enforce policy on every A-Modular-Kingdom tool call.

Start from A-Modular-Kingdom, add the rest of your stack, and see everything your agents can call. Then put policy on all of it.

14 A-Modular-Kingdom tools catalogued and risk-classified — across an index of 43,000+ MCP servers.
