AI agents call extract_urls to retrieve information from Ragdocs without modifying anything — typically the context-gathering step in research, monitoring, and reporting workflows, before the agent takes action elsewhere.
Even though extract_urls only reads data, uncontrolled read access can leak sensitive information and rack up API costs: an agent caught in a retry loop can make thousands of calls a minute without anyone noticing.
Documented attack patterns abuse exactly the kind of access extract_urls gives an agent.
PolicyLayer is an MCP gateway — it sits between your AI agents and Ragdocs, and nothing reaches the server without passing your rules. This is the rule we recommend for extract_urls:
{
  "version": "1",
  "default": "deny",
  "tools": {
    "extract_urls": {}
  }
}
extract_urls is read-only, so it stays allowed, but everything else on the server is denied unless you say otherwise.
Free to start. No card required.
Extract and analyze all URLs from a given web page. This tool crawls the specified webpage, identifies all hyperlinks, and optionally adds them to the processing queue. Useful for discovering related documentation pages, API references, or building a documentation graph. Handles various URL formats and validates links before extraction. It is categorised as a Read tool in the Ragdocs MCP Server, which means it retrieves data without modifying state.
Register the Ragdocs MCP server in PolicyLayer and add a rule for extract_urls: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Ragdocs. Nothing to install.
extract_urls is a Read tool with low risk. Read-only tools are generally safe to allow by default.
You can rate-limit extract_urls by adding a rate_limit block to its rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
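A minimal sketch of such a rule, reusing the layout of the recommended policy above. The max and window values come from the example in the text; placing rate_limit directly under the tool entry, and reading window as a number of seconds, are assumptions to check against the PolicyLayer documentation.

```json
{
  "version": "1",
  "default": "deny",
  "tools": {
    "extract_urls": {
      "rate_limit": {
        "max": 10,
        "window": 60
      }
    }
  }
}
```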
To block extract_urls, set action: deny in its rule in the PolicyLayer policy. The AI agent receives a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
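A blocking rule might look like the following sketch. The action and reason field names come from the text above; their placement under the tool entry and the reason string itself are illustrative assumptions, not confirmed syntax.

```json
{
  "version": "1",
  "default": "deny",
  "tools": {
    "extract_urls": {
      "action": "deny",
      "reason": "URL extraction is disabled for this agent"
    }
  }
}
```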
extract_urls is provided by the Ragdocs MCP server (sanderkooger/mcp-server-ragdocs). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.
Deterministic rules across all 7 Ragdocs tools. Per-identity grants. Full audit log. Live in minutes. Nothing to install.
7 Ragdocs tools catalogued and risk-classified — across an index of 42,500+ MCP servers.