Low Risk

tavily-crawl

A powerful web crawler that initiates a structured web crawl starting from a specified base URL. The crawler expands from that point like a tree, following internal links across pages. You can control how deep and wide it goes, and guide it to focus on specific sections of the site.

What tavily-crawl does on Tavily Web Search and Extraction Server

AI agents call tavily-crawl to retrieve information from Tavily Web Search and Extraction Server without modifying anything — typically the context-gathering step in research, monitoring, and reporting workflows, before the agent takes action elsewhere.

| Parameter | Type | Description |
|---|---|---|
| url | string | The root URL to begin the crawl |
| limit | integer | Total number of links the crawler will process before stopping |
| max_depth | integer | Maximum depth of the crawl; defines how far from the base URL the crawler can explore |
| categories | array | Filter URLs using predefined categories such as documentation, blog, or api |
| max_breadth | integer | Maximum number of links to follow per level of the tree (i.e., per page) |
| instructions | string | Natural-language instructions for the crawler |
| select_paths | array | Regex patterns to select only URLs with specific path patterns (e.g., /docs/.*, /api/v1.*) |
| extract_depth | string | Extraction depth; advanced extraction retrieves more data, including tables and embedded content, with a higher success rate but possibly higher latency |
| allow_external | boolean | Whether to allow following links that go to external domains |
| select_domains | array | Regex patterns to restrict crawling to specific domains or subdomains (e.g., ^docs\.example\.com$) |

Parameters from the server's own tool schema.
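
As a rough sketch of how these parameters fit together, the snippet below checks a set of tavily-crawl arguments against the types in the table above and wraps them in an MCP `tools/call` JSON-RPC request. The parameter names and types come from the table; the `build_crawl_request` helper, the example values, and the `docs.example.com` URL are illustrative assumptions, not defaults documented by the server. The sketch also assumes `url` is mandatory, since a crawl has no starting point without it.

```python
import json

# Expected Python type for each parameter, taken from the table above.
PARAM_TYPES = {
    "url": str,
    "limit": int,
    "max_depth": int,
    "categories": list,
    "max_breadth": int,
    "instructions": str,
    "select_paths": list,
    "extract_depth": str,
    "allow_external": bool,
    "select_domains": list,
}


def build_crawl_request(request_id, **arguments):
    """Build an MCP tools/call request for tavily-crawl (hypothetical helper)."""
    for name, value in arguments.items():
        if name not in PARAM_TYPES:
            raise ValueError(f"unknown parameter: {name}")
        expected = PARAM_TYPES[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{name} must be an integer")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
    if "url" not in arguments:
        raise ValueError("url is needed as the crawl starting point")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "tavily-crawl", "arguments": arguments},
    }


request = build_crawl_request(
    1,
    url="https://docs.example.com",  # illustrative URL
    max_depth=2,                     # illustrative values, not server defaults
    max_breadth=20,
    limit=50,
    select_paths=["/docs/.*"],
    allow_external=False,
)
print(json.dumps(request, indent=2))
```

Validating on the client side like this is optional; the server's own schema remains the source of truth for what it accepts.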

Why tavily-crawl needs a policy

This tool retrieves and maps website structure and content through systematic browsing. It has no side effects on the target systems—it reads publicly accessible web data. The blast radius of misuse is limited to potentially generating excessive traffic to a website or gathering information that may already be publicly available.

From the tool's definition: The tool description states that it 'initiates a structured web crawl starting from a specified base URL' and 'following internal links across pages.' The verb 'crawl' and the focus on 'extract data from web pages' (from the server description) indicate data retrieval…

Risk signals: Accepts URL/endpoint input (url) · High parameter count (10 properties)
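
One way to bound these risk signals is to check crawl arguments before they leave the agent. The sketch below is a hypothetical pre-flight check, not a PolicyLayer or Tavily feature: the `ALLOWED_HOSTS` set, the numeric caps, and the `check_crawl_arguments` function are all assumptions chosen for illustration.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"docs.example.com"}  # hypothetical allowlist
MAX_LIMIT = 100                       # hypothetical cap on total links
MAX_DEPTH = 3                         # hypothetical cap on crawl depth


def check_crawl_arguments(arguments):
    """Return a list of reasons to reject the call; empty means acceptable."""
    problems = []
    parsed = urlparse(arguments.get("url", ""))
    if parsed.scheme not in ("http", "https"):
        problems.append("url must use http or https")
    if parsed.hostname not in ALLOWED_HOSTS:
        problems.append(f"host not in allowlist: {parsed.hostname}")
    if arguments.get("limit", 0) > MAX_LIMIT:
        problems.append(f"limit exceeds {MAX_LIMIT}")
    if arguments.get("max_depth", 0) > MAX_DEPTH:
        problems.append(f"max_depth exceeds {MAX_DEPTH}")
    if arguments.get("allow_external", False):
        problems.append("allow_external would let the crawl leave the allowlist")
    return problems


print(check_crawl_arguments({"url": "https://docs.example.com/guide", "limit": 50}))
```

Capping `limit` and `max_depth` addresses the excessive-traffic concern, while the host allowlist narrows what a free-form `url` input can reach.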

Documented attack patterns abuse exactly the kind of access tavily-crawl gives an agent:

How to control tavily-crawl

PolicyLayer is an MCP gateway — it sits between your AI agents and Tavily Web Search and Extraction Server, and nothing reaches the server without passing your rules. This is the rule we recommend for tavily-crawl:

policy.json
{
  "version": "1",
  "default": "deny",
  "tools": {
    "tavily-crawl": {}
  }
}

tavily-crawl is read-only, so it stays allowed — but everything else on the server is denied unless you say otherwise.
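
To make the default-deny behaviour concrete, here is a minimal sketch of how a policy shaped like the one above could be evaluated. It illustrates the semantics described on this page and is not PolicyLayer's implementation: the `is_allowed` function is hypothetical, the `action` key is borrowed from the FAQ further down this page, and the tool name `some-other-tool` is made up.

```python
import json

POLICY = json.loads("""
{
  "version": "1",
  "default": "deny",
  "tools": {
    "tavily-crawl": {}
  }
}
""")


def is_allowed(policy, tool_name):
    """Decide whether a tool call passes (hypothetical evaluation logic)."""
    rule = policy.get("tools", {}).get(tool_name)
    if rule is None:
        # Tool not listed: fall back to the policy-wide default.
        return policy.get("default") == "allow"
    # Assumption: a listed tool is allowed unless its rule says "deny".
    return rule.get("action", "allow") != "deny"


print(is_allowed(POLICY, "tavily-crawl"))     # True: listed with an empty rule
print(is_allowed(POLICY, "some-other-tool"))  # False: not listed, default applies
```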

  1. Create a free account and register Tavily Web Search and Extraction Server — nothing to install.
  2. Add this policy — paste it, or build it visually.
  3. Point your MCP client (Claude, Cursor, anything) at your gateway URL.

Free to start. No card required.

Questions about tavily-crawl

What does the tavily-crawl tool do?

A powerful web crawler that initiates a structured web crawl starting from a specified base URL. The crawler expands from that point like a tree, following internal links across pages. You can control how deep and wide it goes, and guide it to focus on specific sections of the site. It is categorised as a Read tool in the Tavily Web Search and Extraction Server MCP Server, which means it retrieves data without modifying state.

What parameters does tavily-crawl accept?

tavily-crawl accepts 10 parameters: url, limit, max_depth, categories, max_breadth, instructions, select_paths, extract_depth, allow_external, select_domains. The full parameter table on this page comes from the server's own tool schema.

How do I enforce a policy on tavily-crawl?

Register the Tavily Web Search and Extraction Server MCP server in PolicyLayer and add a rule for tavily-crawl: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches Tavily Web Search and Extraction Server. Nothing to install.

What risk level is tavily-crawl?

tavily-crawl is a Read tool with low risk. Read-only tools are generally safe to allow by default.

Can I rate-limit tavily-crawl?

Yes. Add a rate_limit block to the tavily-crawl rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
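
The behaviour described here (10 calls per 60 seconds, tracked per session, resetting automatically) can be pictured as a fixed-window counter. The sketch below shows that general technique under stated assumptions and is not PolicyLayer's actual algorithm: the `FixedWindowLimiter` class and the session identifier are hypothetical, and only the `max` and `window` values come from the answer above.

```python
import time


class FixedWindowLimiter:
    """Allow at most `max_calls` per `window` seconds for each session."""

    def __init__(self, max_calls=10, window=60, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock
        self._state = {}  # session id -> (window start, calls so far)

    def allow(self, session_id):
        now = self.clock()
        start, count = self._state.get(session_id, (now, 0))
        if now - start >= self.window:
            # The window has elapsed, so the counter resets automatically.
            start, count = now, 0
        if count >= self.max_calls:
            return False
        self._state[session_id] = (start, count + 1)
        return True


limiter = FixedWindowLimiter(max_calls=10, window=60)
results = [limiter.allow("session-a") for _ in range(12)]
print(results.count(True), results.count(False))
```

Because state is keyed by session, one busy agent session does not consume the allowance of another.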

How do I block tavily-crawl completely?

Set action: deny in the PolicyLayer policy for tavily-crawl. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
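
A sketch of what such a rule could look like, generated from Python so the JSON can be checked before pasting. The `action` and `reason` field names come from the answer above; their placement inside the `tools` entry mirrors the policy.json shown earlier on this page and is an assumption, as is the wording of the reason.

```python
import json

policy = {
    "version": "1",
    "default": "deny",
    "tools": {
        "tavily-crawl": {
            "action": "deny",
            # Illustrative wording; use whatever explanation fits your team.
            "reason": "Crawling is disabled for this agent",
        }
    },
}

print(json.dumps(policy, indent=2))
```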

What MCP server provides tavily-crawl?

tavily-crawl is provided by the Tavily Web Search and Extraction Server MCP server (avac22/tavily-mcp). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.

Enforce policy on every Tavily Web Search and Extraction Server tool call.

Start from Tavily Web Search and Extraction Server, add the rest of your stack, and see everything your agents can call. Then put policy on all of it.

4 Tavily Web Search and Extraction Server tools catalogued and risk-classified — across an index of 43,000+ MCP servers.
