// MCP TOOL REFERENCE

WEB SCRAPER TOOLS

62 tools from the Web Scraper MCP Server, categorised by risk level.

// ALL 62 WEB SCRAPER TOOLS

READ 27 tools

Read batch_contacts: Extract contacts from multiple URLs in parallel.
Read browser_accessibility_tree: Return a trimmed Playwright accessibility snapshot.
Read browser_get_elements: Find elements matching a CSS selector on the current page.
Read browser_get_interaction_map: Return a compact map of interactive elements with selector hints.
Read browser_read_page: Read the content of the current page or a specific element.
Read browser_screenshot: Capture a screenshot of the current page.
Read chunk_text: Split text into overlapping chunks for LLM processing.
Read crawl_site: Crawl a site's sitemap to discover pages.
Read detect_content_type: Detect content type of URL (HTML, PDF, image, etc.).
Read extract_contacts: Extract all contact information from a URL.
Read extract_links: Extract all hyperlinks from a webpage.
Read extract_tables: Extract structured table data from webpage.
Read get_cache_stats: Get response cache statistics (hits, misses, size).
Read get_config: Get current configuration settings.
Read get_history: Get recent scraping history.
Read get_host_profiles: Return host profile learning store (all hosts or one host).
Read get_metadata: Extract semantic metadata (JSON-LD, OpenGraph, TwitterCards).
Read get_sitemap: Smart Sitemap Discovery and Filtering.
Read get_token_count: Estimate token count for text.
Read health_check: Check system health. Returns status of browser, cache, sessions.
Read list_jobs: List recent async jobs and their statuses.
Read list_sessions: List all saved browser sessions.
Read poll_job: Get current status of a job started by `start_job`.
Read scrape_url: Scrape a URL and return its content.
Read screenshot: Capture a screenshot of a webpage.
Read search_web: Perform a web search and return results.
Read validate_url: Validate URL reachability before scraping. Returns status, content type, size.

WRITE 8 tools

Write download_file: Download file from URL. Saves PDFs, images, documents directly.
Write configure_host_learning: Configure host-profile auto-learning behavior.
Write configure_retry: Configure retry behavior with exponential backoff.
Write configure_runtime: Apply runtime override values without restarting the MCP server.
Write configure_scraper: Configure browser settings.
Write configure_stealth: Configure stealth mode and robots.txt compliance.
Write save_pdf: Save a URL as a PDF file.
Write set_host_profile: Set active host routing profile from JSON payload (admin override).

DESTRUCTIVE 4 tools

Destructive clear_cache: Clear the response cache. Use when cached data may be stale.
Destructive clear_history: Clear scraping history.
Destructive clear_host_profile: Delete one host profile record from the profile store.
Destructive clear_session: Clear a browser session (cookies, storage). Use for fresh starts.

EXECUTE 22 tools

Execute cancel_job: Cancel a running async job.
Execute browser_evaluate: Run a JavaScript expression on the current page and return the result.
Execute browser_navigate: Navigate the interactive browser to a URL.
Execute browser_solve_challenge: Explicitly trigger challenge detection and solving on the current page.
Execute browser_wait_for: Wait for selector state or a fixed delay on the active page.
Execute new_session: Start a fresh browser session, clearing all existing sessions.
Execute run_bot_surface_diagnostic: Run script-level bot surface diagnostics (scripts/bot_check.py).
Execute run_browser_info_diagnostic_tool: Collect browser fingerprint telemetry via scripts/get_browser_info.py.
Execute run_challenge_diagnostic: Run target-site diagnostics in either toolkit-native or matrix smoking-gun mode.
Execute run_playbook: Execute an Autonomous Crawl using a Playbook.
Execute start_job: Start a long-running job and return immediately with a `job_id`.
Execute batch_scrape: Scrape multiple URLs in parallel.
Execute browser_hover: Hover over an element by CSS selector.
Execute deep_research: Perform Deep Research (Search + Crawl + Report).
Execute reload_runtime_config: Reload runtime settings from config files.
Execute browser_click: Click an element on the current page by CSS selector.
Execute browser_close: Close the interactive browser session and free resources.
Execute browser_press_key: Press a keyboard key (Enter, Escape, Tab, ArrowDown, etc.).
Execute browser_scroll: Scroll page content or a specific scrollable element.
Execute browser_type: Type text into an input field on the current page.
Execute click_element: Navigate to URL and click an element (for JS triggers, expanding sections).
Execute fill_form

OTHER 1 tool

Other truncate_text: Truncate text to fit within token limit.

// FAQ

How many tools does the Web Scraper MCP server have?

The Web Scraper MCP server exposes 62 tools across 5 categories: Read, Write, Destructive, Execute, Other.

How do I enforce policies on Web Scraper tools?

Route the Web Scraper server through the PolicyLayer gateway. Define allow, deny, or approval rules per tool in the dashboard; they are enforced on every call before it reaches the server.
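
As a rough illustration of per-tool rules, the sketch below models a gate that decides on each call before it would be forwarded to the server. It is not PolicyLayer's API or configuration format: the `RULES` table, the `Decision` enum, the `gate` function, and the fallback to `approval` for unlisted tools are all assumptions made for this example. Only the tool names come from the list above.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVAL = "approval"


# Hypothetical per-tool rules. Tool names are from this reference;
# the decisions assigned to them are illustrative only.
RULES = {
    "scrape_url": Decision.ALLOW,
    "clear_cache": Decision.APPROVAL,
    "browser_evaluate": Decision.DENY,
}


def gate(tool_name: str, rules: dict[str, Decision] = RULES) -> Decision:
    """Return the decision for a tool call before it is forwarded.

    Unlisted tools fall back to APPROVAL (an assumption made for this
    sketch, not a documented default).
    """
    return rules.get(tool_name, Decision.APPROVAL)


if __name__ == "__main__":
    for name in ("scrape_url", "browser_evaluate", "fill_form"):
        print(name, "->", gate(name).value)
```

Falling back to `approval` rather than `allow` keeps any tool that has no explicit rule behind a human check, which is one conservative choice among several.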

What risk categories do Web Scraper tools fall into?

Web Scraper tools are categorised as Read (27), Write (8), Destructive (4), Execute (22), Other (1). Each category has a recommended default policy.
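
The category breakdown can be written down as data and checked against the stated total. In the sketch below, the counts come from this reference, while the `ASSUMED_DEFAULTS` mapping is hypothetical: the page says each category has a recommended default policy but does not say what those defaults are.

```python
# Tool counts per risk category, as listed in this reference.
CATEGORY_COUNTS = {
    "Read": 27,
    "Write": 8,
    "Destructive": 4,
    "Execute": 22,
    "Other": 1,
}

# Hypothetical defaults, chosen for illustration only. The reference does
# not state the recommended default for any category.
ASSUMED_DEFAULTS = {
    "Read": "allow",
    "Write": "approval",
    "Destructive": "deny",
    "Execute": "approval",
    "Other": "allow",
}


def default_for(category: str) -> str:
    """Look up the assumed default action for a risk category."""
    return ASSUMED_DEFAULTS[category]


if __name__ == "__main__":
    total = sum(CATEGORY_COUNTS.values())
    print(total)  # 62, matching the total stated in this reference
    for category, count in CATEGORY_COUNTS.items():
        print(f"{category}: {count} tools, assumed default {default_for(category)}")
```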

Enforce policy on every Web Scraper tool call.

Start from Web Scraper, add the rest of your stack, and see everything your agents can call. Then put policy on all of it.

43,000+ MCP servers and 220,000+ tools scanned and risk-classified.