Search for datasets (packages) on a CKAN server using Solr query syntax. Supports full Solr search capabilities including filters, facets, and sorting. Use this to discover datasets matching specific criteria. Note on parser behavior: Some CKAN portals use a restrictive default query parser tha...
High parameter count (15 properties)
Part of the Ckan MCP server. Enforce policies on this tool with Intercept, the open-source MCP proxy.
AI agents call ckan_package_search to retrieve information from Ckan without modifying any data. This is common in research, monitoring, and reporting workflows where the agent needs context before taking action. Because read operations don't change state, they are generally safe to allow without restrictions, but you may still want rate limits to control API costs.
Even though ckan_package_search only reads data, uncontrolled read access can leak sensitive information or rack up API costs. An agent caught in a retry loop could make thousands of calls per minute. A rate limit gives you a safety net without blocking legitimate use.
Read-only tools are safe to allow by default. No rate limit needed unless you want to control costs.
tools:
  ckan_package_search:
    rules:
      - action: allow

See the full Ckan policy for all 20 tools.
Agents calling read-class tools like ckan_package_search have been implicated in these attack patterns. Read the full case and prevention policy for each:
Other tools in the Read risk category across the catalogue. The same policy patterns (rate-limit, allow) apply to each.
Search for datasets (packages) on a CKAN server using Solr query syntax. Supports full Solr search capabilities including filters, facets, and sorting. Use this to discover datasets matching specific criteria.

Note on parser behavior: Some CKAN portals use a restrictive default query parser that can break long OR queries. For those portals, this tool may force the query into 'text:(...)' based on per-portal config. You can override with 'query_parser' to force or disable this behavior per request.

Important - Date field semantics:
- issued: publisher's content publish date when available (best proxy for "created/published")
- modified: publisher's content update date when available
- metadata_created: CKAN record creation timestamp (publish time on source portals, harvest time on aggregators; fallback for "created" if issued missing)
- metadata_modified: CKAN record update timestamp (publish time on source portals, harvest time on aggregators; use for "updated/modified in last X")

Natural language mapping (important for tool callers):
- "created"/"published" -> prefer issued; fallback to metadata_created
- "updated"/"modified" -> prefer modified; fallback to metadata_modified
- For "recent in last X", consider using content_recent (issued with metadata_created fallback)

Content-recent helper:
- content_recent: if true, rewrites the query to use issued with a fallback to metadata_created when issued is missing.
- content_recent_days: window for content_recent (default 30 days).

Args:
- server_url (string): Base URL of CKAN server (e.g., "https://dati.gov.it/opendata")
- q (string): Search query using Solr syntax (default: "*:*" for all)
- fq (string): Filter query (e.g., "organization:comune-palermo")
  IMPORTANT - Solr fq syntax rules:
  1. OR inside a single field: use field:(val1 OR val2), NOT field:val1 OR field:val2.
     Wrong: fq=type:"A" OR type:"B" → silently ignored, returns entire catalog.
     Right: fq=type:("A" OR "B")
  2. CKAN extras fields are indexed as extras_fieldname, not fieldname.
     e.g. to filter on extra field "hvd_category" use fq=extras_hvd_category:"<value>"
- rows (number): Number of results to return (default: 10, max: 1000)
- start (number): Offset for pagination (default: 0)
- page (number): Page number (1-based); alias for start. Overrides start if provided.
- page_size (number): Results per page when using page (default: 10, max: 1000)
- sort (string): Sort field and direction (e.g., "metadata_modified desc")
- facet_field (array): Fields to facet on (e.g., ["organization", "tags"])
- facet_limit (number): Max facet values per field (default: 50)
- include_drafts (boolean): Include draft datasets (default: false)
- query_parser ('default' | 'text'): Override search parser behavior
- response_format ('markdown' | 'json'): Output format

Returns: Search results with:
- count: Number of results found
- results: Array of dataset objects
- facets: Facet counts (if facet_field specified)
- search_facets: Detailed facet information

Query Syntax (parameter q):

Boolean operators:
- AND / &&: "water AND climate"
- OR / ||: "health OR sanità"
- NOT / !: "data NOT personal"
- +required -excluded: "+title:water -title:sea"
- Grouping: "(title:water OR title:climate) AND tags:environment"

Wildcards:
- *: "title:environment*" (matches environmental, environments, etc.)
- Note: Left truncation (*water) not supported

Fuzzy search (edit distance):
- ~: "title:rest~" or "title:rest~1" (finds "test", "best", "rest")

Proximity search (words within N positions):
- "phrase"~N: "title:"climate change"~5"

Range queries:
- Inclusive [a TO b]: "num_resources:[5 TO 10]"
- Exclusive {a TO b}: "num_resources:{0 TO 100}"
- One side open: "metadata_modified:[2024-01-01T00:00:00Z TO *]"

Date math:
- NOW-1YEAR, NOW-6MONTHS, NOW-7DAYS, NOW-1HOUR
- NOW/DAY, NOW/MONTH (round down)
- Combined: "metadata_modified:[NOW-2MONTHS TO NOW]"
- Example: "metadata_created:[NOW-1YEAR TO *]"
- IMPORTANT: NOW syntax works on metadata_modified and metadata_created fields
- For 'modified' and 'issued' fields, NOW syntax is auto-converted to ISO dates
- Manual ISO dates always work: "modified:[2026-01-15T00:00:00Z TO *]"

Field existence:
- Exists: "field:*" or "field:[* TO *]"
- Not exists: "NOT field:*" or "-field:*"

Boosting (relevance scoring):
- Boost term: "title:water^2 OR notes:water" (title matches score higher)
- Constant score: "title:water^=1.5"

Examples:
- Search all: { q: "*:*" }
- By tag: { q: "tags:sanità" }
- Boolean: { q: "(title:water OR title:climate) AND NOT title:sea" }
- Wildcard: { q: "title:environment*" }
- Fuzzy: { q: "title:health~2" }
- Proximity: { q: "notes:"open data"~3" }
- Date range: { q: "metadata_modified:[2024-01-01T00:00:00Z TO 2024-12-31T23:59:59Z]" }
- Date math: { q: "metadata_modified:[NOW-6MONTHS TO *]" }
- Date math (auto-converted): { q: "modified:[NOW-30DAYS TO NOW]" }
- Published in 2025 (content date): { fq: "issued:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]" }
- First appeared on portal in 2025: { fq: "metadata_created:[2025-01-01T00:00:00Z TO 2025-12-31T23:59:59Z]" }
- Recent content (issued w/ fallback): { q: "*:*", content_recent: true, content_recent_days: 180 }
- Field exists: { q: "organization:* AND num_resources:[1 TO *]" }
- Boosting: { q: "title:climate^2 OR notes:climate" }
- Filter org: { fq: "organization:regione-siciliana" }
- Filter extras field (correct): { fq: "extras_hvd_category:"http://data.europa.eu/bna/c_ac64a52d"" }
- Filter extras OR (correct): { fq: "extras_hvd_category:("http://data.europa.eu/bna/c_ac64a52d" OR "http://data.europa.eu/bna/c_dd313021")" }
- Get facets: { facet_field: ["organization"], rows: 0 }

Query language: Before searching a portal, check its locale via ckan_status_show (field: "Portal Locale" / locale_default). Translate query terms to the portal's language: searching in English on a non-English portal returns 0 results.
- Examples: locale "it" → Italian terms; "uk_UA" → Ukrainian (Cyrillic); "fr_FR" → French.
- Exception: multilingual portals (e.g. data.europa.eu, open.canada.ca) accept EN + native terms joined with OR.

Typical workflow: ckan_status_show (check locale) → ckan_package_search (query in portal's language) → ckan_package_show (get full metadata + resource IDs) → ckan_datastore_search (query tabular data).

It is categorised as a Read tool in the Ckan MCP Server, which means it retrieves data without modifying state.
Add a rule in your Intercept YAML policy under the tools section for ckan_package_search. You can allow, deny, rate-limit, or validate arguments. Then run Intercept as a proxy in front of the Ckan MCP server.
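One argument worth checking is fq, because the tool description warns that a filter such as type:"A" OR type:"B" is silently ignored and returns the entire catalog. The sketch below is plain Python, not Intercept's validation syntax, and the helper names (`quote`, `extras_field`, `fq_any`, `looks_like_split_or`) are hypothetical. It shows one way a caller might build a correctly grouped filter and flag the split-OR mistake before the call is made.

```python
import re


def quote(value: str) -> str:
    """Double-quote a value, escaping backslashes and embedded quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def extras_field(name: str) -> str:
    """CKAN indexes extras as extras_<name>, so add the prefix if missing."""
    return name if name.startswith("extras_") else f"extras_{name}"


def fq_any(field: str, values: list[str]) -> str:
    """Build field:("a" OR "b"), keeping the OR inside a single field."""
    if not values:
        raise ValueError("at least one value is required")
    return f"{field}:({' OR '.join(quote(v) for v in values)})"


# Heuristic only: the same field name on both sides of an OR,
# e.g. type:"A" OR type:"B". It will not catch every malformed filter.
_SPLIT_OR = re.compile(r'\b(\w+):(?:"[^"]*"|\S+)\s+OR\s+\1:')


def looks_like_split_or(fq: str) -> bool:
    """Return True when fq repeats one field across an OR instead of grouping."""
    return _SPLIT_OR.search(fq) is not None


fq = fq_any(extras_field("hvd_category"), [
    "http://data.europa.eu/bna/c_ac64a52d",
    "http://data.europa.eu/bna/c_dd313021",
])
print(fq)
print(looks_like_split_or('type:"A" OR type:"B"'))  # True
print(looks_like_split_or(fq))                      # False
```

The regular expression is deliberately narrow: it only recognises the exact pattern the description calls out, so a filter that passes this check can still be wrong in other ways.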
ckan_package_search is a Read tool with low risk. Read-only tools are generally safe to allow by default.
To rate-limit ckan_package_search, add a rate_limit block to its rule in your Intercept policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
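To make the "10 calls per minute, tracked per session" behaviour concrete, here is a minimal Python sketch of a per-session fixed-window counter. It is an illustration of the semantics only: the text does not say whether Intercept uses fixed or sliding windows, so the fixed window is an assumption, and `FixedWindowLimiter` is a hypothetical class, not part of Intercept.

```python
import time


class FixedWindowLimiter:
    """Allow at most max_calls per window_seconds for each session."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._windows = {}  # session_id -> (window_start, calls_in_window)

    def allow(self, session_id: str) -> bool:
        now = self.clock()
        start, calls = self._windows.get(session_id, (now, 0))
        if now - start >= self.window_seconds:
            start, calls = now, 0  # window expired: the count resets
        if calls >= self.max_calls:
            return False  # over the limit for this session's window
        self._windows[session_id] = (start, calls + 1)
        return True


# Equivalent of max: 10, window: 60 for a single agent session.
limiter = FixedWindowLimiter(max_calls=10, window_seconds=60)
decisions = [limiter.allow("session-a") for _ in range(12)]
print(decisions.count(True), decisions.count(False))
```

An agent stuck in a retry loop would see every call past the tenth rejected until the window rolls over, while a second session keeps its own independent count.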
To block ckan_package_search entirely, set action: deny in its Intercept policy rule. The AI agent receives a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
ckan_package_search is provided by the Ckan MCP server (@aborruso/ckan-mcp-server). Intercept sits as a proxy in front of this server to enforce policies before tool calls reach the server.
Open source. One binary. Zero dependencies.
npx -y @policylayer/intercept