High-risk tools in Ia Qa Com/mcp llm and RAG testing Dev/QA toolbox
16 of the 139 tools in Ia Qa Com/mcp llm and RAG testing Dev/QA toolbox are classified as high risk. This page profiles those tools specifically, with recommended policy actions and the attack patterns that target them.
Every operation listed below is an action PolicyLayer recommends controlling at the transport layer. Open any tool to see the full profile, risk score, and YAML policy snippet.
Tools at high risk
- build_rag_prompt (Execute): Assemble a complete RAG (Retrieval-Augmented Generation) prompt from retrieved context chunks and a user query. Handles token budgeting, citation numbering, system instruction i...
- cron_parse (Execute): Parse a cron expression into a human-readable schedule description. Supports standard 5-field cron (minute hour day month weekday).
- env_parse (Execute): Parse a .env file content into a JSON object. Handles quoted values (single and double), inline comments, export prefix, and escaped sequences (\n, \t inside double quotes). Ret...
- mcp_server_evaluate (Execute): Run a full compliance evaluation against a live MCP server URL. Tests: server reachability (ping), manifest discovery (GET /mcp), schema quality (snake_case names, descriptions,...
- parse_csv (Execute): Parse a CSV string into a JSON array of objects (or raw arrays). Handles RFC 4180 quoted fields, escaped quotes, and custom delimiters. Use when processing spreadsheet exports, ...
- parse_http_headers (Execute): Parse a raw HTTP headers block into a structured JSON object. Detects multi-value headers, masks Authorization values, and optionally audits for missing security headers (HSTS, ...
- run_eval_contract (Execute): Parse a .ia-eval.yaml LLM test suite, call the specified LLM model for each scenario, run all configured scorers, and return a structured JSON report with per-scenario Pass/Fail...
- run_pr_gate_pipeline (Execute): Full automated QA pipeline for a pull request. Takes a unified git diff (output of `git diff HEAD`) and returns: bug hotspots, regression impact areas, risk score (0–100), gen...
- run_semantic_tests (Execute): Semantic assertion primitive: compare actual vs expected text pairs using cosine similarity + ROUGE-L. Two modes: tfidf (default, free, no API key) or embeddings (OpenAI text-em...
- run_vlm_test_suite (Execute): Run a test suite against a Vision-Language Model (VLM): send an image (URL or base64) + N test cases (each with a question + assertion) to GPT-4o, Claude 3.5, or Gemini. Retu...
- run_vlm_test_suite_batch (Execute): Compare multiple VLMs on the same test suite in parallel: send an image (URL or base64) + N test cases to all models simultaneously. Returns per-model PASS/FAIL verdicts, pas...
- strip_markdown (Execute): Strip all Markdown formatting (headers, bold, italic, code fences, links, lists) from text and return clean plain text. Run this before injecting scraped documentation, README f...
- system_prompt_builder (Execute): Build a structured system prompt from components: role, task, constraints, output format, tone, language, and examples. Generates a production-ready system prompt with token est...
- transform_json_array (Execute): Transform a JSON array using common operations: pluck (extract specific fields), filter (by field value), sort_by (field), group_by (field), count_by (field), uniq_by (field). U...
- vector_quantize (Execute): Simulate int8 or int4 quantization of float32 embedding vectors. Reduces storage by 4x (int8) or 8x (int4). Returns quantized values, scale factor, and precision loss (MSE). Use...
- web_security_audit (Execute): Run a comprehensive web security audit combining headers, SSL, CORS, and cookies checks, then use an LLM to produce a prioritised remediation plan. Orchestrates security_header...
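To make "controlling at the transport layer" concrete, here is a minimal Python sketch of a gate that inspects each message before it reaches the server. It relies on one fact about MCP: tool invocations travel as JSON-RPC requests with the method `tools/call` and the tool name in `params.name`. Everything else is an assumption for illustration: the function name `decide`, the three decision strings, and the `approved` set are hypothetical, and this is not PolicyLayer's implementation or its YAML policy format.

```python
import json

# Tool names copied from the list above. The decisions attached to them
# below are illustrative assumptions, not PolicyLayer's recommended actions.
HIGH_RISK_TOOLS = {
    "build_rag_prompt", "cron_parse", "env_parse", "mcp_server_evaluate",
    "parse_csv", "parse_http_headers", "run_eval_contract",
    "run_pr_gate_pipeline", "run_semantic_tests", "run_vlm_test_suite",
    "run_vlm_test_suite_batch", "strip_markdown", "system_prompt_builder",
    "transform_json_array", "vector_quantize", "web_security_audit",
}


def decide(raw_message: str, approved: frozenset = frozenset()) -> str:
    """Return 'allow', 'deny', or 'require_approval' for one JSON-RPC message.

    Hypothetical helper: `approved` holds high-risk tool names an operator
    has explicitly cleared for this session.
    """
    try:
        msg = json.loads(raw_message)
    except json.JSONDecodeError:
        return "deny"  # unparseable traffic is rejected outright
    if not isinstance(msg, dict):
        return "deny"
    if msg.get("method") != "tools/call":
        return "allow"  # only tool invocations are gated in this sketch
    params = msg.get("params")
    name = params.get("name") if isinstance(params, dict) else None
    if not isinstance(name, str):
        return "deny"  # a tools/call without a tool name is malformed
    if name in HIGH_RISK_TOOLS and name not in approved:
        return "require_approval"
    return "allow"


if __name__ == "__main__":
    call = json.dumps({
        "jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "env_parse", "arguments": {}},
    })
    print(decide(call))
    print(decide(call, approved=frozenset({"env_parse"})))
```

Because the check runs on the wire format and not inside the agent, it applies the same way whichever client or model issues the call. A real policy would likely also inspect `params.arguments`, which this sketch ignores.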
Attacks that target this class
High-risk tools in any server share these documented attack patterns. Each links to the full case and the defensive policy.