Merge multiple PentesterFlow datasets into a single file with deduplication and balanced sampling.
AI agents use merge_datasets to create or update resources in OffensiveSET — usually the action step of a workflow, after the agent has gathered context. Every call changes real data in your OffensiveSET environment.
The tool combines multiple datasets into one output file, which is a data creation/modification action. While the server generates penetration testing datasets (offensive security content), the merge_datasets tool itself performs a data transformation and consolidation operation — a Write action rather than Execute or Destructive.
The tool's description states that it 'Merge[s] multiple PentesterFlow datasets into a single file'. In other words, it creates or modifies a dataset file by combining existing datasets with deduplication and sampling, which are reversible write operations.
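To make "deduplication and balanced sampling" concrete, here is a minimal Python sketch of that kind of merge. It is not the OffensiveSET implementation: the function name `merge_datasets_sketch`, the `category` label key, the hash-based duplicate check, and the "cap every label at the smallest group" balancing rule are all assumptions chosen for illustration.

```python
import hashlib
import json
import random
from collections import defaultdict


def merge_datasets_sketch(datasets, label_key="category", per_label=None, seed=0):
    """Merge lists of dict records, drop exact duplicates, then balance by label.

    Hypothetical helper: the real tool's record format and sampling rule
    are not documented here, so everything below is an assumption.
    """
    seen = set()
    by_label = defaultdict(list)
    for records in datasets:
        for record in records:
            # Canonical JSON (sorted keys) gives the same fingerprint
            # for records that differ only in key order.
            digest = hashlib.sha256(
                json.dumps(record, sort_keys=True).encode("utf-8")
            ).hexdigest()
            if digest in seen:
                continue
            seen.add(digest)
            by_label[record.get(label_key)].append(record)

    if not by_label:
        return []

    # Balanced sampling: cap every label at the size of the smallest group,
    # unless the caller supplies an explicit per-label cap.
    if per_label is None:
        cap = min(len(group) for group in by_label.values())
    else:
        cap = per_label

    rng = random.Random(seed)  # seeded so repeated merges are reproducible
    merged = []
    for label in sorted(by_label, key=str):
        group = by_label[label]
        merged.extend(rng.sample(group, min(cap, len(group))))
    return merged
```

Because the output depends on every input file, a single call can rewrite a large dataset, which is why the page treats the tool as a write action that is worth rate-limiting.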
Documented attack patterns abuse exactly the kind of access merge_datasets gives an agent.
PolicyLayer is an MCP gateway — it sits between your AI agents and OffensiveSET, and nothing reaches the server without passing your rules. This is the rule we recommend for merge_datasets:
{
  "version": "1",
  "default": "deny",
  "tools": {
    "merge_datasets": {
      "limits": [
        {
          "counter": "merge_datasets_rate",
          "window": "minute",
          "max": 30,
          "scope": "grant"
        }
      ]
    }
  }
}
merge_datasets stays usable but capped: an agent stuck in a loop can't make hundreds of changes a minute. Everything else on the server is denied unless you say otherwise.
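The behaviour this rule describes (default deny plus a per-grant call cap) can be sketched in a few lines of Python. This is an illustration only, not PolicyLayer's engine: the `FixedWindowGate` class, the fixed-window counting strategy, the `WINDOW_SECONDS` mapping, and the reading of `"scope": "grant"` as "one counter per grant" are all assumptions.

```python
import time

# The recommended rule from above, as a Python dict.
POLICY = {
    "version": "1",
    "default": "deny",
    "tools": {
        "merge_datasets": {
            "limits": [
                {
                    "counter": "merge_datasets_rate",
                    "window": "minute",
                    "max": 30,
                    "scope": "grant",
                }
            ]
        }
    },
}

# Assumed mapping from window names to seconds.
WINDOW_SECONDS = {"minute": 60}


class FixedWindowGate:
    """Hypothetical evaluator: default deny plus fixed-window call caps."""

    def __init__(self, policy, clock=time.monotonic):
        self.policy = policy
        self.clock = clock
        self.counters = {}  # (grant_id, counter name) -> (window index, count)

    def allow(self, grant_id, tool):
        rule = self.policy["tools"].get(tool)
        if rule is None:
            # Tools without a rule fall back to the policy default.
            return self.policy.get("default") == "allow"

        now = self.clock()
        pending = []
        for limit in rule.get("limits", []):
            window = int(now // WINDOW_SECONDS[limit["window"]])
            key = (grant_id, limit["counter"])
            start, count = self.counters.get(key, (window, 0))
            if start != window:
                count = 0  # a new window has started, so the counter resets
            if count >= limit["max"]:
                return False  # cap reached; rejected calls are not counted
            pending.append((key, window, count + 1))

        # Only record the call once every limit has passed.
        for key, window, count in pending:
            self.counters[key] = (window, count)
        return True
```

With this sketch, the 31st merge_datasets call from the same grant inside one minute is rejected, a different grant keeps its own budget, and any tool without a rule is refused outright.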
Free to start. No card required.
Merge multiple PentesterFlow datasets into a single file with deduplication and balanced sampling. It is categorised as a Write tool in the OffensiveSET MCP Server, which means it can create or modify data. Consider rate limits to prevent runaway writes.
Register the OffensiveSET MCP server in PolicyLayer and add a rule for merge_datasets: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches OffensiveSET. Nothing to install.
merge_datasets is a Write tool with medium risk. Write tools should be rate-limited to prevent accidental bulk modifications.
Yes. Add a limits entry to the merge_datasets rule in your PolicyLayer policy, in the same form as the recommended rule above. For example, setting max: 10 with window: "minute" limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
Set action: deny in the PolicyLayer policy for merge_datasets. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
merge_datasets is provided by the OffensiveSET MCP server (pentesterflow/offensiveset). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.
Deterministic rules across all 10 OffensiveSET tools. Per-identity grants. Full audit log. Live in minutes. Nothing to install.
10 OffensiveSET tools catalogued and risk-classified — across an index of 42,500+ MCP servers.