Generate an offensive security dataset for fine-tuning PentesterFlow model. Produces JSONL in ShareGPT/ChatML format with multi-turn pentesting conversations including tool calls, reasoning, and thinking blocks.
AI agents use generate_dataset to create or update resources in OffensiveSET — usually the action step of a workflow, after the agent has gathered context. Every call changes real data in your OffensiveSET environment.
This tool creates new dataset artifacts (JSONL files) which are reversible outputs. While the content describes offensive security scenarios, the tool itself performs data generation and writing operations rather than executing actual attacks or code. It's a Write category tool because it creates structured data outputs that can be modified or deleted, fitting the write pattern of reversible data modification.
From the tool's definition Tool description states it "Produces JSONL in ShareGPT/ChatML format with multi-turn pentesting conversations" - it creates and generates new dataset files that are written to storage. The verb "generates" and "produces" indicate data creation.
Documented attack patterns abuse exactly the kind of access generate_dataset gives an agent:
PolicyLayer is an MCP gateway — it sits between your AI agents and OffensiveSET, and nothing reaches the server without passing your rules. This is the rule we recommend for generate_dataset:
{
"version": "1",
"default": "deny",
"tools": {
"generate_dataset": {
"limits": [
{
"counter": "generate_dataset_rate",
"window": "minute",
"max": 30,
"scope": "grant"
}
]
}
}
} generate_dataset stays usable, but capped — an agent stuck in a loop can't make hundreds of changes a minute. Everything else on the server is denied unless you say otherwise.
Free to start. No card required.
Generate an offensive security dataset for fine-tuning PentesterFlow model. Produces JSONL in ShareGPT/ChatML format with multi-turn pentesting conversations including tool calls, reasoning, and thinking blocks. It is categorised as a Write tool in the OffensiveSET MCP Server, which means it can create or modify data. Consider rate limits to prevent runaway writes.
Register the OffensiveSET MCP server in PolicyLayer and add a rule for generate_dataset: allow, deny, rate-limit, or require approval. Point your MCP client at the PolicyLayer proxy URL and the rule is enforced on every call, before it reaches OffensiveSET. Nothing to install.
generate_dataset is a Write tool with medium risk. Write tools should be rate-limited to prevent accidental bulk modifications.
Yes. Add a rate_limit block to the generate_dataset rule in your PolicyLayer policy. For example, setting max: 10 and window: 60 limits the tool to 10 calls per minute. Rate limits are tracked per agent session and reset automatically.
Set action: deny in the PolicyLayer policy for generate_dataset. The AI agent will receive a policy violation error and cannot call the tool. You can also include a reason field to explain why the tool is blocked.
generate_dataset is provided by the OffensiveSET MCP server (pentesterflow/offensiveset). PolicyLayer sits as a proxy in front of this server to enforce policies before tool calls reach the server.
Deterministic rules across all 10 OffensiveSET tools. Per-identity grants. Full audit log. Live in minutes. Nothing to install.
Free to start. No card required.
10 OffensiveSET tools catalogued and risk-classified — across an index of 42,500+ MCP servers.