Multi-Agent Pentest¶
The Multi-Agent Pentest is an autonomous, black-box penetration testing engine that deploys a team of coordinated AI agents inside a sandboxed environment. A core agent orchestrates specialised child agents that perform reconnaissance, targeted exploitation, validation, and reporting, mirroring the workflow of a human pentester.
Capabilities¶
- Multi-agent orchestration: A core agent spawns and coordinates focused child agents (e.g. "SQLi Discovery", "XSS Validation", "Auth Testing") to divide the attack surface
- Full sandbox environment: Agents run inside a sandbox with classic pentesting tools, a browser, and an HTTP proxy
- Broad vulnerability coverage: XSS, SQL injection, IDOR/BOLA, SSRF, command injection, access control, and business logic flaws
- Evidence-rich reporting: Findings include curl requests, responses, commands, and step-by-step reasoning
- Real-time activity streaming: Agent thinking and tool calls are streamed as scan events so you can follow the pentest live
Configuration¶
Basic Configuration¶
The multi-agent pentest is enabled by default on all automated pentesting scans. No additional configuration is required.
To explicitly disable it:
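A minimal sketch, assuming the enabled flag that appears in the other examples on this page also accepts false:

```yaml
automated_pentesting:
  multi_agent_pentest:
    enabled: false  # assumption: same flag as in the examples below, set to false
```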
Max Duration¶
Use automated_pentesting.multi_agent_pentest.max_duration to control how long the pentest runs before stopping. The value is in minutes.
- Defaults to 360 (6 hours).
- Minimum: 1 minute. Maximum: 1440 minutes (24 hours).
- The orchestrator receives remaining-time warnings so it can prioritize wrap-up before the deadline.
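As a sketch, a pentest capped at two hours might look like the following. The key path comes from the text above; the value 120 is only an illustrative choice within the documented 1 to 1440 range.

```yaml
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    max_duration: 120  # minutes; illustrative value (default is 360)
```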
Natural-Language Instructions¶
Use automated_pentesting.multi_agent_pentest.instructions to guide the pentest scope, priorities, or constraints. This field is optional but helps the agents focus on what matters most.
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    instructions: |
      Focus on the checkout and payment flow. The /api/v2/orders and
      /api/v2/payments endpoints are the highest-priority targets.
      Avoid the /admin panel entirely. Authentication tokens expire
      after 15 minutes; re-authenticate if you get 401s.
Good things to include:
- High-value endpoints or workflows to prioritize
- Areas or endpoints to avoid (destructive actions, out-of-scope domains)
- Specific vulnerability classes to focus on
- Authentication or session quirks (token expiry, CSRF requirements)
- Known technologies or frameworks the target uses
Mode¶
Use automated_pentesting.multi_agent_pentest.mode to control how aggressively the pentest explores the target surface.
| Value | Behavior |
|---|---|
| STANDARD (default) | Pentest scope is dynamic. Any findings related to the listed assets are automatically added to the scope. |
| STRICT | Pentest is strictly limited to the listed assets. No discovery beyond the initial scope. |
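For example, limiting the pentest to the listed assets could be sketched as follows, assuming the value is written as a plain uppercase string exactly as it appears in the table:

```yaml
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    mode: STRICT  # assumed spelling, taken from the table; omit the key to keep STANDARD
```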
Scope¶
Use automated_pentesting.multi_agent_pentest.scope to define per-surface URL rules for the pentest. The shape is the same as the WebApp DAST scope configuration (frontend_dast.scope), with sub-keys for api_testing and crawling allowlists/blocklists.
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    scope:
      api_testing:
        extend_global_scope: true
      crawling:
        extend_global_scope: true
Context¶
Use automated_pentesting.multi_agent_pentest.context to provide free-text background that the agents should treat as factual context. This complements instructions with information about the product area, threat model, or environment quirks.
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    context: |
      This is a multi-tenant SaaS billing service. Each tenant has
      an isolated database schema. The /api/internal/ prefix is only
      reachable from the VPN and should not be tested.
Files¶
Use automated_pentesting.multi_agent_pentest.files to attach uploaded files (documentation, OpenAPI exports, screenshots) to the pentest. Pass a list of file UUIDs; each file must belong to the same organization as the scan.
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    files:
      - "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
Authentication¶
When the target requires authentication, configure it in the scan's authentication block. The agents receive the authentication configuration and use it to access protected surfaces.
automated_pentesting:
  multi_agent_pentest:
    enabled: true
authentication:
  presets:
    - type: headers
      users:
        - username: user@example.com
          headers:
            Authorization: "Bearer eyJhbGciOiJIUzI1NiJ9..."
Vulnerability Categories¶
Findings are automatically classified into one of the following categories:
| Category | Examples |
|---|---|
| XSS | Reflected, stored, and DOM-based cross-site scripting |
| SQL Injection | Error-based, union-based, blind, and time-based SQL injection |
| SSRF | Server-side request forgery, internal service access |
| Command Injection | OS command injection, remote code execution |
| Access Control | IDOR, privilege escalation, authentication bypass, broken authorization |
| Business Logic | Workflow manipulation, race conditions, state tampering |
Requirements¶
- Reachable target: The scan must be able to reach the target URL
- Supported scans: Web applications (AUTOMATED_PENTEST_WEBAPP) as well as REST (AUTOMATED_PENTEST_REST) and GraphQL (AUTOMATED_PENTEST_GRAPHQL) API scans
- Authentication (optional): Configure when important surfaces are behind login
API scans (REST / GraphQL)¶
For API scans, the engine targets the API base URL (no browser workflow) and receives a schema artifact pre-seeded inside the sandbox workspace so the agents can enumerate the attack surface before probing it:
- REST: the OpenAPI specification (when available through the scan) is written to /workspace/schema.openapi.json. If no OpenAPI spec is attached, the known endpoint list is written to /workspace/schema.endpoints.json.
- GraphQL: the queries / mutations / subscriptions metadata is written to /workspace/schema.operations.json, and the full SDL is written to /workspace/schema.graphql when available.
You can refer to these files directly from instructions, for example:
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    instructions: |
      Start by `cat /workspace/schema.openapi.json` to enumerate endpoints.
      Focus on /v1/orders and /v1/payments. Skip reflected XSS on this API.
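A GraphQL scan could follow the same pattern. The sketch below relies only on the file paths listed above; the mutation names are hypothetical placeholders, not operations from any real schema.

```yaml
automated_pentesting:
  multi_agent_pentest:
    enabled: true
    instructions: |
      Start by reading /workspace/schema.operations.json to list the
      available queries and mutations. If /workspace/schema.graphql
      exists, use it to check argument types.
      Prioritize mutations that change ownership or billing state
      (for example updateOrder or refundPayment, hypothetical names).
```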
Limitations¶
- Coverage depends on what the agents can discover and reach within the scan timeout
- The default scan timeout is 6 hours (configurable via max_duration, up to 24 hours)
- Agents stay on the configured target domain and do not navigate to external sites
- Natural-language instructions improve focus but do not replace scan scope or authentication setup
Related Documentation¶
- How It Works: Understanding AI pentesting capabilities
- XSS Agent: Dedicated XSS testing agent
- SQLI Agent: Dedicated SQL injection testing agent
- BOLA Agent: Authorization testing agent
- Business Logic Agent: Business workflow testing agent
- Authentication: Set up authentication for scans