Multi-Agent Pentest
The Multi-Agent Pentest is an autonomous, black-box penetration testing engine that deploys a team of coordinated AI agents inside a sandboxed environment. A core agent orchestrates specialised child agents that perform reconnaissance, targeted exploitation, validation, and reporting, mirroring the workflow of a human pentester.
Capabilities
- Multi-agent orchestration: A core agent spawns and coordinates focused child agents (e.g. "SQLi Discovery", "XSS Validation", "Auth Testing") to divide the attack surface
- Full sandbox environment: Agents run inside a sandbox with classic pentesting tools, a browser, and an HTTP proxy
- Broad vulnerability coverage: XSS, SQL injection, IDOR/BOLA, SSRF, command injection, access control, and business logic flaws
- Evidence-rich reporting: Findings include curl requests, responses, commands, and step-by-step reasoning
- Real-time activity streaming: Agent thinking and tool calls are streamed as scan events so you can follow the pentest live
How It Is Used
The Multi-Agent Pentest runs automatically after the initial crawling phase. You cannot enable, disable, or tune the orchestrator directly from this page.
Use the AI Pentesting stepper to influence the orchestrator:
- Scope mode: Use Standard when related discovered assets can be included in the pentest, or Strict when testing must stay limited to the listed URLs.
- Scope restrictions: Keep crawling or active API testing away from destructive, sensitive, or out-of-scope paths.
- Authentication: Add users and natural-language sign-in instructions when protected surfaces matter.
- Fine-Tune (Optional) > Context: Add high-value endpoints or workflows, areas to avoid, vulnerability classes to focus on, authentication quirks, session quirks, and known technologies.
- Fine-Tune (Optional) > Duration: Control the maximum scan duration and rate limit from the stepper.
- Fine-Tune (Optional) > Artifacts: Attach supporting files, such as pentest reports, documentation, OpenAPI exports, or screenshots.
The orchestrator receives the crawler output, the configured users, the context, the scope, the duration budget, and the attached artifacts. It then coordinates child agents and prioritizes work within those boundaries.
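The difference between the two scope modes can be illustrated with a small sketch. The function name `in_scope`, the mode strings, and the matching rules are assumptions made for illustration only: Strict is modelled here as "same host and under a listed URL's path", and Standard as "also allow the same host or its subdomains". The product's actual definition of a related asset may differ.

```python
from typing import List
from urllib.parse import urlsplit


def in_scope(candidate: str, listed_urls: List[str], mode: str) -> bool:
    """Return True if `candidate` may be tested under the given scope mode.

    Hypothetical rules (not taken from the product):
    - "strict": candidate must share the host of a listed URL and sit
      at or below that URL's path.
    - "standard": additionally admits anything on a listed host or on
      one of its subdomains, as a stand-in for "related assets".
    """
    if mode not in ("strict", "standard"):
        raise ValueError(f"unknown scope mode: {mode!r}")

    cand = urlsplit(candidate)
    cand_host = (cand.hostname or "").lower()

    for listed in listed_urls:
        base = urlsplit(listed)
        base_host = (base.hostname or "").lower()
        if not base_host:
            continue  # skip malformed entries

        same_host = cand_host == base_host
        base_path = base.path.rstrip("/")
        under_path = (
            cand.path == base_path or cand.path.startswith(base_path + "/")
        )

        if same_host and under_path:
            return True
        if mode == "standard" and (
            same_host or cand_host.endswith("." + base_host)
        ):
            return True
    return False
```

With `["https://app.example.com/api"]` as the listed URLs, a path such as `/admin` on the same host is rejected in this sketch's Strict mode but accepted in Standard mode.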
Vulnerability Categories
Findings are automatically classified into one of the following categories:
| Category | Examples |
|---|---|
| XSS | Reflected, stored, and DOM-based cross-site scripting |
| SQL Injection | Error-based, union-based, blind, and time-based SQL injection |
| SSRF | Server-side request forgery, internal service access |
| Command Injection | OS command injection, remote code execution |
| Access Control | IDOR, privilege escalation, authentication bypass, broken authorization |
| Business Logic | Workflow manipulation, race conditions, state tampering |
Requirements
- Reachable target: The scan must be able to reach the target URL
- Supported scans: Web applications, REST APIs, and GraphQL APIs
- Authentication (optional): Configure when important surfaces are behind login
API scans (REST / GraphQL)
For API scans, the engine targets the API base URL (no browser workflow). A schema artifact is pre-seeded inside the sandbox workspace so the agents can enumerate the attack surface before probing it:
- REST: the OpenAPI specification (when available through the scan) is written to `/workspace/schema.openapi.json`. If no OpenAPI spec is attached, the known endpoint list is written to `/workspace/schema.endpoints.json`.
- GraphQL: the queries / mutations / subscriptions metadata is written to `/workspace/schema.operations.json`, and the full SDL is written to `/workspace/schema.graphql` when available.
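As a rough illustration of how the pre-seeded files could be located, the sketch below checks a workspace directory for the file names listed above. The helper `find_schema_files`, the `api_kind` strings, and the preference order are hypothetical. How the agents actually read these files is not documented here.

```python
from pathlib import Path
from typing import List

# File names come from the list above; the ordering is an assumption.
REST_CANDIDATES = ["schema.openapi.json", "schema.endpoints.json"]
GRAPHQL_CANDIDATES = ["schema.operations.json", "schema.graphql"]


def find_schema_files(workspace: Path, api_kind: str) -> List[Path]:
    """Return the schema files present in `workspace` for the API kind.

    `api_kind` is a hypothetical label: "rest" or "graphql".
    """
    if api_kind == "rest":
        names = REST_CANDIDATES
    elif api_kind == "graphql":
        names = GRAPHQL_CANDIDATES
    else:
        raise ValueError(f"unsupported API kind: {api_kind!r}")

    # Keep only the files that were actually seeded.
    return [workspace / name for name in names if (workspace / name).is_file()]
```

Inside the sandbox this would be called as `find_schema_files(Path("/workspace"), "rest")`. An empty result means no schema artifact was seeded for that API kind.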
Limitations
- Coverage depends on what the agents can discover and reach within the scan timeout
- The default scan timeout is 6 hours and can be adjusted in the stepper up to 24 hours
- Agents stay on the configured target domain and do not navigate to external sites
- Context improves focus but does not replace scan scope or authentication setup
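The timeout bounds above (6 hours by default, adjustable up to 24 hours) can be expressed as a small helper. This is a sketch under assumptions: the function name is hypothetical, and whether the stepper clamps an oversized value or rejects it is not stated, so clamping is chosen here purely for illustration.

```python
from typing import Optional

DEFAULT_TIMEOUT_HOURS = 6.0  # default stated in the limitations above
MAX_TIMEOUT_HOURS = 24.0     # upper bound stated in the limitations above


def effective_timeout_hours(requested: Optional[float] = None) -> float:
    """Return the timeout that would apply for a requested duration.

    Assumption: values above the maximum are clamped rather than rejected.
    """
    if requested is None:
        return DEFAULT_TIMEOUT_HOURS
    if requested <= 0:
        raise ValueError("timeout must be a positive number of hours")
    return min(float(requested), MAX_TIMEOUT_HOURS)
```

Because coverage depends on what the agents can reach within this budget, a larger value mainly helps on targets with a wide attack surface.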
Related Documentation
- How It Works: Understanding AI pentesting capabilities
- XSS Agent: Dedicated XSS testing agent
- SQLI Agent: Dedicated SQL injection testing agent
- BOLA Agent: Authorization testing agent
- Business Logic Agent: Business workflow testing agent
- Authentication: Set up authentication for scans