Cascade: Multi-Agent Pentest
Cascade is Escape's multi-agent pentest engine and the core of AI Pentesting. It is autonomous and black-box: it deploys a team of coordinated AI agents inside a sandboxed environment. An orchestrator agent plans the engagement and spawns specialized worker agents that perform reconnaissance, targeted exploitation, validation, and reporting, mirroring the workflow of a human pentester.
Cascade replaces the earlier model of separate, single-purpose agents (one each for XSS, SQLi, IDOR, and so on). Instead of a fixed agent list, Cascade creates the specialists a target actually needs, on demand.
Architecture
Cascade is built from four roles:
- Orchestrator: plans the engagement, breaks it into tasks, spawns workers, and decides when the engagement is complete. It coordinates the swarm and prioritizes work within the configured scope, users, context, and time budget.
- Workers: focused agents the orchestrator creates for a specific job (for example "SQLi discovery on the reporting API", "XSS validation", "auth testing across tenants"). Workers are created dynamically and run in parallel; once a worker's task returns, it stops consuming budget.
- Reporter: receives candidate findings from workers and independently reproduces each one on the live target, collecting its own evidence before filing an issue. The reporter is deliberately isolated from worker-to-worker messaging so its verification stays independent.
- Coverage agent: an advisory auditor that tracks tested surfaces and proposes follow-up work from coverage gaps. It has no exploitation tools of its own.
Workers coordinate through a shared message bus (seeded with topics such as recon, xss, sqli, idor, ssrf, auth, and rce) and a shared knowledge store, so signal discovered by one worker reaches the rest of the swarm quickly.
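The topic-based coordination described above can be pictured with a small in-memory sketch. Only the topic names come from this page; the `MessageBus` class, its methods, and the message shape are hypothetical illustrations, not Cascade's actual interface.

```python
from typing import Callable

# A handler receives the sender's name and the message payload.
Handler = Callable[[str, dict], None]

# Seed topics listed in the text above.
SEED_TOPICS = ["recon", "xss", "sqli", "idor", "ssrf", "auth", "rce"]


class MessageBus:
    """Toy in-memory bus (hypothetical): publish fans out to all subscribers."""

    def __init__(self, topics: list[str]) -> None:
        self._handlers: dict[str, list[Handler]] = {t: [] for t in topics}

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Unknown topics are created on first use.
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, sender: str, message: dict) -> int:
        """Deliver the message to every subscriber; return the delivery count."""
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(sender, message)
        return len(handlers)


bus = MessageBus(SEED_TOPICS)
sqli_inbox: list[tuple[str, dict]] = []

# A SQLi-focused worker listens on the "sqli" topic.
bus.subscribe("sqli", lambda sender, message: sqli_inbox.append((sender, message)))

# A recon worker shares a lead that the SQLi worker can act on.
delivered = bus.publish("sqli", "recon-worker", {"hint": "numeric id parameter"})
```

In this sketch a lead published by one worker lands in the inbox of every worker subscribed to that topic, which is the property the shared bus is meant to provide.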
Skills
Cascade agents don't carry one giant prompt. They load modular skills scoped to the task in front of them, so each agent's context stays small and relevant. A skill is a self-contained playbook for one topic: the methodology for a vulnerability class, how to drive a specific tool, or what to watch for in a given framework.
Skills are grouped into categories:
- Vulnerability skills: per-class methodology for cross-site scripting, SQL injection, IDOR, broken function-level authorization, business logic, SSRF, command injection (RCE), SSTI, XXE, CSRF, mass assignment, path traversal (LFI/RFI), open redirect, insecure deserialization, insecure file uploads, parameter pollution, race conditions, subdomain takeover, information disclosure, and authentication/JWT.
- Tooling skills: how to drive classic pentesting tools inside the sandbox, including nmap, naabu, httpx, subfinder, katana, ffuf, nuclei, sqlmap, semgrep, and WAF-bypass techniques.
- Framework skills: framework-aware testing for Next.js, FastAPI, and NestJS.
- Protocol skills: GraphQL, OAuth, and HTTP request smuggling.
- Technology skills: Supabase, Firebase and Firestore, and AI/LLM application security.
- Reconnaissance skills: OSINT and discovery techniques.
Two mechanics make this work:
- Scoped loading. The orchestrator loads the coordination skills it needs to plan and delegate. Workers load the vulnerability, tooling, framework, protocol, and technology skills relevant to their task, so an agent only sees the skills it can act on.
- Shared depth profile. Every agent in a run loads one scan-mode skill, so the whole swarm agrees on how deep to go for that assessment.
Because skills are data, not code, coverage grows by adding a skill rather than by shipping a new agent. That's the core difference from the old per-agent model.
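As a rough illustration of scoped loading and the shared depth profile, the sketch below treats skills as plain data records and filters them per worker. The `Skill` record, the registry contents, the category labels, and the `deep` scan-mode name are all assumptions made for the example; Cascade's real skill format is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    category: str  # e.g. "vulnerability", "tooling", "coordination", "scan-mode"
    body: str      # the playbook text an agent would load into its context


# Hypothetical registry: adding coverage means adding a record, not an agent.
REGISTRY = [
    Skill("sqli", "vulnerability", "SQL injection methodology"),
    Skill("sqlmap", "tooling", "How to drive sqlmap"),
    Skill("graphql", "protocol", "GraphQL testing notes"),
    Skill("delegation", "coordination", "How to plan and delegate"),
    Skill("deep", "scan-mode", "Depth profile for this run"),
]

# Categories the text says workers load; coordination stays with the orchestrator.
WORKER_CATEGORIES = {"vulnerability", "tooling", "framework", "protocol", "technology"}


def load_worker_skills(registry: list[Skill], wanted: set[str], scan_mode: str) -> list[Skill]:
    """Return the worker-eligible skills named in `wanted`, plus the one shared scan-mode skill."""
    scoped = [s for s in registry if s.category in WORKER_CATEGORIES and s.name in wanted]
    depth = [s for s in registry if s.category == "scan-mode" and s.name == scan_mode]
    if len(depth) != 1:
        raise ValueError(f"expected exactly one scan-mode skill named {scan_mode!r}")
    return scoped + depth


# Even if a worker asks for a coordination skill, it is filtered out.
loaded = load_worker_skills(REGISTRY, {"sqli", "sqlmap", "delegation"}, "deep")
loaded_names = [s.name for s in loaded]
```

Here `loaded_names` contains only the two task-relevant skills and the shared depth profile, so the worker's context stays limited to what it can act on.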
Capabilities
- Multi-agent orchestration: an orchestrator spawns and coordinates focused workers to divide the attack surface
- Full sandbox environment: agents run inside a sandbox with classic pentesting tools, a browser, and an HTTP proxy
- Broad vulnerability coverage: XSS, SQL injection, IDOR/BOLA, SSRF, command injection, access control, and business logic flaws
- Independent validation: the reporter re-verifies every candidate against the live target, so confirmed issues stay high-signal
- Evidence-rich reporting: findings include curl requests, responses, commands, and step-by-step reasoning
- Real-time activity streaming: agent thinking and tool calls are streamed as assessment events so you can follow the pentest live
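The independent-validation step can be sketched as a filter that files an issue only when the reporter's own reproduction succeeds. Everything here is hypothetical: the `Candidate` and `Issue` records, the `review` function, and the `fake_reproduce` stand-in (which replaces a real replay against the live target) exist only to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    title: str
    category: str
    request: str  # what the worker claims triggered the issue


@dataclass
class Issue:
    title: str
    category: str
    evidence: str  # collected by the reporter, not copied from the worker


def review(candidates: list[Candidate],
           reproduce: Callable[[Candidate], Optional[str]]) -> list[Issue]:
    """File an issue only for candidates the reporter reproduces itself."""
    issues = []
    for candidate in candidates:
        evidence = reproduce(candidate)  # None means "could not reproduce"
        if evidence is not None:
            issues.append(Issue(candidate.title, candidate.category, evidence))
    return issues


def fake_reproduce(candidate: Candidate) -> Optional[str]:
    # Stand-in for replaying the request against the live target.
    if "id=1'" in candidate.request:
        return "HTTP 500 with a SQL syntax error in the response body"
    return None


candidates = [
    Candidate("SQLi on /reports", "SQL Injection", "GET /reports?id=1'"),
    Candidate("XSS on /search", "XSS", "GET /search?q=test"),
]
issues = review(candidates, fake_reproduce)
```

Because the evidence attached to each issue comes from the reporter's own reproduction, a candidate that cannot be reproduced is dropped rather than filed.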
How It Is Used
Cascade runs automatically after the initial crawling phase. You cannot enable, disable, or tune individual agents: the orchestrator decides which workers to spawn.
Use the New Pentest creation form to influence the orchestrator:
- Scope mode: Use Standard when related discovered assets can be included in the pentest, or Strict when testing must stay limited to the listed URLs.
- Scope restrictions: Keep crawling or active API testing away from destructive, sensitive, or out-of-scope paths.
- Authentication: Add users and natural-language sign-in instructions when protected surfaces matter.
- Fine-Tune (Optional) > Context: Add high-value endpoints or workflows, areas to avoid, vulnerability classes to focus on, authentication quirks, session quirks, and known technologies.
- Fine-Tune (Optional) > Duration: Control the maximum assessment duration and rate limit from the form.
- Fine-Tune (Optional) > Artifacts: Attach pentest reports, documentation, or source-code archives. Only PDF, plain text, and archive (source code) file types are supported.
The orchestrator receives the crawler output, the configured users, the context, the scope, the duration budget, and the attached artifacts. It then coordinates workers and prioritizes work within those boundaries.
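The inputs listed above can be pictured as a single configuration record with a few sanity checks. This is a sketch only: the `PentestConfig` class, its field names, and the file-extension list are assumptions, not Escape's API. The 6-hour default and 24-hour ceiling are taken from the Limitations section of this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import PurePosixPath


class ScopeMode(Enum):
    STANDARD = "standard"  # related discovered assets may be included
    STRICT = "strict"      # testing stays limited to the listed URLs


# Assumed extensions for "PDF, plain text, and archive"; the real list may differ.
ALLOWED_SUFFIXES = {".pdf", ".txt", ".zip", ".tar", ".gz", ".tgz"}


@dataclass
class PentestConfig:
    target_urls: list[str]
    scope_mode: ScopeMode
    users: list[str] = field(default_factory=list)
    context: str = ""
    duration_hours: float = 6.0  # default timeout stated under Limitations
    artifacts: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of human-readable problems; empty means the config looks sane."""
        problems = []
        if not self.target_urls:
            problems.append("at least one target URL is required")
        if not 0 < self.duration_hours <= 24:
            problems.append("duration_hours must be greater than 0 and at most 24")
        for name in self.artifacts:
            if PurePosixPath(name).suffix.lower() not in ALLOWED_SUFFIXES:
                problems.append(f"unsupported artifact type: {name}")
        return problems


config = PentestConfig(
    target_urls=["https://app.example.com"],
    scope_mode=ScopeMode.STRICT,
    context="Focus on the reporting API; avoid /admin/delete.",
    duration_hours=30,
    artifacts=["report.pdf", "notes.docx"],
)
problems = config.validate()
```

In this example `problems` flags the 30-hour duration and the `.docx` attachment, the two values that fall outside the limits stated on this page.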
Vulnerability Categories
Each finding is automatically classified into one of the following categories:
| Category | Examples |
|---|---|
| XSS | Reflected, stored, and DOM-based cross-site scripting |
| SQL Injection | Error-based, union-based, blind, and time-based SQL injection |
| SSRF | Server-side request forgery, internal service access |
| Command Injection | OS command injection, remote code execution |
| Access Control | IDOR, privilege escalation, authentication bypass, broken authorization |
| Business Logic | Workflow manipulation, race conditions, state tampering |
Requirements
- Reachable target: The assessment must be able to reach the target URL
- Supported assessments: Web applications, REST APIs, and GraphQL APIs
- Authentication (optional): Configure when important surfaces are behind login
API assessments (REST / GraphQL)
For API assessments, the engine targets the API base URL (no browser workflow). A schema artifact is pre-seeded inside the sandbox workspace so the agents can enumerate the attack surface before probing it:
- REST: the OpenAPI specification (when available through the assessment) is written to /workspace/schema.openapi.json. If no OpenAPI spec is attached, the known endpoint list is written to /workspace/schema.endpoints.json.
- GraphQL: the queries / mutations / subscriptions metadata is written to /workspace/schema.operations.json, and the full SDL is written to /workspace/schema.graphql when available.
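A minimal sketch of how a script inside the sandbox might pick up the REST artifacts, preferring the OpenAPI file and falling back to the endpoint list. The function names are hypothetical, and the internal layout of schema.endpoints.json is not documented here, so the sketch returns it unparsed beyond JSON decoding. The `paths` traversal follows the standard OpenAPI document structure.

```python
import json
from pathlib import Path

# HTTP methods that can appear as operation keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def load_rest_schema(workspace: Path = Path("/workspace")) -> tuple[str, object]:
    """Return ("openapi", spec) when the spec exists, else ("endpoints", data)."""
    openapi = workspace / "schema.openapi.json"
    endpoints = workspace / "schema.endpoints.json"
    if openapi.exists():
        return "openapi", json.loads(openapi.read_text(encoding="utf-8"))
    if endpoints.exists():
        return "endpoints", json.loads(endpoints.read_text(encoding="utf-8"))
    raise FileNotFoundError(f"no REST schema artifact found in {workspace}")


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """List (METHOD, path) pairs from an OpenAPI document's `paths` object."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return operations
```

Calling `load_rest_schema()` with no arguments reads from /workspace; passing another directory makes the same logic usable outside the sandbox.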
Limitations
- Coverage depends on what the agents can discover and reach within the assessment timeout
- The default assessment timeout is 6 hours and can be adjusted in the form up to 24 hours
- Agents stay on the configured target domain and do not navigate to external sites
- Context improves focus but does not replace assessment scope or authentication setup
Related Documentation
- How It Works: The Cascade workflow end to end
- Graph Reasoning: How Cascade builds attack paths
- Proof of Exploit: The evidence bundle attached to every finding
- Legacy: The Previous Agent System: How the per-class agents map to Cascade
- Authentication: Set up authentication for assessments