Cascade: Multi-Agent Pentest

Cascade is Escape's multi-agent pentest engine and the core of AI Pentesting. It is autonomous and black-box: it deploys a team of coordinated AI agents inside a sandboxed environment. An orchestrator agent plans the engagement and spawns specialized worker agents that perform reconnaissance, targeted exploitation, validation, and reporting, mirroring the workflow of a human pentester.

Cascade replaces the earlier model of separate, single-purpose agents (one each for XSS, SQLi, IDOR, and so on). Instead of a fixed agent list, Cascade creates the specialists a target actually needs, on demand.

Architecture

Cascade is built from four roles:

Orchestrator: plans the engagement, breaks it into tasks, spawns workers, and decides when the engagement is complete. It coordinates the swarm and prioritizes work within the configured scope, users, context, and time budget.
Workers: focused agents the orchestrator creates for a specific job (for example "SQLi discovery on the reporting API", "XSS validation", "auth testing across tenants"). Workers are created dynamically and run in parallel; once a worker's task returns, it stops consuming budget.
Reporter: receives candidate findings from workers and independently reproduces each one on the live target, collecting its own evidence before filing an issue. The reporter is deliberately isolated from worker-to-worker messaging so its verification stays independent.
Coverage agent: an advisory auditor that tracks tested surfaces and proposes follow-up work from coverage gaps. It has no exploitation tools of its own.

Workers coordinate through a shared message bus (seeded with topics such as recon, xss, sqli, idor, ssrf, auth, and rce) and a shared knowledge store, so signal discovered by one worker reaches the rest of the swarm quickly.
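As a rough illustration of how topic-based coordination can work, here is a minimal in-process publish/subscribe sketch. Only the topic names come from the text above; the MessageBus and Worker classes, their methods, and the worker names are hypothetical and are not Cascade's actual implementation. The shared knowledge store is not modeled.

```python
from collections import defaultdict

# Topic names taken from the text above; everything else is illustrative.
SEED_TOPICS = ("recon", "xss", "sqli", "idor", "ssrf", "auth", "rce")


class MessageBus:
    """Toy publish/subscribe bus keyed by topic (hypothetical)."""

    def __init__(self, topics=SEED_TOPICS):
        self.topics = set(topics)
        self.handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        if topic not in self.topics:
            raise ValueError(f"unknown topic: {topic}")
        self.handlers[topic].append(handler)

    def publish(self, topic, sender, payload):
        # Deliver the message to every handler subscribed to this topic.
        for handler in self.handlers[topic]:
            handler(sender, payload)


class Worker:
    """Toy worker that listens on some topics and can share signal."""

    def __init__(self, name, bus, topics):
        self.name = name
        self.bus = bus
        self.inbox = []
        for topic in topics:
            bus.subscribe(topic, self._receive)

    def _receive(self, sender, payload):
        # Ignore our own messages; keep everything else.
        if sender != self.name:
            self.inbox.append((sender, payload))

    def share(self, topic, payload):
        self.bus.publish(topic, self.name, payload)
```

In this sketch, a recon worker that publishes a discovered endpoint on the recon topic reaches every other worker subscribed to that topic, which is the "signal reaches the rest of the swarm" behavior described above.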

Skills

Cascade agents don't carry one giant prompt. They load modular skills scoped to the task in front of them, so each agent's context stays small and relevant. A skill is a self-contained playbook for one topic: the methodology for a vulnerability class, how to drive a specific tool, or what to watch for in a given framework.

Skills are grouped into categories:

Vulnerability skills: per-class methodology for cross-site scripting, SQL injection, IDOR, broken function-level authorization, business logic, SSRF, command injection (RCE), SSTI, XXE, CSRF, mass assignment, path traversal (LFI/RFI), open redirect, insecure deserialization, insecure file uploads, parameter pollution, race conditions, subdomain takeover, information disclosure, and authentication/JWT.
Tooling skills: how to drive classic pentesting tools inside the sandbox, including nmap, naabu, httpx, subfinder, katana, ffuf, nuclei, sqlmap, semgrep, and WAF-bypass techniques.
Framework skills: framework-aware testing for Next.js, FastAPI, and NestJS.
Protocol skills: GraphQL, OAuth, and HTTP request smuggling.
Technology skills: Supabase, Firebase and Firestore, and AI/LLM application security.
Reconnaissance skills: OSINT and discovery techniques.

Two mechanics make this work:

Scoped loading. The orchestrator loads the coordination skills it needs to plan and delegate. Workers load the vulnerability, tooling, framework, protocol, and technology skills relevant to their task, so an agent only sees the skills it can act on.
Shared depth profile. Every agent in a run loads one scan-mode skill, so the whole swarm agrees on how deep to go for that assessment.

Because skills are data, not code, coverage grows by adding a skill rather than by shipping a new agent. That's the core difference from the old per-agent model.
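The "skills are data" idea can be sketched as a small registry plus a selection function. This is an assumption-laden illustration: the Skill record, the tag-matching rule, the skill identifiers, and the "deep" scan-mode name are all hypothetical, since the page does not describe how skills are stored or matched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    category: str         # e.g. "vulnerability", "tooling", "framework"
    tags: frozenset       # hypothetical matching keys, not a real schema


# Categories that workers load, per the text above.
WORKER_CATEGORIES = {
    "vulnerability", "tooling", "framework", "protocol", "technology",
}

# A tiny illustrative registry; real skill names and fields are unknown.
REGISTRY = [
    Skill("sql-injection", "vulnerability", frozenset({"sqli"})),
    Skill("sqlmap", "tooling", frozenset({"sqli"})),
    Skill("cross-site-scripting", "vulnerability", frozenset({"xss"})),
    Skill("graphql", "protocol", frozenset({"graphql"})),
]


def skills_for_worker(task_tags, scan_mode, registry=REGISTRY):
    """Return the skill names a worker would load for one task.

    Scoped loading: only skills whose tags overlap the task are kept.
    Shared depth profile: the same scan-mode skill is appended for
    every agent in the run.
    """
    selected = [
        skill.name
        for skill in registry
        if skill.category in WORKER_CATEGORIES and skill.tags & set(task_tags)
    ]
    return selected + [f"scan-mode:{scan_mode}"]
```

Under this model, extending coverage means appending one more Skill entry to the registry; the selection code does not change.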

Capabilities

Multi-agent orchestration: an orchestrator spawns and coordinates focused workers to divide the attack surface
Full sandbox environment: agents run inside a sandbox with classic pentesting tools, a browser, and an HTTP proxy
Broad vulnerability coverage: XSS, SQL injection, IDOR/BOLA, SSRF, command injection, access control, and business logic flaws
Independent validation: the reporter re-verifies every candidate against the live target, so confirmed issues stay high-signal
Evidence-rich reporting: findings include curl requests, responses, commands, and step-by-step reasoning
Real-time activity streaming: agent thinking and tool calls are streamed as assessment events so you can follow the pentest live

How It Is Used

Cascade runs automatically after the initial crawling phase. You cannot enable, disable, or tune individual agents: the orchestrator decides which workers to spawn.

Use the New Pentest creation form to influence the orchestrator:

Scope mode: Use Standard when related discovered assets can be included in the pentest, or Strict when testing must stay limited to the listed URLs.
Scope restrictions: Keep crawling or active API testing away from destructive, sensitive, or out-of-scope paths.
Authentication: Add users and natural-language sign-in instructions when protected surfaces matter.
Fine-Tune (Optional) > Context: Add high-value endpoints or workflows, areas to avoid, vulnerability classes to focus on, authentication quirks, session quirks, and known technologies.
Fine-Tune (Optional) > Duration: Control the maximum assessment duration and rate limit from the form.
Fine-Tune (Optional) > Artifacts: Attach pentest reports, documentation, or source-code archives. Only PDF, plain text, and archive (source code) file types are supported.

The orchestrator receives the crawler output, the configured users, the context, the scope, the duration budget, and the attached artifacts. It then coordinates workers and prioritizes work within those boundaries.
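To make the scope boundaries concrete, the following sketch models a subset of the form inputs and a scope check. The field names, the related_hosts set, and the matching rules are assumptions made for illustration; the page does not specify how Standard mode decides that an asset is "related", so here that decision is simply supplied as data.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class PentestInputs:
    """Hypothetical model of a few New Pentest form inputs."""
    listed_urls: list
    scope_mode: str = "standard"  # "standard" or "strict"
    # Assumption: hosts already judged "related" by some discovery step.
    related_hosts: set = field(default_factory=set)
    # Scope restrictions, modeled here as blocked path prefixes.
    blocked_prefixes: list = field(default_factory=list)
    context: str = ""


def _matches_listed(listed, url):
    # Same scheme, host, and port, and the path extends the listed path.
    # The path comparison is a plain prefix check.
    a, b = urlsplit(listed), urlsplit(url)
    same_origin = (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)
    return same_origin and b.path.startswith(a.path)


def in_scope(inputs, url):
    """Decide whether a URL may be tested under the given inputs."""
    parts = urlsplit(url)
    if any(parts.path.startswith(p) for p in inputs.blocked_prefixes):
        return False  # scope restrictions always win
    if any(_matches_listed(listed, url) for listed in inputs.listed_urls):
        return True
    if inputs.scope_mode == "strict":
        return False  # Strict: listed URLs only
    return parts.hostname in inputs.related_hosts  # Standard: related assets
```

Comparing parsed hosts rather than raw string prefixes avoids treating a lookalike host such as app.example.com.evil.test as part of app.example.com.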

Vulnerability Categories

Findings are automatically classified into one of the following categories:

Category	Examples
XSS	Reflected, stored, and DOM-based cross-site scripting
SQL Injection	Error-based, union-based, blind, and time-based SQL injection
SSRF	Server-side request forgery, internal service access
Command Injection	OS command injection, remote code execution
Access Control	IDOR, privilege escalation, authentication bypass, broken authorization
Business Logic	Workflow manipulation, race conditions, state tampering
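If you consume findings programmatically, the table above can be expressed as a simple lookup. This is only a restatement of the table: the Category enum, the lowercase keys, and the classify helper are hypothetical, and the page does not describe how Cascade itself assigns categories.

```python
from enum import Enum
from typing import Optional


class Category(Enum):
    XSS = "XSS"
    SQL_INJECTION = "SQL Injection"
    SSRF = "SSRF"
    COMMAND_INJECTION = "Command Injection"
    ACCESS_CONTROL = "Access Control"
    BUSINESS_LOGIC = "Business Logic"


# A few examples from the table, keyed by a lowercase label (illustrative).
EXAMPLES = {
    "reflected xss": Category.XSS,
    "stored xss": Category.XSS,
    "blind sql injection": Category.SQL_INJECTION,
    "server-side request forgery": Category.SSRF,
    "os command injection": Category.COMMAND_INJECTION,
    "idor": Category.ACCESS_CONTROL,
    "privilege escalation": Category.ACCESS_CONTROL,
    "race condition": Category.BUSINESS_LOGIC,
}


def classify(label: str) -> Optional[Category]:
    """Map an example label to its category, or None if it is not listed."""
    return EXAMPLES.get(label.strip().lower())
```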

Requirements

Reachable target: The assessment must be able to reach the target URL
Supported assessments: Web applications, REST APIs, and GraphQL APIs
Authentication (optional): Configure when important surfaces are behind login

API assessments (REST / GraphQL)

For API assessments, the engine targets the API base URL, with no browser workflow. A schema artifact is pre-seeded in the sandbox workspace so the agents can enumerate the attack surface before probing it:

REST: the OpenAPI specification (when available through the assessment) is written to /workspace/schema.openapi.json. If no OpenAPI specification is available, the known endpoint list is written to /workspace/schema.endpoints.json.
GraphQL: metadata for queries, mutations, and subscriptions is written to /workspace/schema.operations.json, and the full SDL is written to /workspace/schema.graphql when available.
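The sketch below shows one way a script inside the sandbox might check which of these files are present and list operations from the OpenAPI file. The file names come from the text above; the function names are hypothetical. Only the OpenAPI file is parsed, using the standard paths object, because the internal layout of schema.endpoints.json and schema.operations.json is not documented here.

```python
import json
from pathlib import Path

# File names from the text above.
SCHEMA_FILES = (
    "schema.openapi.json",
    "schema.endpoints.json",
    "schema.operations.json",
    "schema.graphql",
)

# HTTP method keys allowed in an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def find_schema_artifacts(workspace="/workspace"):
    """Return the names of the schema files that exist in the workspace."""
    root = Path(workspace)
    return [name for name in SCHEMA_FILES if (root / name).is_file()]


def openapi_operations(workspace="/workspace"):
    """List (METHOD, path) pairs declared in schema.openapi.json."""
    spec_path = Path(workspace) / "schema.openapi.json"
    spec = json.loads(spec_path.read_text(encoding="utf-8"))
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-method keys such as "parameters" or "summary".
            if key in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)
```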

Limitations

Coverage depends on what the agents can discover and reach within the assessment timeout
The default assessment timeout is 6 hours and can be adjusted in the form up to 24 hours
Agents stay on the configured target domain and do not navigate to external sites
Context improves focus but does not replace assessment scope or authentication setup

How It Works: The Cascade workflow end to end
Graph Reasoning: How Cascade builds attack paths
Proof of Exploit: The evidence bundle attached to every finding
Legacy: The Previous Agent System: How the per-class agents map to Cascade
Authentication: Set up authentication for assessments
