Agentic Crawling in WebApp DAST
Crawling a web application is a complex task: the crawler must understand the available actions, how actions cause and chain into one another, which input formats are accepted, how to recover from errors, and more.
We have built special capabilities into the WebApp DAST engine to boost your application crawling coverage by using LLM agents and giving them control of each page we detect.
Natural Language Crawling Instructions
Every web application has its quirks and specific business logic. You can directly influence the crawler's efficiency and success by guiding it with simple natural language instructions.
You can enable the agentic crawling feature via the following configuration:
frontend_dast:
  agentic_crawling:
    enabled: true
    instructions: >
      Do not change the user password. If you are logged out, log back in by
      using user@example.com and the password is helloworld.
      Make sure to search, create, delete objects on each page to fully test
      each feature. If you are on the Escape Private Locations page, try
      creating a new private location named "hello world", and delete it.
This configuration guides the LLM agent to help the scan re-authenticate if it gets logged out, and to perform specific actions on a specific page you might want to target.
Clear instructions and additional helpful information like this yield better API coverage.
For example:
- a special employee ID you might need in a form.
- a specific user flow on a specific page (combine this with hotstart to guide it even further!)
- allowing a deletion action that the agent might usually avoid (its default instructions are to avoid destructive actions)
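As an illustration of the cases listed above, the following is a minimal sketch of an instructions block that supplies a form value and explicitly permits a deletion. The keys are the same ones shown in the configuration earlier on this page; the employee ID `EMP-0001` is a hypothetical placeholder, and the hotstart setting is left out because its exact configuration is not covered here.

```yaml
frontend_dast:
  agentic_crawling:
    enabled: true
    # EMP-0001 is a made-up placeholder; replace it with an ID that is
    # valid in your own application.
    # The deletion is allowed explicitly because the agent's default
    # instructions are to avoid destructive actions.
    instructions: >
      When a form asks for an employee ID, use EMP-0001.
      On the Escape Private Locations page, create a new private location
      named "hello world". You are allowed to delete it afterwards.
```

Keeping each instruction short and tied to a named page or form should make it easier to match what the agent did against what you asked for when you review the logs.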
Reviewing results
You can review the agentic crawling logs by searching your scan logs in the "Logs" tab.
There you can view the agent's reasoning, actions, and screenshots from the scan.
Search for Agentic Page Crawler or, even better, use the "Stage" filter: add Agentic Actions to review tool calls, clicks, and other interactions with the page, and Agentic Reasoning to review the agent's reasoning and thinking while it crawls the pages.
Figure 1: Agentic Page Crawler Logs
Figure 2: Agentic Page Crawler Reasoning
Figure 3: Agentic Page Crawler Reasoning
And you should see that the agent succeeded!
From a natural language instruction
Figure 4: Agentic Page Crawler Configuration
To a proof of actions performed, with screenshots
Figure 5: Agentic Page Crawler successfully executed the task
And a final output of the crawler that will summarize what was done.
Figure 6: Agentic Page Crawler final summary output