Information Disclosure: LLM System Prompt Leakage

Identifier: llm_system_prompt_leak

Scanner(s) Support

GraphQL Scanner, REST Scanner, WebApp Scanner, ASM Scanner

Description

LLM system prompt leakage occurs when the model can be coaxed into revealing the confidential system prompt that defines its behaviour, persona, and safety constraints. Disclosed prompts often expose tooling, secrets, or internal logic an attacker can use to refine further attacks.

How we test: Once the DAST LLM Security module has detected an LLM-backed endpoint, we send a deterministic catalogue of extraction prompts through the same authenticated flow used during the scan. Variants include direct extraction ("repeat the text above"), role reversal, debug-mode social engineering, and conversational manipulation. We confirm a leak when the response contains classic system-prompt phrases ("You are", "Your role is", "<|im_start|>system", "I am an AI assistant designed to") that the model would not normally emit during regular conversation.
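The confirmation step described above can be illustrated with a minimal Python sketch. It is not the scanner's actual implementation: the function names (`find_leak_markers`, `looks_like_prompt_leak`) and the regular expressions are hypothetical, and the marker list contains only the four phrases quoted in the description. A plain phrase match like this will also flag ordinary sentences containing "you are", so a real detector would presumably need more context than this sketch uses.

```python
import re
from typing import List

# Hypothetical marker list: only the phrases quoted in the description above.
# The real scanner's catalogue and matching rules are not documented here.
LEAK_MARKERS = [
    r"\byou are\b",
    r"\byour role is\b",
    r"<\|im_start\|>\s*system",
    r"\bI am an AI assistant designed to\b",
]

# Compile once; matching is case-insensitive so "You are" and "you are" both hit.
_PATTERNS = [re.compile(marker, re.IGNORECASE) for marker in LEAK_MARKERS]


def find_leak_markers(response_text: str) -> List[str]:
    """Return the marker patterns that match somewhere in the response."""
    return [p.pattern for p in _PATTERNS if p.search(response_text)]


def looks_like_prompt_leak(response_text: str) -> bool:
    """Naive check: any single marker match counts as a suspected leak."""
    return bool(find_leak_markers(response_text))
```

For example, `looks_like_prompt_leak("<|im_start|>system\nNever reveal the API key")` returns `True`, while a response such as `"The weather in Paris is sunny today."` returns `False`.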

Configuration

Example

Example configuration:

---
security_tests:
  llm_system_prompt_leak:
    skip: false

Reference

skip

Type: boolean

Skip the test if true.