Skip to content

Injection: LLM JailBreak

Identifier: llm_jail_break

Scanner(s) Support

GraphQL Scanner REST Scanner WebApp Scanner

Description

Basically, jailbreaking is when someone figures out a way to trick a large language model into doing things it shouldn't do. The vulnerability comes from crafty inputs that bypass built-in restrictions, allowing the model to generate harmful or unintended content. This is dangerous because it can lead to misuselike spreading misinformation or even aiding in cyberattacksand it undermines the safety measures you rely on when deploying these models. Developers should watch out for ineffective input validations and overly trusting model safeguards, ensuring they thoroughly test against manipulative inputs.

References:

Configuration

Example

Example configuration:

---
security_tests:
  llm_jail_break:
    assets_allowed:
    - REST
    - GRAPHQL
    - WEBAPP
    skip: false

Reference

assets_allowed

Type : List[AssetType]*

List of assets that this check will cover.

skip

Type : boolean

Skip the test if true.