API Discovery from Code and Automated Schema Generation
Escape reconstructs API schemas by parsing the Abstract Syntax Tree (AST) of both frontend and backend source code. This enables accurate reconstruction of API structures, endpoints, and expected parameters, which is particularly beneficial for REST APIs with OpenAPI specifications.
Beyond detecting API endpoints from code and generating API schemas, Escape continuously monitors the API schema and detects changes and new versions over time. This allows teams to track API evolution and ensure that all changes are documented and understood, reducing the risk of inconsistencies or integration issues.
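To make the idea of change detection concrete, here is a minimal sketch that compares two versions of an OpenAPI document and reports added and removed operations. It is an illustration only, not Escape's implementation: the helper names `operations` and `diff_schemas` and the sample schemas are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# HTTP methods that OpenAPI allows as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations(schema: dict) -> Set[Tuple[str, str]]:
    """Return the set of (METHOD, path) pairs declared under `paths`."""
    ops = set()
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as `parameters`.
            if method.lower() in HTTP_METHODS:
                ops.add((method.upper(), path))
    return ops


def diff_schemas(old: dict, new: dict) -> Dict[str, List[Tuple[str, str]]]:
    """Report operations that appeared or disappeared between two versions."""
    old_ops, new_ops = operations(old), operations(new)
    return {
        "added": sorted(new_ops - old_ops),
        "removed": sorted(old_ops - new_ops),
    }


# Hypothetical schema versions used only for illustration.
v1 = {"paths": {"/users": {"get": {}}, "/users/{id}": {"get": {}, "delete": {}}}}
v2 = {"paths": {"/users": {"get": {}, "post": {}}, "/users/{id}": {"get": {}}}}

changes = diff_schemas(v1, v2)
print(changes)
```

A fuller comparison would also look at parameters, request bodies, and response shapes; this sketch stops at the operation level.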
How It Works
API Discovery and Automated Schema Generation utilize proprietary technology leveraging Graph Theory and Generative AI. The process consists of three main steps:
- Creation of a single-file Abstract Syntax Tree (AST) representation of the entire codebase by resolving imports and dependencies. This enables detection of API calls, including those with variables and parameters that may contain sensitive data or secrets.
- Analysis by a pre-trained open-source LLM to reconstruct an accurate representation of the API Schema behind the frontend.
- Linking of generated API Schemas with discovered API Services in the overall flow, using API Endpoints from the Schema as inputs for brute-forcing in the Escape API Inventory.
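The first step can be illustrated with a small sketch. Escape's pipeline is proprietary and works on frontend and backend code in general; the example below is only a toy analogue that uses Python's standard `ast` module on a Python snippet. The sample source, the `HTTP_VERBS` set, and the helpers `render_url` and `find_api_calls` are all assumptions made for illustration.

```python
import ast

HTTP_VERBS = {"get", "post", "put", "patch", "delete"}

# Hypothetical client code to analyse.
SOURCE = '''
import requests

BASE = "https://api.example.com"

def load_user(user_id, token):
    return requests.get(f"{BASE}/users/{user_id}", headers={"Authorization": token})

def create_user(payload):
    return requests.post(BASE + "/users", json=payload)
'''


def render_url(node: ast.expr, constants: dict) -> str:
    """Turn a URL expression into a template; unknown names become {name}."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    if isinstance(node, ast.Name):
        return constants.get(node.id, "{" + node.id + "}")
    if isinstance(node, ast.JoinedStr):  # f-string
        return "".join(
            render_url(v.value if isinstance(v, ast.FormattedValue) else v, constants)
            for v in node.values
        )
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return render_url(node.left, constants) + render_url(node.right, constants)
    return "{?}"


def find_api_calls(source: str):
    """Return (METHOD, url_template, keyword_argument_names) for each HTTP-like call."""
    tree = ast.parse(source)
    # Resolve module-level string constants, a tiny stand-in for dependency resolution.
    constants = {
        stmt.targets[0].id: stmt.value.value
        for stmt in tree.body
        if isinstance(stmt, ast.Assign)
        and len(stmt.targets) == 1
        and isinstance(stmt.targets[0], ast.Name)
        and isinstance(stmt.value, ast.Constant)
        and isinstance(stmt.value.value, str)
    }
    calls = []
    for node in ast.walk(tree):
        # Simplification: any `<obj>.get(...)`-style call counts; a real
        # analysis would also check what `<obj>` is.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in HTTP_VERBS
            and node.args
        ):
            params = [kw.arg for kw in node.keywords if kw.arg]
            calls.append((node.func.attr.upper(), render_url(node.args[0], constants), params))
    return sorted(calls)


api_calls = find_api_calls(SOURCE)
for method, url, params in api_calls:
    print(method, url, params)
```

The extracted method, URL template, and parameter names are the kind of raw material that a later step could turn into schema paths and operations.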
Figure: Schema of the API Discovery from Code and Schema Generation
Inputs
To get started, API Discovery from Code and Automated API Schema Generation require only a single domain name. In that case, Escape must first preprocess the publicly exposed frontends and SPAs, which are often bundled and minified (for example, with Webpack), so that it can parse their AST and resolve their dependencies properly.
In addition, Git integrations extend Escape’s capabilities by providing access to raw frontend and backend code. This access is especially useful for building a complete view of an API that is still under development or only partially exposed to the Internet.
Exposed Secrets and Sensitive Data Detection
With access to the AST of publicly exposed frontends, Escape can examine the context in which a potential secret is found, taking into account the surrounding code, including API calls, data flow, and access patterns.
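As a rough illustration of why AST context matters, the sketch below flags secret-looking string literals and records whether each one is sent as part of an HTTP call or merely sits in the code. It is not Escape's detection logic: the `tok_` pattern, the sample source, and the function names are hypothetical, and Python's `ast` module stands in for a frontend parser.

```python
import ast
import re

HTTP_VERBS = {"get", "post", "put", "patch", "delete"}
# Hypothetical token format, chosen only for this example.
SECRET_PATTERN = re.compile(r"tok_[A-Za-z0-9]{16,}")

SOURCE = '''
import requests

DOCS_HINT = "tok_0000000000000000"

def charge(amount):
    return requests.post(
        "https://api.example.com/charges",
        headers={"Authorization": "Bearer tok_9f8e7d6c5b4a39281706"},
        json={"amount": amount},
    )
'''


def is_secret(node: ast.AST) -> bool:
    """True for string literals that contain a secret-looking token."""
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, str)
        and SECRET_PATTERN.search(node.value) is not None
    )


def find_secrets_with_context(source: str):
    tree = ast.parse(source)
    findings, seen = [], set()
    # Pass 1: secrets that appear anywhere inside the arguments of an HTTP call.
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in HTTP_VERBS
        ):
            for child in ast.walk(node):
                if is_secret(child) and id(child) not in seen:
                    seen.add(id(child))
                    findings.append(
                        {"line": child.lineno,
                         "context": f"sent in {node.func.attr.upper()} call"}
                    )
    # Pass 2: remaining secret-looking literals with no call around them.
    for node in ast.walk(tree):
        if is_secret(node) and id(node) not in seen:
            findings.append({"line": node.lineno, "context": "standalone literal"})
    return findings


findings = find_secrets_with_context(SOURCE)
for finding in findings:
    print(finding)
```

A pattern match alone would treat both literals the same; the surrounding call is what suggests that one of them is actually used as a credential.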
Application in DAST
Once generated, these schemas are seamlessly integrated into the DAST process. Users can initiate dynamic application security testing with a simple click, employing the latest schema versions to ensure thorough and accurate testing coverage.
How do we connect the generated OpenAPI Schema to the API Service found in our inventory?
While repositories may provide all of an API’s endpoints, they might not include the base URL. The key step is using a feedback loop: after generating endpoints from the OpenAPI paths, we feed them back into the initial stage of the Escape Inventory process. This helps us uncover additional endpoints we might have missed earlier. We then compare these Endpoints to the Services that our technology has already detected. If many endpoints from the OpenAPI Schema match those of a discovered API Service, it’s likely that the two should be linked.
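The matching step described above can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names, the sample services, and the `threshold` value of 0.6 are hypothetical and do not describe how Escape actually scores candidates.

```python
import re
from typing import Dict, List, Optional, Tuple


def path_regex(template: str) -> "re.Pattern[str]":
    """Compile an OpenAPI path template such as /users/{id} into a regex."""
    literal_parts = re.split(r"\{[^/}]+\}", template)
    body = "[^/]+".join(re.escape(part) for part in literal_parts)
    return re.compile("^" + body + "/?$")


def match_ratio(schema_paths: List[str], observed_paths: List[str]) -> float:
    """Fraction of schema paths matching at least one path seen on a service."""
    if not schema_paths:
        return 0.0
    patterns = [path_regex(p) for p in schema_paths]
    hits = sum(any(rx.match(o) for o in observed_paths) for rx in patterns)
    return hits / len(patterns)


def best_service(
    schema_paths: List[str],
    services: Dict[str, List[str]],
    threshold: float = 0.6,  # assumed cut-off, for illustration only
) -> Tuple[Optional[str], float]:
    """Pick the service whose observed endpoints best match the schema."""
    if not services:
        return None, 0.0
    scored = {name: match_ratio(schema_paths, paths) for name, paths in services.items()}
    name, score = max(scored.items(), key=lambda kv: kv[1])
    return (name, score) if score >= threshold else (None, score)


# Hypothetical inputs: paths from a generated schema and endpoints
# already observed on two discovered services.
schema_paths = ["/users", "/users/{id}", "/orders/{orderId}/items"]
services = {
    "https://api.example.com": ["/users", "/users/42", "/orders/7/items", "/health"],
    "https://blog.example.com": ["/posts", "/users"],
}

linked, score = best_service(schema_paths, services)
print(linked, score)
```

Here every schema path matches an endpoint on the first service, while only one of three matches on the second, so the schema is linked to the first.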
Examples
Here is an example of an OpenAPI specification generated fully automatically by the API Schema Generation feature.
- Repository: https://github.com/veracode/verademo
- Generated OpenAPI Schema: https://drive.google.com/file/d/13Axc4kN6ihVj9d_slHMx7FKU_6WfxmVw/
Handling GraphQL APIs
For GraphQL APIs, where the schema is inherently defined and exposed by the nature of the technology, Escape ensures that the schema is always current and reflects the latest API structure. This is crucial for maintaining the effectiveness of security and compliance checks.
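For context, a GraphQL server that has introspection enabled can describe its own schema in response to a standard introspection query. The sketch below builds such a request body and summarizes a response; it assumes introspection is enabled on the target, uses a canned sample response instead of a network call, and does not describe how Escape retrieves or refreshes schemas. The function names are hypothetical.

```python
import json

# A reduced introspection query; the full standard query requests more detail.
INTROSPECTION_QUERY = """
query {
  __schema {
    queryType { name }
    types { name kind fields { name } }
  }
}
"""


def build_request_body() -> str:
    """JSON body to POST to a GraphQL endpoint."""
    return json.dumps({"query": INTROSPECTION_QUERY})


def summarize(response: dict) -> dict:
    """Map each user-defined object type to its sorted field names."""
    types = response["data"]["__schema"]["types"]
    return {
        t["name"]: sorted(f["name"] for f in (t.get("fields") or []))
        for t in types
        # Names starting with "__" are reserved for the introspection system.
        if t["kind"] == "OBJECT" and not t["name"].startswith("__")
    }


# Canned response used instead of a live server.
sample_response = {
    "data": {
        "__schema": {
            "queryType": {"name": "Query"},
            "types": [
                {"name": "Query", "kind": "OBJECT",
                 "fields": [{"name": "user"}, {"name": "orders"}]},
                {"name": "User", "kind": "OBJECT",
                 "fields": [{"name": "id"}, {"name": "email"}]},
                {"name": "String", "kind": "SCALAR", "fields": None},
                {"name": "__Type", "kind": "OBJECT", "fields": [{"name": "name"}]},
            ],
        }
    }
}

summary = summarize(sample_response)
print(summary)
```

Comparing two such summaries taken at different times is one simple way to notice that a GraphQL schema has changed.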