codeql-summarize

Actions
CodeQL Summarize Action
v0.2.0
Latest
Verified creator

Verified

GitHub has manually verified the creator of the action as an official partner organization. For more info see About badges in GitHub Marketplace.

CodeQL Summarize

⚠️ Early project – not an official GitHub / CodeQL product ⚠️

Generate CodeQL Models-as-Data (MaD) models (sources, sinks, and summaries) from existing CodeQL databases and export them in formats suitable for:

  • Data extensions (YAML) for CodeQL packs
  • Customization libraries (.qll)
  • Bundled packs containing generated customizations
  • Raw JSON for further processing

Key Features

  • Automated download of CodeQL databases via the Code Scanning API (when a token is provided)
  • Multiple export formats: json, extensions, customizations, bundle
  • GitHub Action + GH CLI extension + direct CLI usage
  • Automatic language detection from database metadata (fallback to manual selection)
  • Caching support (skip with --disable-cache)
  • Currently supported languages: java, csharp

Supported Languages

Currently limited to the languages enforced in the code (CODEQL_LANGUAGES):

  • Java
  • C#

Requests / PRs to add more languages are welcome once the upstream model generator queries support them.
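
The language handling described above (an explicit language overrides detection, otherwise the database metadata is consulted, and the result is checked against the supported set) could be sketched roughly as follows. This is an illustration, not the project's code: the function names are hypothetical, the tuple only mirrors the list above, and it assumes the database directory contains a codeql-database.yml file with a primaryLanguage key.

```python
from pathlib import Path
from typing import Optional

# Assumption: mirrors the supported set listed above; the project's real
# CODEQL_LANGUAGES constant may be defined differently.
CODEQL_LANGUAGES = ("java", "csharp")


def detect_language(database_dir: str) -> Optional[str]:
    """Read primaryLanguage from codeql-database.yml, if the file exists."""
    metadata = Path(database_dir) / "codeql-database.yml"
    if not metadata.is_file():
        return None
    for line in metadata.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "primaryLanguage":
            return value.strip().strip("'\"").lower() or None
    return None


def resolve_language(database_dir: str, manual: Optional[str] = None) -> str:
    """Use the manual choice when given, else the detected one, then validate."""
    language = (manual.lower() if manual else None) or detect_language(database_dir)
    if language is None:
        raise ValueError("language could not be detected; pass one explicitly")
    if language not in CODEQL_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    return language
```

Calling resolve_language("/path/to/codeql-db") would rely on the metadata alone, while resolve_language("/path/to/codeql-db", "java") forces the language.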

Quick Start

1. As a GitHub Action (recommended for automation)

- name: Generate CodeQL Summaries
  uses: advanced-security/codeql-summarize@v0.2.0
  with:
    projects: ./projects.json
    token: ${{ secrets.CODEQL_SUMMARY_GENERATOR_TOKEN }}
    format: extensions
    output: ./generated

2. GitHub CLI Extension

gh extension install advanced-security/gh-codeql-summarize
gh codeql-summarize --help

Example:

gh codeql-summarize \
  --format bundle \
  --input examples/projects.json \
  --output ./examples

3. Manual / Local CLI

git clone https://github.com/advanced-security/codeql-summarize.git
cd codeql-summarize
pipenv install --dev  # or pip install -e . if a setup is added later
pipenv run python -m codeqlsummarize --help

Minimal invocation (using a local database + explicit language):

python -m codeqlsummarize \
  -db /path/to/codeql-db \
  -l java \
  -f json \
  -o ./out

Action Inputs

| Input | Description | Default |
| --- | --- | --- |
| project | Single repository (owner/name) to summarize | (none) |
| projects | Path to a JSON file mapping language to list of repositories | ./projects.json |
| language | Comma-separated language list (overrides auto-detect) | (auto) |
| format | Export format: json, extensions, customizations, bundle | extensions |
| output | Output directory (or file for certain formats) | ./ |
| repository | GitHub repository context (fallback for project) | ${{ github.repository }} |
| token | GitHub token used to download databases | ${{ github.token }} |

Notes:

  • To download CodeQL databases the token must have appropriate permissions (typically security_events:read / repo depending on visibility). A fine‑grained PAT with Code scanning read access is recommended.
  • If a database cannot be downloaded it will be skipped.
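
For orientation, a database download through the Code Scanning API boils down to one authenticated GET request. The sketch below only builds that request with the standard library and does not send it. The helper name is hypothetical and the tool's own HTTP code may look different. The endpoint path and the application/zip Accept header follow GitHub's REST API for CodeQL databases.

```python
import urllib.request

API_ROOT = "https://api.github.com"


def build_database_request(repo: str, language: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the request for a repository's CodeQL database."""
    url = f"{API_ROOT}/repos/{repo}/code-scanning/codeql/databases/{language}"
    return urllib.request.Request(
        url,
        headers={
            # Ask for the database archive rather than its JSON description.
            "Accept": "application/zip",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_database_request("ESAPI/esapi-java-legacy", "java", "<token>")
print(request.full_url)
```

A token without code scanning read access would typically get an error response from this endpoint, which is the case the second note above describes as being skipped.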

Project File Schema (projects.json)

Example (examples/projects.json):

{
  "java": ["ESAPI/esapi-java-legacy"]
}

Structure: <language> → array of <owner>/<repo> strings.
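
A quick way to catch mistakes in a projects file before running the tool is to check it against the structure above. The following is a minimal sketch. The parse_projects helper and the repository pattern are assumptions made for illustration, not part of the tool.

```python
import json
import re
from typing import Dict, List

# Assumption: a loose owner/repo pattern, good enough for a sanity check.
REPO_PATTERN = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def parse_projects(text: str) -> Dict[str, List[str]]:
    """Parse projects.json content and verify the language -> repositories shape."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top level must be an object mapping language to repositories")
    for language, repos in data.items():
        if not isinstance(repos, list):
            raise ValueError(f"{language}: expected an array of repositories")
        for repo in repos:
            if not isinstance(repo, str) or not REPO_PATTERN.match(repo):
                raise ValueError(f"{language}: invalid repository {repo!r}")
    return data


projects = parse_projects('{"java": ["ESAPI/esapi-java-legacy"]}')
print(projects["java"])
```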

Export Formats

| Format | Description | Output Shape |
| --- | --- | --- |
| json | Raw rows per model type | One JSON file per database / summary (future enhancement) |
| extensions | Data extensions YAML under a CodeQL pack structure | Writes .yml under generated/ inside the detected pack |
| customizations | Single .qll customization library aggregating models | Requires -o <file>.qll |
| bundle | Initializes / updates a CodeQL pack containing generated customizations | Creates / updates pack in output dir |

bundle will (if necessary) create a pack (e.g. java-summarize/) and generate per‑repository .qll files plus a Customizations.qll aggregator.
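
To give a feel for the extensions format, here is a rough sketch that renders model rows as a CodeQL data extension document. It is not the tool's exporter: render_extension is a hypothetical helper, the example row is invented, and the files the tool actually writes may be laid out differently. The addsTo / pack / extensible / data keys follow the general data extension format used by CodeQL packs.

```python
import json
from typing import Iterable, Sequence


def render_extension(pack: str, extensible: str, rows: Iterable[Sequence]) -> str:
    """Render rows as a data extension YAML document for one extensible predicate."""
    rendered = [json.dumps(list(row)) for row in rows]
    lines = [
        "extensions:",
        "  - addsTo:",
        f"      pack: {pack}",
        f"      extensible: {extensible}",
    ]
    if rendered:
        lines.append("    data:")
        # A JSON array is also a valid YAML flow sequence, so json.dumps suffices.
        lines += [f"      - {row}" for row in rendered]
    else:
        lines.append("    data: []")
    return "\n".join(lines) + "\n"


# Hypothetical summary row: package, type, subtypes, name, signature, ext,
# input, output, kind, provenance.
row = ["org.example", "Util", False, "wrap", "(String)", "",
       "Argument[0]", "ReturnValue", "taint", "df-generated"]
print(render_extension("codeql/java-all", "summaryModel", [row]))
```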

Environment Variables

| Variable | Purpose |
| --- | --- |
| GITHUB_TOKEN | Default token for API calls (Actions) |
| GITHUB_REPOSITORY | Default repo context (owner/name) |
| RUNNER_TEMP | Temp directory root (Actions) |
| DEBUG | If set (non-empty) enables debug logging |
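
As an illustration of how these variables might be read, the sketch below collects them into one settings object. The Settings class and load_settings function are hypothetical, and falling back to the system temp directory when RUNNER_TEMP is unset is an assumption rather than documented behavior.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    token: Optional[str]
    repository: Optional[str]
    temp_root: str
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Collect the documented variables; empty values count as unset."""
    return Settings(
        token=env.get("GITHUB_TOKEN") or None,
        repository=env.get("GITHUB_REPOSITORY") or None,
        # Assumption: use the system temp directory outside of Actions.
        temp_root=env.get("RUNNER_TEMP") or tempfile.gettempdir(),
        # Any non-empty DEBUG value turns debug logging on.
        debug=bool(env.get("DEBUG")),
    )


settings = load_settings({"GITHUB_REPOSITORY": "octo/demo", "DEBUG": "1"})
print(settings.repository, settings.debug)
```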

Exit / Error Behavior

The tool skips repositories whose databases cannot be fetched or located, logging warnings rather than stopping the entire run.
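
The skip-and-warn behavior can be pictured as a loop like the one below. This is a sketch of the pattern only: collect_databases and the fetch callable are hypothetical stand-ins for whatever the tool uses internally to download or locate a database.

```python
import logging
from typing import Callable, Dict, List, Optional, Tuple

logger = logging.getLogger("summarize-sketch")


def collect_databases(
    projects: Dict[str, List[str]],
    fetch: Callable[[str, str], Optional[str]],
) -> Tuple[List[str], List[str]]:
    """Return (database paths, skipped repositories) without aborting on failures."""
    found: List[str] = []
    skipped: List[str] = []
    for language, repos in projects.items():
        for repo in repos:
            try:
                path = fetch(repo, language)
            except Exception as err:  # a failed fetch must not stop the run
                logger.warning("fetch failed for %s (%s): %s", repo, language, err)
                path = None
            if path is None:
                logger.warning("skipping %s (%s): no database", repo, language)
                skipped.append(repo)
                continue
            found.append(path)
    return found, skipped
```

With this shape, one unreachable repository costs a warning in the log while the remaining repositories are still processed.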

Typical Workflow (Action + Extensions Format)

  1. Maintain a projects.json file listing target repositories per language.
  2. Schedule a workflow (e.g. nightly) to regenerate models.
  3. Commit or publish the generated Data Extensions / Pack as needed.
  4. Consume generated models in downstream CodeQL analysis.

Development

Run tests:

pipenv run python -m unittest -v

Lint / format:

pipenv run black .

Contributing

See CONTRIBUTING.md. Please open an issue before large changes.

Security / Reporting Issues

See SECURITY.md.

Support

See SUPPORT.md. For general questions open a GitHub issue.

Limitations / Roadmap

  • Limited language set (Java, C#)
  • No parallel download throttling handling yet
  • No direct GitHub language detection fallback implemented
  • JSON exporter minimal (subject to enhancement)

License

Licensed under the MIT License – see LICENSE.

Acknowledgements

  • @GeekMasher – Author
  • @zbazztian – Major contributor

codeql-summarize is not certified by GitHub. It is provided by a third party and is governed by separate terms of service, privacy policy, and support documentation.
