|
3 | 3 |
|
4 | 4 | <h1>CodeQL Summarize</h1>
|
5 | 5 |
|
6 |
| -:warning: **This project is in early development and is not supported by GitHub or CodeQL** :warning: |
| 6 | +:warning: <strong>Early project – not an official GitHub / CodeQL product</strong> :warning: |
7 | 7 |
|
8 | 8 | [](https://github.com/advanced-security/codeql-summarize)
|
9 | 9 | [](https://github.com/advanced-security/codeql-summarize/actions/workflows/publish.yml?query=branch%3Amain)
|
|
14 | 14 | </div>
|
15 | 15 | <!-- markdownlint-restore -->
|
16 | 16 |
|
| 17 | +Generate CodeQL Models-as-Data (MaD) models (sources, sinks, and summaries) from existing CodeQL databases and export them as: |
17 | 18 |
|
18 |
| -This is the GitHub CodeQL Summarize project and Actions which allows users to generate Models as Data (MaD) from CodeQL databases. |
| 19 | +- Data extensions (YAML) for CodeQL packs |
| 20 | +- Customization libraries (`.qll`) |
| 21 | +- Bundled packs containing generated customizations |
| 22 | +- Raw JSON for further processing |
19 | 23 |
|
20 |
| -## Run |
| 24 | +## Key Features |
21 | 25 |
|
22 |
| -### Actions |
| 26 | +- Automated download of CodeQL databases via the Code Scanning API (when a token is provided) |
| 27 | +- Multiple export formats: `json`, `extensions`, `customizations`, `bundle` |
| 28 | +- GitHub Action + GH CLI extension + direct CLI usage |
| 29 | +- Automatic language detection from database metadata (fallback to manual selection) |
| 30 | +- Caching support (skip with `--disable-cache`) |
| 31 | +- Currently supported languages: `java`, `csharp` |
23 | 32 |
|
24 |
| -The main use case for `codeqlsummarize` is to run it as an Action so the purposes of automating this process. |
| 33 | +## Supported Languages |
| 34 | + |
| 35 | +Currently limited to the languages enforced in the code (`CODEQL_LANGUAGES`): |
| 36 | + |
| 37 | +- Java |
| 38 | +- C# |
| 39 | + |
| 40 | +> Requests / PRs to add more languages are welcome once the upstream model generator queries support them. |
| 41 | +
|
| 42 | +## Quick Start |
| 43 | + |
| 44 | +### 1. As a GitHub Action (recommended for automation) |
25 | 45 |
|
26 | 46 | ```yml
|
27 | 47 | - name: Generate CodeQL Summaries
|
28 |
| - uses: advanced-security/codeql-summarize@v1 |
| 48 | + uses: advanced-security/codeql-summarize@v0.2.0 |
29 | 49 | with:
|
30 |
| - # This file defines the projects you want to make sure to get the latest and greatest |
31 |
| - # summaries from. |
32 | 50 | projects: ./projects.json
|
33 |
| - # Token needs access to download the CodeQL databases you want to create summaries for |
34 | 51 | token: ${{ secrets.CODEQL_SUMMARY_GENERATOR_TOKEN }}
|
| 52 | + format: extensions |
| 53 | + output: ./generated |
35 | 54 | ```
|
36 | 55 |
|
37 |
| -### GH CLI |
38 |
| -
|
39 |
| -You can install this tool as part of the GitHub CLI using the following commands: |
| 56 | +### 2. GitHub CLI Extension |
40 | 57 |
|
41 | 58 | ```bash
|
42 |
| -gh extensions install advanced-security/gh-codeql-summarize |
| 59 | +gh extension install advanced-security/gh-codeql-summarize |
43 | 60 | gh codeql-summarize --help
|
44 | 61 | ```
|
45 | 62 |
|
46 |
| -### Manual Command Line |
| 63 | +Example: |
47 | 64 |
|
48 | 65 | ```bash
|
49 |
| -git clone https://github.com/advanced-security/gh-codeql-summarize.git && cd gh-codeql-summarize |
50 |
| -python3 -m codeqlsummarize --help |
| 66 | +gh codeql-summarize \ |
| 67 | + --format bundle \ |
| 68 | + --input examples/projects.json \ |
| 69 | + --output ./examples |
51 | 70 | ```
|
52 | 71 |
|
53 |
| -## License |
| 72 | +### 3. Manual / Local CLI |
| 73 | + |
| 74 | +```bash |
| 75 | +git clone https://github.com/advanced-security/codeql-summarize.git |
| 76 | +cd codeql-summarize |
| 77 | +pipenv install --dev # or pip install -e . if a setup is added later |
| 78 | +pipenv run python -m codeqlsummarize --help |
| 79 | +``` |
| 80 | + |
| 81 | +Minimal invocation (using a local database + explicit language): |
| 82 | + |
| 83 | +```bash |
| 84 | +python -m codeqlsummarize \ |
| 85 | + -db /path/to/codeql-db \ |
| 86 | + -l java \ |
| 87 | + -f json \ |
| 88 | + -o ./out |
| 89 | +``` |
| 90 | + |
| 91 | +## Action Inputs |
| 92 | + |
| 93 | +| Input | Description | Default | |
| 94 | +| ------------ | --------------------------------------------------------------- | -------------------------- | |
| 95 | +| `project` | Single repository (owner/name) to summarize | (none) | |
| 96 | +| `projects` | Path to a JSON file mapping language to list of repositories | `./projects.json` | |
| 97 | +| `language` | Comma-separated language list (overrides auto-detect) | (auto) | |
| 98 | +| `format` | Export format: `json`, `extensions`, `customizations`, `bundle` | `extensions` | |
| 99 | +| `output` | Output directory (or file for certain formats) | `./` | |
| 100 | +| `repository` | GitHub repository context (fallback for `project`) | `${{ github.repository }}` | |
| 101 | +| `token` | GitHub token used to download databases | `${{ github.token }}` | |
| 102 | + |
| 103 | +Notes: |
| 104 | + |
| 105 | +- To download CodeQL databases the token must have appropriate permissions (typically `security_events:read` / `repo` depending on visibility). A fine‑grained PAT with Code scanning read access is recommended. |
| 106 | +- If a database cannot be downloaded it will be skipped. |
| 107 | + |
| 108 | +## Project File Schema (`projects.json`) |
| 109 | + |
| 110 | +Example (`examples/projects.json`): |
| 111 | + |
| 112 | +```json |
| 113 | +{ |
| 114 | + "java": ["ESAPI/esapi-java-legacy"] |
| 115 | +} |
| 116 | +``` |
| 117 | + |
| 118 | +Structure: `<language>` → array of `<owner>/<repo>` strings. |
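
As a minimal sketch of the schema described above, the following Python loads such a file and checks its shape. The helper name `load_projects` is hypothetical and is not part of `codeqlsummarize`; the rule that each entry contains exactly one `/` is an assumption drawn from the example, not a documented constraint.

```python
import json
from pathlib import Path


def load_projects(path):
    """Load a projects file mapping language -> list of 'owner/repo' strings.

    Hypothetical helper for illustration only; not part of codeqlsummarize.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("top level must be a JSON object keyed by language")
    for language, repos in data.items():
        if not isinstance(repos, list):
            raise ValueError(f"{language}: expected a list of repositories")
        for repo in repos:
            # Assumption: each entry is a string with one '/' between owner and name.
            if (
                not isinstance(repo, str)
                or repo.count("/") != 1
                or not all(repo.split("/"))
            ):
                raise ValueError(f"{language}: {repo!r} is not of the form owner/repo")
    return data
```

Running a check like this before invoking the tool can surface a malformed entry early instead of as a skipped repository later in the run.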
| 119 | + |
| 120 | +## Export Formats |
54 | 121 |
|
55 |
| -This project is licensed under the terms of the MIT open source license. Please refer to [MIT](./LICENSE.txt) for the full terms. |
| 122 | +| Format | Description | Output Shape | |
| 123 | +| ---------------- | ----------------------------------------------------------------------- | --------------------------------------------------------- | |
| 124 | +| `json` | Raw rows per model type | One JSON file per database / summary (future enhancement) | |
| 125 | +| `extensions` | Data extensions YAML under a CodeQL pack structure | Writes `.yml` under `generated/` inside the detected pack | |
| 126 | +| `customizations` | Single `.qll` customization library aggregating models | Requires `-o <file>.qll` | |
| 127 | +| `bundle` | Initializes / updates a CodeQL pack containing generated customizations | Creates / updates pack in output dir | |
56 | 128 |
|
57 |
| -## Maintainers |
| 129 | +`bundle` will (if necessary) create a pack (e.g. `java-summarize/`) and generate per‑repository `.qll` files plus a `Customizations.qll` aggregator. |
58 | 130 |
|
59 |
| -[CODEOWNERS](./.github/CODEOWNERS) file. |
| 131 | +## Environment Variables |
| 132 | + |
| 133 | +| Variable | Purpose | |
| 134 | +| ------------------- | ---------------------------------------- | |
| 135 | +| `GITHUB_TOKEN` | Default token for API calls (Actions) | |
| 136 | +| `GITHUB_REPOSITORY` | Default repo context (owner/name) | |
| 137 | +| `RUNNER_TEMP` | Temp directory root (Actions) | |
| 138 | +| `DEBUG` | If set (non-empty) enables debug logging | |
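
One way to picture how these variables could be consumed is the sketch below. It is illustrative only: `read_settings` is a hypothetical function, and the fallback to the system temp directory when `RUNNER_TEMP` is unset is an assumption, not confirmed behavior of the tool.

```python
import os
import tempfile


def read_settings(env=None):
    """Collect the environment variables from the table into a plain dict.

    Hypothetical helper; the real tool may read these values differently.
    """
    env = os.environ if env is None else env
    return {
        "token": env.get("GITHUB_TOKEN"),
        "repository": env.get("GITHUB_REPOSITORY"),
        # Assumption: outside of Actions, fall back to the system temp directory.
        "temp_dir": env.get("RUNNER_TEMP") or tempfile.gettempdir(),
        # Any non-empty value enables debug logging; unset or empty disables it.
        "debug": bool(env.get("DEBUG")),
    }
```

Passing a mapping explicitly (for example `read_settings({"DEBUG": "1"})`) keeps the function easy to exercise without touching the real process environment.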
| 139 | + |
| 140 | +## Exit / Error Behavior |
| 141 | + |
| 142 | +The tool skips repositories whose databases cannot be fetched or located, logging warnings rather than stopping the entire run. |
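
The skip-and-warn behavior described above can be sketched roughly as follows. This is not the tool's actual implementation: `summarize_all` and the `fetch_database` callable are hypothetical names used only to show the control flow.

```python
import logging

logger = logging.getLogger("summarize-sketch")


def summarize_all(repositories, fetch_database):
    """Return {repo: database} for each repository whose database was obtained.

    `fetch_database` is a hypothetical callable that returns a database path,
    or raises / returns None when the database is unavailable.
    """
    results = {}
    for repo in repositories:
        try:
            database = fetch_database(repo)
        except Exception as err:  # one failure must not stop the whole run
            logger.warning("Skipping %s: %s", repo, err)
            continue
        if database is None:
            logger.warning("Skipping %s: no database found", repo)
            continue
        results[repo] = database
    return results
```

The design choice here is that failures are reported per repository as warnings, so a nightly run over many projects still produces output for every database that could be fetched.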
| 143 | + |
| 144 | +## Typical Workflow (Action + Extensions Format) |
| 145 | + |
| 146 | +1. Maintain a `projects.json` file listing target repositories per language. |
| 147 | +2. Schedule a workflow (e.g. nightly) to regenerate models. |
| 148 | +3. Commit or publish the generated Data Extensions / Pack as needed. |
| 149 | +4. Consume generated models in downstream CodeQL analysis. |
| 150 | + |
| 151 | +## Development |
| 152 | + |
| 153 | +Run tests: |
| 154 | + |
| 155 | +```bash |
| 156 | +pipenv run python -m unittest -v |
| 157 | +``` |
| 158 | + |
| 159 | +Lint / format: |
| 160 | + |
| 161 | +```bash |
| 162 | +pipenv run black . |
| 163 | +``` |
| 164 | + |
| 165 | +## Contributing |
| 166 | + |
| 167 | +See [CONTRIBUTING.md](./CONTRIBUTING.md). Please open an issue before large changes. |
| 168 | + |
| 169 | +## Security / Reporting Issues |
| 170 | + |
| 171 | +See [SECURITY.md](./SECURITY.md). |
60 | 172 |
|
61 | 173 | ## Support
|
62 | 174 |
|
63 |
| -Please create issues for any feature requests, bugs, or documentation problems. |
| 175 | +See [SUPPORT.md](./SUPPORT.md). For general questions open a GitHub issue. |
| 176 | + |
| 177 | +## Limitations / Roadmap |
| 178 | + |
| 179 | +- Limited language set (Java, C#) |
| 180 | +- No throttling (rate-limit) handling for parallel downloads yet |
| 181 | +- No fallback to GitHub's repository language detection implemented |
| 182 | +- JSON exporter minimal (subject to enhancement) |
| 183 | + |
| 184 | +## License |
| 185 | + |
| 186 | +Licensed under the MIT License – see [LICENSE](./LICENSE). |
64 | 187 |
|
65 |
| -## Acknowledgement |
| 188 | +## Acknowledgements |
66 | 189 |
|
67 |
| -- @GeekMasher - Author |
68 |
| -- @zbazztian - Major contributor |
| 190 | +- @GeekMasher – Author |
| 191 | +- @zbazztian – Major contributor |