Commit d4f083a

README update and reshuffle
1 parent 5971542 commit d4f083a

1 file changed

+108
-110
lines changed

README.md

Lines changed: 108 additions & 110 deletions
@@ -18,7 +18,7 @@ Search collected SBOMs by PURL, cache them for offline analysis, sync malware se
1818
- Reason tracing: every search match shows which query matched; every malware match shows which advisory triggered it
1919
- Interactive REPL for ad‑hoc PURL queries (history, graceful Ctrl+C handling)
2020
- Optional progress bar while fetching SBOMs
21-
- Option to suppress secondary rate limit warnings, and full quiet mode to suppress non-error console output while retaining progress bar, human readable results and machine-readable JSON
21+
- Option to suppress secondary rate limit warnings, and full quiet mode to suppress informative messages
2222
- Intelligent skip logic: if the repository was pushed to, but the default branch head commit date isn't newer than the prior SBOM retrieval, the existing cached SBOM is reused
2323
- Adaptive backoff: each secondary rate limit hit increases the SBOM fetch delay by 10% to reduce future throttling
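
The skip logic in the list above can be pictured as a simple date comparison. The following is a minimal sketch, assuming ISO 8601 timestamps; `RepoState`, `shouldReuseCachedSbom` and the field names are hypothetical and are not taken from the tool's source:

```typescript
// Hypothetical inputs; the real tool's field names may differ.
interface RepoState {
  headCommitDate: string;      // default-branch head commit date (ISO 8601)
  lastSbomRetrieval?: string;  // when the cached SBOM was fetched, if ever
}

// Reuse the cached SBOM when the head commit is not newer than the last retrieval.
function shouldReuseCachedSbom(state: RepoState): boolean {
  if (state.lastSbomRetrieval === undefined) return false; // nothing cached yet
  const head = Date.parse(state.headCommitDate);
  const fetched = Date.parse(state.lastSbomRetrieval);
  // Assumption: unparseable dates fall back to a fresh fetch.
  if (isNaN(head) || isNaN(fetched)) return false;
  return head <= fetched;
}
```

In this sketch a repository whose head commit predates the last retrieval is skipped, even if its push timestamp changed for another reason (for example, a push to a non-default branch).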
2424

@@ -51,6 +51,56 @@ Using GitHub Enterprise Server:
5151
npm run start -- --sync-sboms --enterprise ent --base-url https://github.internal/api/v3 --sbom-cache sboms --token $GHES_TOKEN
5252
```
5353

54+
### Argument Reference
55+
56+
| Arg | Purpose |
57+
|------|---------|
58+
| `--sbom-cache <dir>` | Directory holding per-repo SBOM JSON files (required for offline mode; used as write target when syncing) |
59+
| `--sync-sboms` | Perform API calls to (re)collect SBOMs; without it the CLI runs offline loading cached SBOMs. Requires a GitHub token |
60+
| `--enterprise <slug>` / `--org <login>` | Scope selection (mutually exclusive when syncing) |
61+
| `--purl <purl>` | Add a PURL/range/wildcard query (repeatable) |
62+
| `--purl-file <file>` | File with one query per line |
63+
| `--json` | Emit search JSON to stdout (unless overridden by `--output-file`) |
64+
| `--cli` | Also emit human-readable output when producing JSON (requires `--output-file`) |
65+
| `--output-file <file>` | Write search JSON payload to file; required when using both `--json` and `--cli` |
66+
| `--interactive` | Enter interactive search prompt after initial processing |
67+
| `--sync-malware` | Fetch & cache malware advisories (MALWARE classification). Requires a GitHub token |
68+
| `--match-malware` | Match current SBOM set against cached advisories |
69+
| `--malware-cache <dir>` | Advisory cache directory (required with malware operations) |
70+
| `--sarif-dir <dir>` | Write SARIF 2.1.0 files per repository (with malware matches) |
71+
| `--upload-sarif` | Upload generated SARIF to Code Scanning (requires --match-malware & --sarif-dir and a GitHub token) |
72+
| `--concurrency <n>` | Parallel SBOM fetches (default 5) |
73+
| `--sbom-delay <ms>` | Delay between SBOM fetch (dependency-graph/sbom) requests (default 5000) |
74+
| `--light-delay <ms>` | Delay between lightweight metadata calls (listing repos, commit head checks) (default 500) |
75+
| `--base-url <url>` | GitHub Enterprise Server REST base URL (ends with /api/v3) |
76+
| `--progress` | Show a dynamic progress bar during SBOM collection |
77+
| `--suppress-secondary-rate-limit-logs` | Hide secondary rate limit warning lines (automatically applied with `--progress`) |
78+
| `--quiet` | Suppress all non-error and non-result output (progress bar, JSON and human readable output still show) |
79+
80+
### Supplying PURL Queries from a File
81+
82+
Provide a file containing one or more PURL (or PURL + semver range) queries, one per line. Blank lines and lines starting with `#` are ignored.
83+
84+
Example file `queries.txt`:
85+
86+
```text
87+
# Exact PURL
88+
89+
90+
# Version range (semver caret)
91+
pkg:npm/chalk@^5.0.0
92+
93+
# Version range (inequalities)
94+
pkg:npm/chalk@>=5.0.0 <6.0.0
95+
96+
```
97+
98+
Run with, for example, cached (offline) SBOMs:
99+
100+
```bash
101+
npm run start -- --sbom-cache sboms --purl-file queries.txt
102+
```
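
The file format described above (one query per line, blank lines and `#` comments ignored) could be read with something like the following. This is a sketch only: `parseQueryLines` is a hypothetical name, and trimming surrounding whitespace is an assumption that the README does not state.

```typescript
// Keep only query lines: drop blank lines and lines starting with '#'.
// Assumption: leading/trailing whitespace on each line is not significant.
function parseQueryLines(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.charAt(0) !== "#");
}
```

Note that a range query such as `pkg:npm/chalk@>=5.0.0 <6.0.0` contains a space, so each line is kept whole rather than split on whitespace.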
103+
54104
### SBOM Caching Workflow
55105

56106
1. First collection (populates cache progressively as it runs):
@@ -93,69 +143,74 @@ npm run start -- --sbom-cache sboms --malware-cache malware-cache --match-malwar
93143

94144
If you also perform a search in the same invocation (add `--purl` or `--purl-file`), the JSON file will contain both `malwareMatches` and `search` top-level keys.
95145

96-
### SARIF Output & Code Scanning Upload
146+
### Progress bar & log noise suppression
97147

98-
Generate SARIF 2.1.0 files (one per repository with matches) for malware findings:
148+
When collecting a large number of SBOMs you can enable a lightweight progress bar:
99149

100150
```bash
101-
npm run start -- --sbom-cache sboms --malware-cache malware-cache --match-malware --sarif-dir sarif-out
151+
npm run start -- --sync-sboms --org my-org --sbom-cache sboms --progress
102152
```
103153

104-
Each file is named `<owner>_<repo>.sarif` and contains rules (one per advisory GHSA) and results (one per matched package).
154+
With `--progress`, secondary rate limit warnings (which can visually disrupt the bar) are silenced automatically.
105155

106-
Upload those SARIF files to GitHub Code Scanning (creates alerts in each affected repository):
107-
108-
```bash
109-
npm run start -- --sbom-cache sboms --malware-cache malware-cache \
110-
--match-malware --sarif-dir sarif-out --upload-sarif --token $GITHUB_TOKEN
111-
```
156+
Behaviour details:
112157

113-
Notes:
158+
- The bar shows overall completion across all organizations (if using `--enterprise`) once repository counts are enumerated
159+
- Rendering is throttled (~12 fps) to avoid excessive stdout writes
160+
- Standard error messages (e.g., hard failures) still appear
161+
- Suppression only hides the secondary rate-limit informational warnings; primary rate limit retries still log once
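
The render throttling mentioned above can be sketched as a wrapper that drops redraws arriving too soon after the previous one. `makeThrottledRender` and its parameters are hypothetical; the tool's actual progress bar code may be structured differently.

```typescript
// Wraps a render callback so it runs at most `fps` times per second.
// `now` is injectable so the behaviour can be checked without real timers.
function makeThrottledRender(
  render: () => void,
  fps: number,
  now: () => number = Date.now
): () => void {
  const minIntervalMs = 1000 / fps; // about 83 ms at 12 fps
  let lastRenderAt = -Infinity;     // guarantees the first call renders
  return () => {
    const t = now();
    if (t - lastRenderAt >= minIntervalMs) {
      lastRenderAt = t;
      render();
    }
  };
}
```

Calls that arrive inside the interval are dropped rather than queued, so a real implementation would likely force one final render when collection completes.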
114162

115-
- `--upload-sarif` requires `--sarif-dir` and `--match-malware`.
116-
- A token with `security_events` (and appropriate repo/org scope) is required for uploads.
117-
- The tool attempts to resolve the default branch commit SHA for each repo; if it cannot, that repo's upload is skipped.
118-
- SARIF upload merges are handled by GitHub; repeated uploads for the same commit replace earlier results for the same tool.
163+
To reduce general log noise, use `--quiet` to suppress non-error console output while keeping the progress bar, human-readable results and machine-readable JSON. To hide only the secondary rate limit warnings, use `--suppress-secondary-rate-limit-logs` on its own.
119164

120-
### Progress Bar & Log Noise Suppression
165+
### Output modes
121166

122-
When collecting a large number of SBOMs you can enable a lightweight progress bar:
167+
JSON only to stdout:
123168

124169
```bash
125-
npm run start -- --sync-sboms --org my-org --sbom-cache sboms --progress
170+
npm run start -- --sbom-cache sboms --purl pkg:npm/[email protected] --json
126171
```
127172

128-
If you routinely encounter secondary rate limit warnings (which can visually disrupt the bar) you can silence those specific warnings:
173+
Human + JSON (JSON written to file; stdout remains readable):
129174

130175
```bash
131-
npm run start -- --sync-sboms --org my-org --sbom-cache sboms --progress --suppress-secondary-rate-limit-logs
176+
npm run start -- --sbom-cache sboms --purl pkg:npm/[email protected] \
177+
--json --cli --output-file search-results.json
132178
```
133179

134-
Behaviour details:
180+
If you specify `--cli --json`, you must also supply `--output-file`, so that human-readable text and JSON are not interleaved on stdout.
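
The flag rule above amounts to a small validation step. A minimal sketch, assuming a hypothetical `OutputOptions` shape that mirrors the `--json`, `--cli` and `--output-file` flags; the error wording is illustrative and not the tool's actual message:

```typescript
// Hypothetical option shape mirroring the documented flags.
interface OutputOptions {
  json: boolean;
  cli: boolean;
  outputFile?: string;
}

// Returns an error message for an invalid combination, or null when valid.
function validateOutputOptions(opts: OutputOptions): string | null {
  if (opts.json && opts.cli && !opts.outputFile) {
    return "--cli with --json requires --output-file";
  }
  return null;
}
```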
181+
182+
Output lines and JSON output append a reason context:
135183

136-
- The bar shows overall completion across all organizations (if using `--enterprise`) once repository counts are enumerated.
137-
- Rendering is throttled (~12 fps) to avoid excessive stdout writes.
138-
- Standard error messages (e.g., hard failures) still appear.
139-
- Suppression only hides the secondary rate-limit informational warnings; primary rate limit retries still log once.
184+
- Search matches: `{query: <original query string>}`
185+
- Malware matches: `{advisory: <GHSA-ID>}`
140186

141-
### Output Modes (Search Results)
187+
This makes it clear which input (user query or specific advisory) caused each result.
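
As an illustration of the reason context described above, the sketch below appends it to an output line. The `Reason` type, the function names and the layout of the line before the braces are assumptions; only the `{query: ...}` and `{advisory: ...}` forms come from this README.

```typescript
// A result is explained either by the user's query or by a malware advisory.
type Reason = { query: string } | { advisory: string };

// Renders the reason in the documented brace form.
function formatReason(reason: Reason): string {
  return "query" in reason
    ? `{query: ${reason.query}}`
    : `{advisory: ${reason.advisory}}`;
}

// Appends the reason context to a human-readable output line.
function withReason(line: string, reason: Reason): string {
  return `${line} ${formatReason(reason)}`;
}
```

For example, `withReason("pkg:npm/chalk@5.3.0", { query: "pkg:npm/chalk@^5.0.0" })` yields `pkg:npm/chalk@5.3.0 {query: pkg:npm/chalk@^5.0.0}`.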
142188

143-
JSON only to stdout:
189+
#### SARIF Output & Code Scanning Upload
190+
191+
Generate SARIF 2.1.0 files (one per repository with matches) for malware matches:
144192

145193
```bash
146-
npm run start -- --sbom-cache sboms --purl pkg:npm/[email protected] --json
194+
npm run start -- --sbom-cache sboms --malware-cache malware-cache --match-malware --sarif-dir sarif-out
147195
```
148196

149-
Human + JSON (JSON written to file; stdout remains readable):
197+
Each file is named `<owner>_<repo>.sarif` and contains rules (one per advisory GHSA) and results (one per matched package).
198+
199+
Upload those SARIF files to GitHub Code Scanning (creates alerts in each affected repository):
150200

151201
```bash
152-
npm run start -- --sbom-cache sboms --purl pkg:npm/[email protected] \
153-
--json --cli --output-file search-results.json
202+
npm run start -- --sbom-cache sboms --malware-cache malware-cache \
203+
--match-malware --sarif-dir sarif-out --upload-sarif --token $GITHUB_TOKEN
154204
```
155205

156-
If you specify `--cli --json`, you must also supply `--output-file` to avoid corrupted mixed stdout.
206+
Notes:
207+
208+
- `--upload-sarif` requires `--sarif-dir` and `--match-malware`
209+
- A token with `security_events` (and appropriate repo/org scope) is required for uploads
210+
- The tool attempts to resolve the default branch commit SHA for each repo; if it cannot, that repo's upload is skipped
211+
- SARIF upload merges are handled by GitHub; repeated uploads for the same commit replace earlier results for the same tool
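
To make the file naming and the rules/results split concrete, here is a rough sketch of how matches could map onto a SARIF 2.1.0 document. `MalwareMatch`, `sarifFileName` and `buildSarif` are hypothetical names, and the output omits fields a real upload may need (such as `$schema` and result locations).

```typescript
// Hypothetical match record; the tool's internal shape may differ.
interface MalwareMatch {
  advisoryId: string; // GHSA ID of the advisory that triggered the match
  purl: string;       // matched package
}

// File name convention from this README: <owner>_<repo>.sarif
function sarifFileName(owner: string, repo: string): string {
  return `${owner}_${repo}.sarif`;
}

// One rule per distinct advisory, one result per matched package.
function buildSarif(toolName: string, matches: MalwareMatch[]) {
  const ids = matches.map((m) => m.advisoryId);
  const uniqueIds = ids.filter((id, i) => ids.indexOf(id) === i);
  return {
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: toolName, rules: uniqueIds.map((id) => ({ id })) } },
        results: matches.map((m) => ({
          ruleId: m.advisoryId,
          message: { text: `Malware advisory ${m.advisoryId} matches ${m.purl}` },
        })),
      },
    ],
  };
}
```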
157212

158-
### Interactive Mode
213+
### Interactive mode
159214

160215
Enter an interactive prompt (arrow key history, Ctrl+C handling) after initial collection/load:
161216

@@ -165,7 +220,16 @@ npm run start -- --sbom-cache sboms --interactive
165220

166221
Then type one PURL query per line. Entering a blank line or using Ctrl+C on a blank line exits. Ctrl+C on a non-blank line clears the line.
167222

168-
### Offline Fixture Test
223+
## Build & test
224+
225+
### Build
226+
227+
```bash
228+
npm install
229+
npm run build
230+
```
231+
232+
### Test
169233

170234
The repo ships with a minimal test fixture to validate end-to-end malware matching without making network calls.
171235

@@ -195,89 +259,23 @@ Alternatively, you can exercise the CLI purely offline using the fixtures (no to
195259
npm run start -- --sbom-cache fixtures/sboms --malware-cache fixtures/malware-cache --match-malware
196260
```
197261

198-
## Build
199-
200-
```bash
201-
npm install
202-
npm run build
203-
```
204-
205-
## Notes
206-
207-
### Supplying PURL Queries from a File
208-
209-
Provide a file containing one or more PURL (or PURL + semver range) queries, one per line. Blank lines and lines starting with `#` are ignored.
210-
211-
Example file `queries.txt`:
212-
213-
```text
214-
# Exact PURL
215-
216-
217-
# Version range (semver caret)
218-
pkg:npm/chalk@^5.0.0
219-
220-
# Version range (inequalities)
221-
pkg:npm/chalk@>=5.0.0 <6.0.0
222-
223-
```
224-
225-
Run with (offline):
226-
227-
```bash
228-
npm run start -- --sbom-cache sboms --purl-file queries.txt
229-
```
230-
231-
Or (fresh sync + file-based queries):
232-
233-
```bash
234-
npm run start -- --sync-sboms --org my-org --sbom-cache sboms --purl-file queries.txt
235-
```
236-
237-
### Argument Reference
238-
239-
| Arg | Purpose |
240-
|------|---------|
241-
| `--sbom-cache <dir>` | Directory holding per-repo SBOM JSON files (required for offline mode; used as write target when syncing) |
242-
| `--sync-sboms` | Perform API calls to (re)collect SBOMs; without it the CLI runs offline loading cached SBOMs. Requires a GitHub token |
243-
| `--enterprise <slug>` / `--org <login>` | Scope selection (mutually exclusive when syncing) |
244-
| `--purl <purl>` | Add a PURL/range/wildcard query (repeatable) |
245-
| `--purl-file <file>` | File with one query per line |
246-
| `--json` | Emit search JSON to stdout (unless overridden by `--output-file`) |
247-
| `--cli` | Also emit human-readable output when producing JSON (requires `--output-file`) |
248-
| `--output-file <file>` | Write search JSON payload to file; required when using both `--json` and `--cli` |
249-
| `--interactive` | Enter interactive search prompt after initial processing |
250-
| `--sync-malware` | Fetch & cache malware advisories (MALWARE classification). Requires a GitHub token |
251-
| `--match-malware` | Match current SBOM set against cached advisories |
252-
| `--malware-cache <dir>` | Advisory cache directory (required with malware operations) |
253-
| `--sarif-dir <dir>` | Write SARIF 2.1.0 files per repository (with malware matches) |
254-
| `--upload-sarif` | Upload generated SARIF to Code Scanning (requires --match-malware & --sarif-dir and a GitHub token) |
255-
| `--concurrency <n>` | Parallel SBOM fetches (default 5) |
256-
| `--sbom-delay <ms>` | Delay between SBOM fetch (dependency-graph/sbom) requests (default 5000) |
257-
| `--light-delay <ms>` | Delay between lightweight metadata calls (listing repos, commit head checks) (default 500) |
258-
| `--base-url <url>` | GitHub Enterprise Server REST base URL (ends with /api/v3) |
259-
| `--progress` | Show a dynamic progress bar during SBOM collection |
260-
| `--suppress-secondary-rate-limit-logs` | Hide secondary rate limit warning lines (useful with `--progress`) |
261-
| `--quiet` | Suppress all non-error and non-result output (progress bar, JSON and human readable output still show) |
262+
## Authentication and Rate Limiting
262263

263-
### Reason Tracing
264+
### Rate Limiting & Retries
264265

265-
Output lines append a reason context:
266+
Standard and secondary rate limits are retried automatically (up to 2 times).
266267

267-
- Search matches: `{query: <original query string>}`
268-
- Malware matches: `{advisory: <GHSA-ID>}`
268+
You can tune concurrency and increase the delay to reduce the chance of hitting rate limits.
269269

270-
This makes it clear which input (user query or specific advisory) caused each result.
270+
Each secondary rate limit hit increases the delay between SBOM fetches by 10%, so the tool adapts to throttling as it runs.
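
The arithmetic of the adaptive delay can be sketched as follows. The function names are hypothetical, and rounding to whole milliseconds is an assumption; the README only states the 10% increase.

```typescript
// Grow the SBOM fetch delay by 10% after a secondary rate limit hit.
// Assumption: the result is rounded to whole milliseconds.
function nextSbomDelayMs(currentDelayMs: number): number {
  return Math.round(currentDelayMs * 1.1);
}

// Delay in effect after a given number of hits, starting from the initial delay.
function delayAfterHits(initialDelayMs: number, hits: number): number {
  let delay = initialDelayMs;
  for (let i = 0; i < hits; i++) {
    delay = nextSbomDelayMs(delay);
  }
  return delay;
}
```

Starting from the default `--sbom-delay` of 5000 ms, successive hits in this sketch give 5500, 6050 and 6655 ms. The growth compounds, so a long run with many hits slows down noticeably.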
271271

272-
### Rate Limiting & Retries
272+
### Authentication
273273

274-
- Standard & secondary rate limits automatically retried (up to 2 times)
275-
- You can tune concurrency and increase the delay to reduce the chance of hitting rate limits
274+
A GitHub token with appropriate scope is required when performing network operations such as `--sync-sboms`, `--sync-malware` and `--upload-sarif`.
276275

277-
### Authentication Notes
276+
The token can be provided in the `GITHUB_TOKEN` environment variable or with the `--token` argument.
278277

279-
- A GitHub token is required when performing network operations such as `--sync-sboms`, `--sync-malware` and `--upload-sarif`
280-
- Offline operations (pure searches, matches using pre-cached data) need no token
278+
Offline operations (pure searches, matches using pre-cached data) need no token.
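
The token rules above can be summarised in a short check. This is a sketch under assumptions: `NetworkOptions` and the function names are hypothetical, and giving `--token` precedence over `GITHUB_TOKEN` is a guess, since the README does not say which wins when both are set.

```typescript
// Hypothetical option shape mirroring the documented flags.
interface NetworkOptions {
  syncSboms: boolean;
  syncMalware: boolean;
  uploadSarif: boolean;
  token?: string; // value of --token, if given
}

// Only network operations need a token; offline searches and matches do not.
function needsToken(opts: NetworkOptions): boolean {
  return opts.syncSboms || opts.syncMalware || opts.uploadSarif;
}

// Assumption: --token takes precedence over the GITHUB_TOKEN variable.
function resolveToken(
  opts: NetworkOptions,
  env: { GITHUB_TOKEN?: string }
): string | undefined {
  return opts.token || env.GITHUB_TOKEN || undefined;
}

// Returns an error message when a required token is missing, else null.
function tokenError(
  opts: NetworkOptions,
  env: { GITHUB_TOKEN?: string }
): string | null {
  if (needsToken(opts) && !resolveToken(opts, env)) {
    return "A GitHub token is required: set GITHUB_TOKEN or pass --token";
  }
  return null;
}
```

Passing the environment in as a parameter keeps the check easy to exercise without touching the real process environment.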
281279

282280
## License
283281
