This repository was archived by the owner on Jan 22, 2026. It is now read-only.

Commit 356ae84

Add SBOM export functionality
- Introduced `git pkgs sbom` command to export dependencies as a Software Bill of Materials (SBOM).
- Supported output formats: CycloneDX (default) and SPDX, with options for JSON and XML.
- Enhanced dependency enrichment to include package URLs, versions, licenses, and integrity hashes.
- Updated documentation to reflect new SBOM command and its options.
- Modified package and version models to store supplier information and enrich from external API.
- Improved license enrichment logic to prioritize version-level licenses over package-level.
- Added tests for SBOM command, including various output formats and enrichment scenarios.
- Updated database schema to accommodate new supplier fields.
1 parent 6c4c84c commit 356ae84

18 files changed: +939 -30 lines changed

CHANGELOG.md

Lines changed: 9 additions & 1 deletion
@@ -1,9 +1,17 @@
 ## [Unreleased]
 
+## [0.9.0] - 2026-01-14
+
+- `git pkgs sbom` command to export dependencies as SPDX or CycloneDX
 - `git pkgs integrity` command to show and verify lockfile integrity hashes
+- Parse go.sum for Go module integrity hashes (no longer ignored)
+- Convert Go h1: hashes (base64) to hex for SBOM compatibility
 - `--drift` flag to detect packages with different hashes for the same version
 - Registry integrity comparison via ecosyste.ms API
-- Store integrity hashes from lockfiles in dependency_snapshots table (schema v4, run `git pkgs upgrade`)
+- Store integrity hashes from lockfiles in dependency_snapshots table
+- SBOM export includes supplier info from ecosyste.ms (owner/maintainer)
+- License commands use version-level license data when available
+- Store supplier_name and supplier_type on packages (schema v5, run `git pkgs upgrade`)
 - Update ecosystems-bibliothecary to ~> 15.3 (integrity extraction from lockfiles)
 - Update purl to >= 1.7.1 (ecosyste.ms API URL support)
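
Note on the "Convert Go h1: hashes (base64) to hex" entry: the conversion itself is small enough to sketch. The snippet below is a minimal illustration, not the code from this commit. The helper name `go_h1_to_hex` is hypothetical, and it assumes the `h1:` value is a standard Base64 encoding of a SHA-256 digest, as in go.sum.

```ruby
require "digest"

# Hypothetical helper (not taken from the commit): converts a go.sum style
# "h1:<base64>" hash into the lowercase hex form of the same bytes.
def go_h1_to_hex(h1)
  return nil unless h1.start_with?("h1:")

  # "m0" is strict Base64 (no line breaks); unpack1 raises ArgumentError
  # on malformed input, much like Base64.strict_decode64 would.
  raw = h1.delete_prefix("h1:").unpack1("m0")
  raw.unpack1("H*")
end

# Build a sample h1 value from a known digest so the round trip can be checked.
digest = Digest::SHA256.digest("example module contents")
sample = "h1:#{[digest].pack('m0')}"

puts go_h1_to_hex(sample)
puts go_h1_to_hex(sample) == Digest::SHA256.hexdigest("example module contents") # true
```

The sketch uses `String#unpack1` from core Ruby so it runs without extra gems; the commit itself adds the `base64` gem as a dependency, so the real code may well use `Base64` directly.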

Gemfile.lock

Lines changed: 11 additions & 2 deletions
@@ -7,11 +7,13 @@ GIT
 PATH
   remote: .
   specs:
-    git-pkgs (0.8.0)
+    git-pkgs (0.9.0)
+      base64
       ecosystems-bibliothecary (~> 15.3)
       purl (~> 1.7, >= 1.7.1)
       rugged (~> 1.0)
       sarif-ruby
+      sbom (~> 0.4)
       sequel (>= 5.0)
       sqlite3 (>= 2.0)
       vers (~> 1.0)
@@ -21,6 +23,7 @@ GEM
   specs:
     addressable (2.8.8)
       public_suffix (>= 2.0.2, < 8.0)
+    base64 (0.3.0)
     benchmark (0.5.0)
     bigdecimal (4.0.1)
     crack (1.0.1)
@@ -76,6 +79,10 @@ GEM
       io-console (~> 0.5)
     rexml (3.4.4)
     rugged (1.9.0)
+    sbom (0.4.1)
+      json_schemer (~> 2.0)
+      purl (~> 1.6)
+      rexml (~> 3.2)
     sequel (5.100.0)
       bigdecimal
     simplecov (0.22.0)
@@ -130,6 +137,7 @@ DEPENDENCIES
 
 CHECKSUMS
   addressable (2.8.8) sha256=7c13b8f9536cf6364c03b9d417c19986019e28f7c00ac8132da4eb0fe393b057
+  base64 (0.3.0) sha256=27337aeabad6ffae05c265c450490628ef3ebd4b67be58257393227588f5a97b
   benchmark (0.5.0) sha256=465df122341aedcb81a2a24b4d3bd19b6c67c1530713fd533f3ff034e419236c
   bigdecimal (4.0.1) sha256=8b07d3d065a9f921c80ceaea7c9d4ae596697295b584c296fe599dd0ad01c4a7
   crack (1.0.1) sha256=ff4a10390cd31d66440b7524eb1841874db86201d5b70032028553130b6d4c7e
@@ -138,7 +146,7 @@ CHECKSUMS
   docile (1.4.1) sha256=96159be799bfa73cdb721b840e9802126e4e03dfc26863db73647204c727f21e
   ecosystems-bibliothecary (15.3.0) sha256=dc3c8caa3218bf833beba9e3eb8cbeb35987f771532ff0eab87fbcdfb30ce4eb
   erb (6.0.1) sha256=28ecdd99c5472aebd5674d6061e3c6b0a45c049578b071e5a52c2a7f13c197e5
-  git-pkgs (0.8.0)
+  git-pkgs (0.9.0)
   hana (1.3.7) sha256=5425db42d651fea08859811c29d20446f16af196308162894db208cac5ce9b0d
   hashdiff (1.2.1) sha256=9c079dbc513dfc8833ab59c0c2d8f230fa28499cc5efb4b8dd276cf931457cd1
   io-console (0.8.2) sha256=d6e3ae7a7cc7574f4b8893b4fca2162e57a825b223a177b7afa236c5ef9814cc
@@ -162,6 +170,7 @@ CHECKSUMS
   rexml (3.4.4) sha256=19e0a2c3425dfbf2d4fc1189747bdb2f849b6c5e74180401b15734bc97b5d142
   rugged (1.9.0) sha256=7faaa912c5888d6e348d20fa31209b6409f1574346b1b80e309dbc7e8d63efac
   sarif-ruby (0.1.0)
+  sbom (0.4.1) sha256=0c0cb49c43048f53c7eacaa536064704747825f3ba6b607dfe481aab9c591f37
   sequel (5.100.0) sha256=cb0329b62287a01db68eead46759c14497a3fae01b174e2c41da108a9e9b4a12
   simplecov (0.22.0) sha256=fe2622c7834ff23b98066bb0a854284b2729a569ac659f82621fc22ef36213a5
   simplecov-html (0.13.2) sha256=bd0b8e54e7c2d7685927e8d6286466359b6f16b18cb0df47b508e8d73c777246

README.md

Lines changed: 14 additions & 0 deletions
@@ -335,6 +335,20 @@ git pkgs integrity --stateless # no database needed
 
 The `--drift` flag scans your history for packages where the same version has different integrity hashes, which could indicate a supply chain issue.
 
+### SBOM export
+
+Export dependencies as a Software Bill of Materials in CycloneDX or SPDX format:
+
+```bash
+git pkgs sbom # CycloneDX JSON (default)
+git pkgs sbom --type spdx # SPDX JSON
+git pkgs sbom -f xml # XML instead of JSON
+git pkgs sbom --name my-project # custom project name
+git pkgs sbom --stateless # no database needed
+```
+
+Includes package URLs (purls), versions, and licenses (fetched from registries). Use `--skip-enrichment` to omit license lookups.
+
 ### Diff between commits
 
 ```bash

docs/enrichment.md

Lines changed: 63 additions & 3 deletions
@@ -2,7 +2,7 @@
 
 Most git-pkgs commands work entirely from your git history. Your manifests and lockfiles tell us which packages you depend on, who added them, and when. But some questions require data that isn't in your repository: what's the latest version available? what license does this package use? has a security vulnerability been disclosed?
 
-The `outdated` and `licenses` commands fetch this external metadata from the [ecosyste.ms Packages API](https://packages.ecosyste.ms/), which aggregates data from npm, RubyGems, PyPI, and other registries. See also [vulns.md](vulns.md) for vulnerability scanning via OSV.
+The `outdated`, `licenses`, and `sbom` commands fetch this external metadata from the [ecosyste.ms Packages API](https://packages.ecosyste.ms/), which aggregates data from npm, RubyGems, PyPI, and other registries. See also [vulns.md](vulns.md) for vulnerability scanning via OSV.
 
 ## outdated
 
@@ -152,9 +152,68 @@ MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, Unlicense, CC0-1.0, 0BSD, WTFP
 Copyleft licenses (flagged with `--copyleft` or `--permissive`):
 GPL-2.0, GPL-3.0, LGPL-2.1, LGPL-3.0, AGPL-3.0, MPL-2.0 (and their variant identifiers)
 
+## sbom
+
+Export dependencies as a Software Bill of Materials (SBOM) in SPDX or CycloneDX format.
+
+```
+$ git pkgs sbom --type spdx
+{
+  "spdxVersion": "SPDX-2.3",
+  "name": "my-project",
+  "packages": [
+    {
+      "name": "lodash",
+      "versionInfo": "4.17.21",
+      "licenseConcluded": "MIT",
+      "externalRefs": [
+        {
+          "referenceType": "purl",
+          "referenceLocator": "pkg:npm/lodash@4.17.21"
+        }
+      ]
+    }
+  ]
+}
+```
+
+### Options
+
+```
+-t, --type=TYPE         SBOM type: cyclonedx (default) or spdx
+-f, --format=FORMAT     Output format: json (default) or xml
+-n, --name=NAME         Project name (default: repository directory name)
+-e, --ecosystem=NAME    Filter by ecosystem
+-r, --ref=REF           Git ref to export (default: HEAD)
+    --skip-enrichment   Skip fetching license data from registries
+    --stateless         Parse manifests directly without database
+```
+
+### Examples
+
+CycloneDX format:
+
+```
+$ git pkgs sbom --type cyclonedx
+```
+
+XML output:
+
+```
+$ git pkgs sbom -f xml
+```
+
+Skip license enrichment for faster output:
+
+```
+$ git pkgs sbom --skip-enrichment
+```
+
+The SBOM includes package URLs (purls), versions, licenses (from registry lookup), and integrity hashes (from lockfiles when available).
+
 ## Data Source
 
-Both commands fetch package metadata from [ecosyste.ms](https://packages.ecosyste.ms/), which aggregates data from npm, RubyGems, PyPI, Cargo, and other package registries.
+These commands fetch package metadata from [ecosyste.ms](https://packages.ecosyste.ms/), which aggregates data from npm, RubyGems, PyPI, Cargo, and other package registries.
 
 ## Caching
 
@@ -169,11 +228,12 @@ The cache stores:
 
 ## Stateless Mode
 
-Both commands support `--stateless` mode, which parses manifest files directly from git without requiring a database. This is useful in CI environments where you don't want to run `git pkgs init` first.
+All three commands support `--stateless` mode, which parses manifest files directly from git without requiring a database. This is useful in CI environments where you don't want to run `git pkgs init` first.
 
 ```
 $ git pkgs outdated --stateless
 $ git pkgs licenses --stateless --permissive
+$ git pkgs sbom --stateless
 ```
 
 In stateless mode, package metadata is fetched fresh each time and cached only in memory for the duration of the command.
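
Note on consuming the export: the SPDX sample in the new `sbom` docs section can be read back with a few lines of Ruby. This is a rough sketch that assumes the output has exactly the field names shown in that sample (`packages`, `versionInfo`, `licenseConcluded`, `externalRefs`); real output may carry more fields. The helper name `spdx_package_rows` is hypothetical.

```ruby
require "json"

# Sample document mirroring the SPDX example in docs/enrichment.md.
sbom_json = <<~JSON
  {
    "spdxVersion": "SPDX-2.3",
    "name": "my-project",
    "packages": [
      {
        "name": "lodash",
        "versionInfo": "4.17.21",
        "licenseConcluded": "MIT",
        "externalRefs": [
          { "referenceType": "purl", "referenceLocator": "pkg:npm/lodash@4.17.21" }
        ]
      }
    ]
  }
JSON

# Hypothetical helper: flattens each SPDX package into a small hash.
def spdx_package_rows(json)
  doc = JSON.parse(json)
  doc.fetch("packages", []).map do |pkg|
    # The purl lives in externalRefs; it may be absent for some packages.
    purl_ref = pkg.fetch("externalRefs", []).find { |ref| ref["referenceType"] == "purl" }
    {
      name: pkg["name"],
      version: pkg["versionInfo"],
      license: pkg["licenseConcluded"],
      purl: purl_ref && purl_ref["referenceLocator"]
    }
  end
end

rows = spdx_package_rows(sbom_json)
rows.each { |row| puts "#{row[:purl]}  #{row[:license]}" }
```

In a CI job the JSON would presumably come from something like `git pkgs sbom --type spdx --stateless` redirected to a file, rather than from an inline string.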

git-pkgs.gemspec

Lines changed: 2 additions & 0 deletions
@@ -38,4 +38,6 @@ Gem::Specification.new do |spec|
   spec.add_dependency "vers", "~> 1.0"
   spec.add_dependency "purl", "~> 1.7", ">= 1.7.1"
   spec.add_dependency "sarif-ruby"
+  spec.add_dependency "sbom", "~> 0.4"
+  spec.add_dependency "base64"
 end
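
Note on the version constraints: what `"~> 0.4"` and the compound `"~> 1.7", ">= 1.7.1"` constraint admit can be checked with RubyGems' own `Gem::Requirement`. The version numbers tested below are illustrative only; the helper name `allowed?` is hypothetical.

```ruby
require "rubygems"

# Constraints copied from the gemspec hunk above.
sbom_req = Gem::Requirement.new("~> 0.4")
purl_req = Gem::Requirement.new("~> 1.7", ">= 1.7.1")

# Hypothetical helper: true when the version string satisfies the requirement.
def allowed?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts allowed?(sbom_req, "0.4.1") # true: ~> 0.4 means >= 0.4 and < 1.0
puts allowed?(sbom_req, "1.0.0") # false
puts allowed?(purl_req, "1.7.0") # false: excluded by >= 1.7.1
puts allowed?(purl_req, "1.9.2") # true: still below 2.0
```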

lib/git/pkgs.rb

Lines changed: 1 addition & 0 deletions
@@ -50,6 +50,7 @@
 require_relative "pkgs/commands/outdated"
 require_relative "pkgs/commands/licenses"
 require_relative "pkgs/commands/integrity"
+require_relative "pkgs/commands/sbom"
 
 module Git
   module Pkgs

lib/git/pkgs/cli.rb

Lines changed: 2 additions & 1 deletion
@@ -36,7 +36,8 @@ class CLI
           "stale" => "Show dependencies that haven't been updated",
           "outdated" => "Show packages with newer versions available",
           "licenses" => "Show licenses for dependencies",
-          "integrity" => "Show and verify lockfile integrity hashes"
+          "integrity" => "Show and verify lockfile integrity hashes",
+          "sbom" => "Export dependencies as SBOM (SPDX or CycloneDX)"
         },
         "Security" => {
           "vulns" => "Scan for known vulnerabilities"

lib/git/pkgs/commands/licenses.rb

Lines changed: 55 additions & 17 deletions
@@ -118,21 +118,21 @@ def run
           end
 
           packages = deps.map do |dep|
-            purl = PurlHelper.build_purl(ecosystem: dep[:ecosystem], name: dep[:name]).to_s
+            versioned_purl = PurlHelper.build_purl(ecosystem: dep[:ecosystem], name: dep[:name], version: dep[:requirement])
+            base_purl = PurlHelper.build_purl(ecosystem: dep[:ecosystem], name: dep[:name])
             {
-              purl: purl,
+              purl: versioned_purl.to_s,
+              base_purl: base_purl.to_s,
               name: dep[:name],
               ecosystem: dep[:ecosystem],
               version: dep[:requirement],
               manifest_path: dep[:manifest_path]
             }
           end.uniq { |p| p[:purl] }
 
-          enrich_packages(packages.map { |p| p[:purl] })
+          enrich_packages(packages)
 
           packages.each do |pkg|
-            db_pkg = Models::Package.first(purl: pkg[:purl])
-            pkg[:license] = db_pkg&.license
             pkg[:violation] = check_violation(pkg[:license])
           end
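
Note on the two purl keys: the hunk above now deduplicates dependencies on the versioned purl while keeping an unversioned `base_purl` for package-level lookups. The sketch below illustrates that shape with plain strings. `simple_purl` is a hypothetical stand-in for `PurlHelper.build_purl`, which in the real project presumably also handles namespaces and encoding.

```ruby
# Hypothetical stand-in for PurlHelper.build_purl, kept deliberately naive.
def simple_purl(type:, name:, version: nil)
  base = "pkg:#{type}/#{name}"
  version ? "#{base}@#{version}" : base
end

# Illustrative dependency rows; the first two represent the same pin
# appearing in two different lockfiles.
deps = [
  { type: "gem", name: "rexml", requirement: "3.4.4" },
  { type: "gem", name: "rexml", requirement: "3.4.4" },
  { type: "gem", name: "rexml", requirement: "3.2.5" }
]

packages = deps.map do |dep|
  {
    purl: simple_purl(type: dep[:type], name: dep[:name], version: dep[:requirement]),
    base_purl: simple_purl(type: dep[:type], name: dep[:name])
  }
end.uniq { |pkg| pkg[:purl] } # one entry per distinct name and version

# Package-level metadata only needs one lookup per base purl.
base_purls = packages.map { |pkg| pkg[:base_purl] }.uniq

puts packages.map { |pkg| pkg[:purl] }.inspect
puts base_purls.inspect
```

Two versions of the same package survive the `uniq`, so each can receive its own version-level license, while the package-level lookup still happens once.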

@@ -183,9 +183,14 @@ def license_matches?(license, pattern)
           license.downcase.include?(pattern.downcase)
         end
 
-        def enrich_packages(purls)
+        def enrich_packages(packages)
+          client = EcosystemsClient.new
+
+          # Enrich package-level data (license, latest version)
+          base_purls = packages.map { |p| p[:base_purl] }.uniq
+
           packages_by_purl = {}
-          purls.each do |purl|
+          base_purls.each do |purl|
             parsed = Purl::PackageURL.parse(purl)
             ecosystem = PurlHelper::ECOSYSTEM_TO_PURL_TYPE.invert[parsed.type] || parsed.type
             pkg = Models::Package.find_or_create_by_purl(
@@ -196,19 +201,52 @@ def enrich_packages(purls)
             packages_by_purl[purl] = pkg
           end
 
-          stale_purls = packages_by_purl.select { |_, pkg| pkg.needs_enrichment? }.keys
-          return if stale_purls.empty?
+          stale_pkg_purls = packages_by_purl.select { |_, pkg| pkg.needs_enrichment? }.keys
 
-          client = EcosystemsClient.new
-          begin
-            results = Spinner.with_spinner("Fetching package metadata...") do
-              client.bulk_lookup(stale_purls)
+          if stale_pkg_purls.any?
+            begin
+              results = Spinner.with_spinner("Fetching package metadata...") do
+                client.bulk_lookup(stale_pkg_purls)
+              end
+              results.each do |purl, data|
+                packages_by_purl[purl]&.enrich_from_api(data)
+              end
+            rescue EcosystemsClient::ApiError => e
+              $stderr.puts "Warning: Could not fetch package data: #{e.message}" unless Git::Pkgs.quiet
             end
-            results.each do |purl, data|
-              packages_by_purl[purl]&.enrich_from_api(data)
+          end
+
+          # Enrich version-level data (license, integrity, published_at)
+          versions_by_purl = {}
+          packages.each do |pkg|
+            version = Models::Version.find_or_create_by_purl(
+              purl: pkg[:purl],
+              package_purl: pkg[:base_purl]
+            )
+            versions_by_purl[pkg[:purl]] = version
+          end
+
+          stale_version_purls = versions_by_purl.select { |_, v| v.needs_enrichment? }.keys
+
+          if stale_version_purls.any?
+            begin
+              Spinner.with_spinner("Fetching version metadata...") do
+                stale_version_purls.each do |purl|
+                  data = client.lookup_version(purl)
+                  versions_by_purl[purl]&.enrich_from_api(data) if data
+                end
+              end
+            rescue EcosystemsClient::ApiError => e
+              $stderr.puts "Warning: Could not fetch version data: #{e.message}" unless Git::Pkgs.quiet
             end
-          rescue EcosystemsClient::ApiError => e
-            $stderr.puts "Warning: Could not fetch package data: #{e.message}" unless Git::Pkgs.quiet
+          end
+
+          # Apply enriched data to packages - version license takes priority
+          packages.each do |pkg|
+            db_pkg = packages_by_purl[pkg[:base_purl]]
+            db_version = versions_by_purl[pkg[:purl]]
+
+            pkg[:license] = db_version&.license || db_pkg&.license
           end
         end
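
Note on license precedence: the final loop picks the version-level license when present and falls back to the package-level one. The sketch below isolates that expression with stand-in structs. `VersionRecord`, `PackageRecord`, and `resolve_license` are hypothetical names; the real code uses the project's `Models::Version` and `Models::Package`.

```ruby
# Stand-ins for the database models; only the license attribute matters here.
VersionRecord = Struct.new(:license)
PackageRecord = Struct.new(:license)

# Same expression as in the hunk above: safe navigation plus || fallback.
def resolve_license(db_version, db_pkg)
  db_version&.license || db_pkg&.license
end

p resolve_license(VersionRecord.new("Apache-2.0"), PackageRecord.new("MIT")) # "Apache-2.0"
p resolve_license(VersionRecord.new(nil), PackageRecord.new("MIT"))          # "MIT"
p resolve_license(nil, PackageRecord.new("MIT"))                             # "MIT"
p resolve_license(nil, nil)                                                  # nil

# An empty string is truthy in Ruby, so it would win over the package license.
p resolve_license(VersionRecord.new(""), PackageRecord.new("MIT"))           # ""
```

The last case only matters if a version record can hold an empty license string. Whether the models ever store one is not visible in this diff, so treat it as something to check rather than a known bug.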
