1 change: 1 addition & 0 deletions README.md
@@ -524,6 +524,7 @@ Git::Pkgs::Database.connect(repo_git_dir)
Git::Pkgs::Models::DependencyChange.where(name: "rails").all
```


## Contributing

Bug reports, feature requests, and pull requests are welcome. If you're unsure about a change, open an issue first to discuss it.
177 changes: 177 additions & 0 deletions docs/enrichment.md
@@ -0,0 +1,177 @@
# Package Enrichment

git-pkgs can fetch additional metadata about your dependencies from the [ecosyste.ms Packages API](https://packages.ecosyste.ms/). This powers the `outdated` and `licenses` commands.

## outdated

Show packages that have newer versions available in their registries.

```
$ git pkgs outdated
lodash 4.17.15 -> 4.17.21 (patch)
express 4.17.0 -> 4.19.2 (minor)
webpack 4.46.0 -> 5.90.3 (major)

3 outdated packages: 1 major, 1 minor, 1 patch
```

Major updates are shown in red, minor in yellow, patch in cyan.

### Options

```
-e, --ecosystem=NAME Filter by ecosystem
-r, --ref=REF Git ref to check (default: HEAD)
-f, --format=FORMAT Output format (text, json)
--major Show only major version updates
--minor Show only minor or major updates (skip patch)
--stateless Parse manifests directly without database
```

### Examples

Show only major updates:

```
$ git pkgs outdated --major
webpack 4.46.0 -> 5.90.3 (major)
```

Check a specific release:

```
$ git pkgs outdated --ref=v1.0.0
```

JSON output:

```
$ git pkgs outdated -f json
```

## licenses

Show the license of each dependency, with optional compliance checks.

```
$ git pkgs licenses
lodash MIT (npm)
express MIT (npm)
request Apache-2.0 (npm)
```

### Options

```
-e, --ecosystem=NAME Filter by ecosystem
-r, --ref=REF Git ref to check (default: HEAD)
-f, --format=FORMAT Output format (text, json, csv)
--allow=LICENSES Comma-separated list of allowed licenses
--deny=LICENSES Comma-separated list of denied licenses
--permissive Only allow permissive licenses (MIT, Apache, BSD, etc.)
--copyleft Flag copyleft licenses (GPL, AGPL, etc.)
--unknown Flag packages with unknown/missing licenses
--group Group output by license
--stateless Parse manifests directly without database
```

### Compliance Checks

Only allow permissive licenses:

```
$ git pkgs licenses --permissive
lodash MIT (npm)
express MIT (npm)
gpl-pkg GPL-3.0 (npm) [copyleft]

1 license violation found
```

Explicit allow list:

```
$ git pkgs licenses --allow=MIT,Apache-2.0
```

Deny specific licenses:

```
$ git pkgs licenses --deny=GPL-3.0,AGPL-3.0
```

Flag packages with no license information:

```
$ git pkgs licenses --unknown
```

### Output Formats

Group by license:

```
$ git pkgs licenses --group
MIT (45)
lodash
express
...

Apache-2.0 (12)
request
...
```

CSV for spreadsheets:

```
$ git pkgs licenses -f csv > licenses.csv
```

JSON for scripting:

```
$ git pkgs licenses -f json
```

### Exit Codes

The `licenses` command exits with code 1 if any violations are found, which makes it suitable for CI pipelines:

```yaml
- run: git pkgs licenses --stateless --permissive
```

### License Categories

Permissive licenses (allowed with `--permissive`):
MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, Unlicense, CC0-1.0, 0BSD, WTFPL, Zlib, BSL-1.0

Copyleft licenses (flagged with `--copyleft` or `--permissive`):
GPL-2.0, GPL-3.0, LGPL-2.1, LGPL-3.0, AGPL-3.0, MPL-2.0 (and their variant identifiers)
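
The two lists above can be read as a simple lookup. The sketch below is only an illustration: the constant and method names are hypothetical rather than taken from git-pkgs, and it assumes that "variant identifiers" refers to SPDX suffixes such as `-only`, `-or-later`, and a trailing `+`.

```ruby
require "set"

# Hypothetical constants mirroring the lists documented above.
PERMISSIVE = Set.new(%w[
  MIT Apache-2.0 BSD-2-Clause BSD-3-Clause ISC Unlicense
  CC0-1.0 0BSD WTFPL Zlib BSL-1.0
]).freeze

COPYLEFT = Set.new(%w[GPL-2.0 GPL-3.0 LGPL-2.1 LGPL-3.0 AGPL-3.0 MPL-2.0]).freeze

# Assumption: variants differ only by an -only / -or-later / + suffix.
def normalize_license(id)
  id.to_s.strip.sub(/(-only|-or-later|\+)\z/, "")
end

# Returns :permissive, :copyleft, :unknown (no license), or :other.
def license_category(id)
  base = normalize_license(id)
  return :unknown if base.empty?
  return :permissive if PERMISSIVE.include?(base)
  return :copyleft if COPYLEFT.include?(base)

  :other
end
```

Under these assumptions, `license_category("GPL-3.0-or-later")` falls into the copyleft group, while a missing license is reported as unknown rather than as a violation of either list.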

## Data Source

Both commands fetch package metadata from [ecosyste.ms](https://packages.ecosyste.ms/), which aggregates data from npm, RubyGems, PyPI, Cargo, and other package registries.

## Caching

Package metadata is cached in the `pkgs.sqlite3` database. Each package records when it was last enriched, and data older than 24 hours is treated as stale and refreshed automatically on the next query.

The cache stores:
- Latest version number
- License (SPDX identifier)
- Description
- Homepage URL
- Repository URL

## Stateless Mode

Both commands support `--stateless` mode, which parses manifest files directly from git without requiring a database. This is useful in CI environments where you don't want to run `git pkgs init` first.

```
$ git pkgs outdated --stateless
$ git pkgs licenses --stateless --permissive
```

In stateless mode, package metadata is fetched fresh each time and cached only in memory for the duration of the command.
43 changes: 40 additions & 3 deletions docs/internals.md
@@ -10,7 +10,7 @@ The executable at [`exe/git-pkgs`](../exe/git-pkgs) loads [`lib/git/pkgs.rb`](..

[`Git::Pkgs::Database`](../lib/git/pkgs/database.rb) manages the SQLite connection using [Sequel](https://sequel.jeremyevans.net/) and [sqlite3](https://github.com/sparklemotion/sqlite3-ruby). It looks for the `GIT_PKGS_DB` environment variable first, then falls back to `.git/pkgs.sqlite3`. Schema migrations are versioned through a `schema_info` table. See [schema.md](schema.md) for the full schema.

-The schema has nine tables. Six handle dependency tracking:
+The schema has ten tables. Six handle dependency tracking:

- `commits` holds commit metadata plus a flag indicating whether it changed dependencies
- `branches` tracks which branches have been analyzed and their last processed SHA
@@ -19,9 +19,10 @@ The schema has nine tables. Six handle dependency tracking:
- `dependency_changes` records every add, modify, or remove event
- `dependency_snapshots` stores full dependency state at intervals

-Three more support vulnerability scanning:
+Four more support vulnerability scanning and package enrichment:

-- `packages` tracks which packages have been synced with OSV and when
+- `packages` tracks package metadata, vulnerability sync status, and enrichment data
+- `versions` stores per-version metadata (license, published date) for time-travel queries
- `vulnerabilities` caches CVE/GHSA data fetched from OSV
- `vulnerability_packages` maps which packages are affected by each vulnerability

@@ -188,6 +189,42 @@ When scanning, git-pkgs:
6. Matches version ranges against actual versions
7. Excludes withdrawn vulnerabilities

## Package Enrichment

The [`outdated`](../lib/git/pkgs/commands/outdated.rb) and [`licenses`](../lib/git/pkgs/commands/licenses.rb) commands fetch package metadata from the [ecosyste.ms Packages API](https://packages.ecosyste.ms/).

### Ecosystems Client

[`Git::Pkgs::EcosystemsClient`](../lib/git/pkgs/ecosystems_client.rb) wraps the ecosyste.ms REST API. It uses batch lookups (`POST /api/v1/packages/lookup`) to check up to 100 packages per request. The response includes latest version, license, description, and repository URL for each package.
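
The batching can be sketched roughly as follows. This is not the actual `EcosystemsClient` code: `lookup_in_batches` and the block standing in for the HTTP request are hypothetical, and only the limit of 100 packages per request comes from the description above.

```ruby
BATCH_SIZE = 100 # per-request limit described above

# Splits purls into batches and yields each batch to the caller, who is
# expected to perform the request and return an array of results.
# Returns all results flattened into one array.
def lookup_in_batches(purls, batch_size: BATCH_SIZE)
  purls.uniq.each_slice(batch_size).flat_map do |batch|
    yield(batch)
  end
end

# Usage with a stand-in for the HTTP call (no network access):
purls = (1..250).map { |i| "pkg:npm/example-#{i}" }
requests = 0
results = lookup_in_batches(purls) do |batch|
  requests += 1
  batch.map { |purl| { "purl" => purl } } # pretend API response
end
```

Keeping the request itself behind a block makes the slicing logic easy to exercise without touching the network.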

### Enrichment Caching

Like vulnerability data, enrichment data is cached in the database. The `packages` table has an `enriched_at` timestamp. Packages are refreshed if their data is more than 24 hours old. The `Package#needs_enrichment?` method checks this threshold.

When running `outdated` or `licenses`:

1. Get dependencies at the target commit
2. Find or create package records for each purl
3. Check which packages need enrichment (never enriched or stale)
4. Batch query ecosyste.ms for those packages
5. Store the enrichment data via `Package#enrich_from_api`
6. Use the cached data for version comparison or license checking
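
Step 3 above amounts to a timestamp comparison. A minimal sketch, assuming a plain struct in place of the real Sequel model (`PackageRecord`, `STALE_AFTER`, and `stale_packages` are hypothetical names; only `needs_enrichment?`, `enriched_at`, and the 24-hour threshold come from the text):

```ruby
STALE_AFTER = 24 * 60 * 60 # seconds; the 24-hour threshold described above

# Hypothetical stand-in for the Package model.
PackageRecord = Struct.new(:purl, :enriched_at) do
  # True when the package was never enriched or its data is stale.
  def needs_enrichment?(now: Time.now)
    enriched_at.nil? || (now - enriched_at) > STALE_AFTER
  end
end

# Selects the packages that should be sent to the API.
def stale_packages(packages, now: Time.now)
  packages.select { |pkg| pkg.needs_enrichment?(now: now) }
end
```

Passing `now` explicitly is a choice made for this sketch so the threshold can be checked deterministically.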

### Version Comparison

The `outdated` command classifies updates as major, minor, or patch by comparing semver components. It handles the `v` prefix common in some ecosystems and pads partial versions (e.g., "1.2" becomes "1.2.0"). Updates are color-coded: red for major, yellow for minor, cyan for patch.
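
As an illustration of that classification, here is a minimal sketch. The method names are hypothetical and the real command may differ, for example in how it treats pre-release tags, which this version simply ignores.

```ruby
# Strips a leading "v" and pads to three numeric components,
# so "v1.2" becomes [1, 2, 0]. Non-numeric suffixes are ignored.
def parse_version(version)
  parts = version.to_s.sub(/\Av/i, "").split(".").first(3).map(&:to_i)
  parts + [0] * (3 - parts.length)
end

# Returns :major, :minor, or :patch, or nil when latest is not newer.
def update_type(current, latest)
  cur = parse_version(current)
  lat = parse_version(latest)
  return nil unless (lat <=> cur) == 1
  return :major if lat[0] > cur[0]
  return :minor if lat[1] > cur[1]

  :patch
end
```

With the versions from the `outdated` example output, `update_type("4.46.0", "5.90.3")` is classified as major and `update_type("4.17.15", "4.17.21")` as patch.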

### License Compliance

The `licenses` command checks licenses against configured policies:

- `--permissive` only allows common permissive licenses (MIT, Apache-2.0, BSD variants)
- `--copyleft` flags GPL, AGPL, and similar licenses
- `--allow` and `--deny` let you specify explicit lists
- `--unknown` flags packages with no license information

The command exits with code 1 when violations are found, making it suitable for CI pipelines.
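
The allow/deny part of that policy can be sketched as below. This is a hypothetical illustration over plain hashes, not the command's implementation; the method name, its keyword arguments, and the precedence of deny over allow are assumptions.

```ruby
# Returns the packages that violate the policy.
#   allow:        when given, any license outside this list is a violation
#   deny:         licenses that are always violations
#   flag_unknown: treat a missing license as a violation
def license_violations(packages, allow: nil, deny: [], flag_unknown: false)
  packages.select do |pkg|
    license = pkg[:license]
    if license.nil? || license.empty?
      flag_unknown
    elsif deny.include?(license)
      true
    elsif allow
      !allow.include?(license)
    else
      false
    end
  end
end

packages = [
  { name: "lodash",  license: "MIT" },
  { name: "gpl-pkg", license: "GPL-3.0" },
  { name: "mystery", license: nil }
]

violations = license_violations(packages, deny: ["GPL-3.0"])
exit_code = violations.empty? ? 0 : 1 # mirrors the exit status described above
```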

## Models

Sequel models live in [`lib/git/pkgs/models/`](../lib/git/pkgs/models/). They're straightforward except for a few convenience methods:
21 changes: 21 additions & 0 deletions docs/schema.md
@@ -125,6 +125,25 @@ Tracks packages for vulnerability sync status.

Indexes: `purl` (unique)

### versions

Stores per-version metadata for packages.

| Column | Type | Description |
|--------|------|-------------|
| id | integer | Primary key |
| purl | string | Full versioned purl (e.g., "pkg:npm/lodash@VERSION") |
| package_purl | string | Parent package purl (e.g., "pkg:npm/lodash") |
| license | string | License for this specific version |
| published_at | datetime | When this version was published |
| integrity | text | Integrity hash (e.g., SHA256) |
| source | string | Data source |
| enriched_at | datetime | When metadata was fetched |
| created_at | datetime | |
| updated_at | datetime | |

Indexes: `purl` (unique), `package_purl`
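
The relationship between `purl` and `package_purl` can be illustrated with a small helper. This is a sketch only: the method name is hypothetical, the project's own purl handling may work differently, and the version number in the usage note is made up for illustration.

```ruby
# Derives the unversioned package purl from a versioned purl by dropping
# qualifiers, subpath, and the trailing "@version" part.
def package_purl(versioned_purl)
  base = versioned_purl.split(/[?#]/, 2).first # drop ?qualifiers and #subpath
  name, sep, version = base.rpartition("@")
  # No "@" at all, or the "@" belongs to a path segment: nothing to strip.
  return base if sep.empty? || version.include?("/")

  name
end
```

For example, `package_purl("pkg:npm/lodash@1.0.0")` gives `"pkg:npm/lodash"`, the value that would be stored in the `package_purl` column.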

### vulnerabilities

Caches vulnerability data from OSV.
@@ -170,5 +189,7 @@ branches ──┬── branch_commits ──┬── commits
└── last_analyzed_sha (references commits.sha)

packages ──── versions (via package_purl)

vulnerabilities ──── vulnerability_packages
```
6 changes: 6 additions & 0 deletions lib/git/pkgs.rb
@@ -10,14 +10,18 @@
require_relative "pkgs/analyzer"
require_relative "pkgs/ecosystems"
require_relative "pkgs/osv_client"
require_relative "pkgs/ecosystems_client"
require_relative "pkgs/spinner"

require_relative "pkgs/purl_helper"
require_relative "pkgs/models/branch"
require_relative "pkgs/models/branch_commit"
require_relative "pkgs/models/commit"
require_relative "pkgs/models/manifest"
require_relative "pkgs/models/dependency_change"
require_relative "pkgs/models/dependency_snapshot"
require_relative "pkgs/models/package"
require_relative "pkgs/models/version"
require_relative "pkgs/models/vulnerability"
require_relative "pkgs/models/vulnerability_package"

@@ -43,6 +47,8 @@
require_relative "pkgs/commands/diff_driver"
require_relative "pkgs/commands/completions"
require_relative "pkgs/commands/vulns"
require_relative "pkgs/commands/outdated"
require_relative "pkgs/commands/licenses"

module Git
module Pkgs
2 changes: 1 addition & 1 deletion lib/git/pkgs/analyzer.rb
@@ -33,7 +33,7 @@ class Analyzer
REQUIRE Project.toml Manifest.toml
shard.yml shard.lock
elm-package.json elm_dependencies.json elm-stuff/exact-dependencies.json
-haxelib.json
+haxelib.json stack.yaml stack.yaml.lock
action.yml action.yaml .github/workflows/*.yml .github/workflows/*.yaml
Dockerfile docker-compose*.yml docker-compose*.yaml
dvc.yaml vcpkg.json _generated-vcpkg-list.json
6 changes: 4 additions & 2 deletions lib/git/pkgs/cli.rb
@@ -33,7 +33,9 @@ class CLI
},
"Analysis" => {
"stats" => "Show dependency statistics",
-"stale" => "Show dependencies that haven't been updated"
+"stale" => "Show dependencies that haven't been updated",
+"outdated" => "Show packages with newer versions available",
+"licenses" => "Show licenses for dependencies"
},
"Security" => {
"vulns" => "Scan for known vulnerabilities"
@@ -42,7 +44,7 @@ class CLI

COMMANDS = COMMAND_GROUPS.values.flat_map(&:keys).freeze
COMMAND_DESCRIPTIONS = COMMAND_GROUPS.values.reduce({}, :merge).freeze
-ALIASES = { "praise" => "blame", "outdated" => "stale" }.freeze
+ALIASES = { "praise" => "blame" }.freeze

def self.run(args)
new(args).run
6 changes: 6 additions & 0 deletions lib/git/pkgs/commands/diff_driver.rb
@@ -24,18 +24,24 @@ class DiffDriver
gems.locked
glide.lock
go.mod
go.sum
gradle.lockfile
mix.lock
npm-shrinkwrap.json
package-lock.json
packages.lock.json
paket.lock
pdm.lock
pnpm-lock.yaml
poetry.lock
project.assets.json
pubspec.lock
pylock.toml
renv.lock
shard.lock
stack.yaml.lock
uv.lock
verification-metadata.xml
yarn.lock
].freeze
