This repository was archived by the owner on Jan 22, 2026. It is now read-only.

Commit e3c1e88

Update internals documentation to include vulns details
1 parent e024f74 commit e3c1e88

1 file changed, with 60 additions and 3 deletions

docs/internals.md

Lines changed: 60 additions & 3 deletions
@@ -8,9 +8,9 @@ The executable at [`exe/git-pkgs`](../exe/git-pkgs) loads [`lib/git/pkgs.rb`](..
88

99
## Database
1010

11-
[`Git::Pkgs::Database`](../lib/git/pkgs/database.rb) manages the SQLite connection using [ActiveRecord](https://github.com/rails/rails/tree/main/activerecord) and [sqlite3](https://github.com/sparklemotion/sqlite3-ruby). It looks for the `GIT_PKGS_DB` environment variable first, then falls back to `.git/pkgs.sqlite3`. Schema migrations are versioned through a `schema_info` table. See [schema.md](schema.md) for the full schema.
11+
[`Git::Pkgs::Database`](../lib/git/pkgs/database.rb) manages the SQLite connection using [Sequel](https://sequel.jeremyevans.net/) and [sqlite3](https://github.com/sparklemotion/sqlite3-ruby). It looks for the `GIT_PKGS_DB` environment variable first, then falls back to `.git/pkgs.sqlite3`. Schema migrations are versioned through a `schema_info` table. See [schema.md](schema.md) for the full schema.
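
The lookup order described above can be sketched in a few lines. This is an illustration only: `database_path` is a hypothetical helper, not the method `Git::Pkgs::Database` actually exposes, and it assumes the environment variable holds a file path.

```ruby
# Hypothetical helper, not the real Git::Pkgs::Database API.
# Prefers the GIT_PKGS_DB environment variable and falls back to
# pkgs.sqlite3 inside the repository's git directory.
def database_path(git_dir = ".git", env = ENV)
  env.fetch("GIT_PKGS_DB") { File.join(git_dir, "pkgs.sqlite3") }
end
```

Passing `env` as an argument keeps the sketch easy to exercise without touching the real process environment.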
1212

13-
The schema has six main tables:
13+
The schema has nine tables. Six handle dependency tracking:
1414

1515
- `commits` holds commit metadata plus a flag indicating whether it changed dependencies
1616
- `branches` tracks which branches have been analyzed and their last processed SHA
@@ -19,6 +19,12 @@ The schema has six main tables:
1919
- `dependency_changes` records every add, modify, or remove event
2020
- `dependency_snapshots` stores full dependency state at intervals
2121

22+
Three more support vulnerability scanning:
23+
24+
- `packages` tracks which packages have been synced with OSV and when
25+
- `vulnerabilities` caches CVE/GHSA data fetched from OSV
26+
- `vulnerability_packages` maps which packages are affected by each vulnerability
27+
2228
Snapshots exist because replaying thousands of change records to answer "what dependencies existed at commit X?" would be slow. Instead, we store the complete dependency set every 50 commits by default. Point-in-time queries find the nearest snapshot and replay only the changes since then.
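
A minimal sketch of the snapshot-plus-replay idea, using plain hashes rather than the real tables. The data shapes and the `dependencies_at` name are assumptions made for illustration; commits are identified here by a simple integer position in history.

```ruby
# Illustrative only: in-memory stand-ins for dependency_snapshots and
# dependency_changes. Commits are numbered by position in history.
#
# snapshots: { commit_index => { "package" => "version" } }
# changes:   [{ index:, type: :added | :modified | :removed, name:, version: }]
def dependencies_at(target, snapshots, changes)
  # Nearest snapshot at or before the target commit, if any.
  base = snapshots.keys.select { |index| index <= target }.max
  state = base ? snapshots[base].dup : {}
  start = base || -1

  # Replay only the changes recorded after that snapshot.
  changes
    .select { |change| change[:index] > start && change[:index] <= target }
    .sort_by { |change| change[:index] }
    .each do |change|
      if change[:type] == :removed
        state.delete(change[:name])
      else
        state[change[:name]] = change[:version]
      end
    end

  state
end
```

With a snapshot every 50 commits, a query in this sketch replays at most 49 commits' worth of changes instead of the full history.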
2329

2430
## Git Access
@@ -131,9 +137,60 @@ This hybrid approach means `where` shows current file contents rather than histo
131137

132138
Create a new file in [`lib/git/pkgs/commands/`](../lib/git/pkgs/commands/). Define `self.description` for help text and `self.run(args)` as the entry point. The CLI finds commands by constantizing the argument.
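
As a rough sketch of that shape: the `Hello` command and the `find_command` lookup below are hypothetical, and the real CLI's constant lookup may differ in detail.

```ruby
module Git
  module Pkgs
    module Commands
      # Hypothetical example command, not part of git-pkgs.
      class Hello
        # Shown in help output.
        def self.description
          "Print a greeting"
        end

        # Entry point; receives the remaining command-line arguments.
        def self.run(args)
          "Hello, #{args.first || "world"}!"
        end
      end
    end
  end
end

# Stand-in for the CLI lookup: turn "hello" into Commands::Hello.
# Returns nil when no matching command class exists.
def find_command(name)
  const_name = name.split(/[-_]/).map(&:capitalize).join
  Git::Pkgs::Commands.const_get(const_name, false)
rescue NameError
  nil
end
```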
133139

140+
## Vulnerability Scanning
141+
142+
The [`vulns` command](../lib/git/pkgs/commands/vulns.rb) checks dependencies against the [OSV database](https://osv.dev). Three additional tables support this:
143+
144+
- `packages` tracks which packages have been checked and when
145+
- `vulnerabilities` caches CVE/GHSA metadata (severity, summary, dates)
146+
- `vulnerability_packages` maps which packages are affected by each vulnerability
147+
148+
### OSV Client
149+
150+
[`Git::Pkgs::OsvClient`](../lib/git/pkgs/osv_client.rb) wraps the OSV REST API. It uses batch queries (`/querybatch`) to check up to 1000 packages per request, then fetches full details for each vulnerability found (`/vulns/{id}`). HTTP connections are reused across requests.
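
The batching step might look roughly like the following. This only builds request bodies in the shape the OSV batch endpoint accepts (a `queries` array of package and version entries) and makes no HTTP calls. The `osv_batch_payloads` name and the input hash shape are assumptions, not the `OsvClient` interface, and the real client may not send a version with each query.

```ruby
require "json"

# Per-request limit mentioned above.
OSV_BATCH_LIMIT = 1000

# Hypothetical helper: splits packages into batch-sized request bodies.
# packages: [{ ecosystem: "RubyGems", name: "rails", version: "7.0.0" }, ...]
def osv_batch_payloads(packages, limit = OSV_BATCH_LIMIT)
  packages.each_slice(limit).map do |slice|
    queries = slice.map do |pkg|
      {
        package: { name: pkg[:name], ecosystem: pkg[:ecosystem] },
        version: pkg[:version]
      }
    end
    JSON.generate(queries: queries)
  end
end
```

Each returned string would be the body of one POST to `/querybatch`. The batch response carries only vulnerability IDs, which is why the client follows up with `/vulns/{id}` for full details.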
151+
152+
### Ecosystem Mapping
153+
154+
OSV uses different ecosystem names than bibliothecary. [`Git::Pkgs::Ecosystems`](../lib/git/pkgs/ecosystems.rb) translates between them:
155+
156+
| bibliothecary | OSV | purl |
157+
|---------------|-----|------|
158+
| rubygems | RubyGems | gem |
159+
| npm | npm | npm |
160+
| pypi | PyPI | pypi |
161+
| cargo | crates.io | cargo |
162+
| go | Go | golang |
163+
| maven | Maven | maven |
164+
| nuget | NuGet | nuget |
165+
| packagist | Packagist | composer |
166+
| hex | Hex | hex |
167+
| pub | Pub | pub |
168+
169+
Only these ecosystems support vulnerability scanning. Others (Docker, Actions, etc.) are tracked for dependency history but have no OSV coverage.
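
The table above maps naturally onto a lookup hash. The sketch below shows one way to express it; the constant and method names are illustrative and may not match what `Git::Pkgs::Ecosystems` defines.

```ruby
# Illustrative mapping built from the table above; names are assumptions.
ECOSYSTEM_MAP = {
  "rubygems"  => { osv: "RubyGems",  purl: "gem" },
  "npm"       => { osv: "npm",       purl: "npm" },
  "pypi"      => { osv: "PyPI",      purl: "pypi" },
  "cargo"     => { osv: "crates.io", purl: "cargo" },
  "go"        => { osv: "Go",        purl: "golang" },
  "maven"     => { osv: "Maven",     purl: "maven" },
  "nuget"     => { osv: "NuGet",     purl: "nuget" },
  "packagist" => { osv: "Packagist", purl: "composer" },
  "hex"       => { osv: "Hex",       purl: "hex" },
  "pub"       => { osv: "Pub",       purl: "pub" }
}.freeze

# OSV ecosystem name for a bibliothecary ecosystem, or nil if unsupported.
def osv_ecosystem(name)
  ECOSYSTEM_MAP.dig(name.to_s.downcase, :osv)
end

# purl type for a bibliothecary ecosystem, or nil if unsupported.
def purl_type(name)
  ECOSYSTEM_MAP.dig(name.to_s.downcase, :purl)
end

# True when the ecosystem can be scanned against OSV.
def osv_supported?(name)
  ECOSYSTEM_MAP.key?(name.to_s.downcase)
end
```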
170+
171+
### Version Matching
172+
173+
[`VulnerabilityPackage#affects_version?`](../lib/git/pkgs/models/vulnerability_package.rb) uses the [vers](https://github.com/package-url/vers) gem to check if a version falls within an affected range. OSV returns ranges like `>=1.0.0 <2.0.0` or `<4.17.21`. The vers gem handles version comparison across different ecosystems, including those that do not follow strict semver.
174+
175+
Version ranges can have multiple OR conditions separated by `||`. Each condition is checked independently: `<1.0 || >=2.0 <3.0` means "affected if below 1.0 OR between 2.0 and 3.0".
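
To make the OR/AND semantics concrete, here is a small sketch that uses `Gem::Version` from RubyGems as a stand-in for the vers gem. It only understands RubyGems-style version strings and the operators shown, so it illustrates the matching logic rather than what `affects_version?` actually does.

```ruby
# Illustration only: Gem::Version stands in for the vers gem here.
# range:   clauses separated by "||"; constraints within a clause are ANDed.
# version: the installed version to test.
def affects_version?(range, version)
  installed = Gem::Version.new(version)

  range.split("||").any? do |clause|
    constraints = clause.split
    next false if constraints.empty?

    constraints.all? do |constraint|
      operator, bound = constraint.match(/\A(>=|<=|>|<|=)?(.+)\z/).captures
      limit = Gem::Version.new(bound)

      case operator
      when ">=" then installed >= limit
      when "<=" then installed <= limit
      when ">"  then installed > limit
      when "<"  then installed < limit
      else           installed == limit # "=" or a bare version
      end
    end
  end
end
```

For the range `<1.0 || >=2.0 <3.0`, this sketch treats 0.9 and 2.5 as affected and 1.5 and 3.0 as unaffected.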
176+
177+
### Caching
178+
179+
Vulnerability data is cached in the database to avoid repeated API calls. Each package in the `packages` table has a `vulns_synced_at` timestamp. Packages are refreshed if their data is more than 24 hours old. The `vulns sync --refresh` command forces a full refresh.
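
The staleness rule reduces to a timestamp comparison. In this sketch, `needs_sync?` and its keyword arguments are hypothetical; only the `vulns_synced_at` column name and the 24-hour window come from the description above.

```ruby
# 24 hours, in seconds.
STALE_AFTER = 24 * 60 * 60

# Hypothetical check: true when a package has never been synced, when its
# cached data is older than 24 hours, or when a refresh is forced.
def needs_sync?(vulns_synced_at, now: Time.now, refresh: false)
  return true if refresh || vulns_synced_at.nil?

  now - vulns_synced_at > STALE_AFTER
end
```

Taking `now` as a keyword argument keeps the check deterministic in tests.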
180+
181+
When scanning, git-pkgs:
182+
183+
1. Gets dependencies at the target commit (from snapshots or by parsing manifests)
184+
2. Filters to ecosystems with OSV support
185+
3. Checks which packages need syncing (never synced or stale)
186+
4. Batch queries OSV for those packages
187+
5. Fetches full vulnerability details for any newly found vulnerabilities (CVE or GHSA)
188+
6. Matches version ranges against actual versions
189+
7. Excludes withdrawn vulnerabilities
190+
134191
## Models
135192

136-
ActiveRecord models live in [`lib/git/pkgs/models/`](../lib/git/pkgs/models/). They're straightforward except for a few convenience methods:
193+
Sequel models live in [`lib/git/pkgs/models/`](../lib/git/pkgs/models/). They're straightforward except for a few convenience methods:
137194

138195
- `Commit.find_or_create_from_repo(repository, sha)` handles partial SHA resolution
139196
- `Manifest.find_or_create(path, ecosystem, kind)` uses a cache to avoid repeated lookups during init
