Merged
Changes from 54 commits
Commits
58 commits
1a466f1
Fix missing delete
jess-lowe Aug 28, 2025
f5b40e4
Remove Linux special treatment
jess-lowe Aug 29, 2025
bad798b
Improve inverse version extraction
jess-lowe Aug 29, 2025
b62ceb3
Fix double commit hash sections
jess-lowe Aug 29, 2025
86ace30
Merge remote-tracking branch 'upstream/master' into feat/cve-conv-cro…
jess-lowe Aug 29, 2025
b4f5b57
Remove database_specific map from initial vuln instance
jess-lowe Sep 1, 2025
c352ac1
fix version_extraction test
jess-lowe Sep 1, 2025
f17b8c5
rename and repackage cvelist converstion
jess-lowe Sep 1, 2025
0638dff
repackage
jess-lowe Sep 1, 2025
f7b0de3
cvelist mass converter script
jess-lowe Sep 1, 2025
0fc040d
Remove outcomes logic for now (not very concurrency-friendly)
jess-lowe Sep 1, 2025
227ba92
Converter script
jess-lowe Sep 1, 2025
e2f5b64
rename files back
jess-lowe Sep 1, 2025
c56e9d5
dockerfile and cron job
jess-lowe Sep 2, 2025
7df4e7c
Merge remote-tracking branch 'upstream/master' into feat/cve-conv-cro…
jess-lowe Sep 4, 2025
cf40025
flatten if statements
jess-lowe Sep 4, 2025
394d039
Revert "dockerfile and cron job"
jess-lowe Sep 8, 2025
34e3505
Merge remote-tracking branch 'upstream/master' into feat/cve-conv-cro…
jess-lowe Sep 8, 2025
013c5c8
update logger
jess-lowe Sep 8, 2025
34878fb
fix lint through refactoring everything :(
jess-lowe Sep 8, 2025
af05357
Added flags and removed double parsing
jess-lowe Sep 9, 2025
9b2ce86
rename sortBadSemver
jess-lowe Sep 9, 2025
68f8d55
refactored parts for clarity
jess-lowe Sep 9, 2025
28e0691
deal with if number of parts are not 2 or 3
jess-lowe Sep 9, 2025
e69b7fb
rename sortBadSemver in tests
jess-lowe Sep 9, 2025
0114a27
Refactor VersionToCommit to ONLY return a commit, not the AffectedCom…
jess-lowe Sep 9, 2025
5fa7c47
update test snapshots
jess-lowe Sep 9, 2025
fbdd60f
FIX LINT
jess-lowe Sep 9, 2025
56c6db5
MUCH PRETTIER CODE
jess-lowe Sep 9, 2025
40b1d9f
Merge remote-tracking branch 'upstream/master' into feat/cve-git-reso…
jess-lowe Sep 9, 2025
3c528d7
Merge branch 'refactor/versions-to-commit' into feat/cve-git-resolve-1
jess-lowe Sep 9, 2025
d36ada9
Enable git commit extraction in affected field extracted vulns.
jess-lowe Sep 10, 2025
650290d
Save unresolved version ranges in database_specific
jess-lowe Sep 10, 2025
77eab2a
fix the concurrency issue with Metrics.Notes
jess-lowe Sep 10, 2025
e8d344a
Fix testcases
jess-lowe Sep 12, 2025
18ca577
Merge remote-tracking branch 'upstream/master' into feat/cve-git-reso…
jess-lowe Sep 12, 2025
7641ad1
fix logging issues
jess-lowe Sep 12, 2025
da21f5d
fix lint
jess-lowe Sep 12, 2025
49299c0
fix folder name
jess-lowe Sep 12, 2025
89b4541
improve naming clarity
jess-lowe Sep 15, 2025
667d6cf
update source for affected
jess-lowe Sep 15, 2025
ac0a979
fix lint
jess-lowe Sep 16, 2025
b9909a7
wrong place
jess-lowe Sep 16, 2025
459eb95
attempt to Parse oneliner range
jess-lowe Sep 18, 2025
f975c3b
fix lint
jess-lowe Sep 18, 2025
60ad54b
Merge branch 'master' into feat/cve-git-resolve-1
jess-lowe Sep 19, 2025
24d71de
Treat version ranges git resolution as a stack to reduce chances of d…
jess-lowe Sep 19, 2025
8be73f6
Assume if only one version that value is LAST AFFECTED not fixed
jess-lowe Sep 19, 2025
208f809
refactor some duplicate code and fix tests
jess-lowe Sep 21, 2025
31bdcab
fix lint
jess-lowe Sep 21, 2025
389349d
Merge branch 'master' into feat/cve-git-resolve-1
jess-lowe Sep 29, 2025
eca80e8
Remove duplicate code
jess-lowe Sep 29, 2025
360883d
Merge branch 'master' into feat/cve-git-resolve-1
jess-lowe Oct 1, 2025
0ea46b7
Make clearer where passing as a pointer
jess-lowe Oct 1, 2025
23e8ca3
refactor gitToCommits
jess-lowe Oct 1, 2025
62ddc33
fix lint
jess-lowe Oct 1, 2025
bacdfb5
Merge branch 'master' into feat/cve-git-resolve-1
jess-lowe Oct 1, 2025
ccf9a28
fix typo
jess-lowe Oct 1, 2025
2 changes: 1 addition & 1 deletion vulnfeeds/cmd/cve-bulk-converter/main.go
@@ -21,7 +21,7 @@ var (
localOutputDir = flag.String("out_dir", "cvelist2osv", "Path to output results.")
years = flag.String("years", "2022,2023,2024,2025", "A comma-separated list of years to process.")
workers = flag.Int("workers", 30, "The number of concurrent workers to use for processing CVEs.")
cnas = flag.String("cnas", "Linux", "A comma-separated list of CNAs to process.")
cnas = flag.String("cnas", "Linux,GitHub_M", "A comma-separated list of CNAs to process.")

The EEF CNA is reporting GIT Versions. Can we be added to this list? Example CVE: https://cna.erlef.org/cves/cve-2025-48042.html

Contributor Author

Hey @maennchen, thanks for your enthusiasm about the CVE5 conversion! I tried running the converter on EEF-scoped vulns, and unfortunately, in its current state, it can't convert them effectively, because the data layout differs from the more common expressions I've seen across Linux/GitHub_M/MITRE vulns. Looking into it a little more, the vulns appear to be generated with Vulnogram, so I'll consider adding an extension to handle Vulnogram-generated vulns in a future PR :).

For more timely ingestion of EEF vulns into OSV, we would highly appreciate Erlang publishing natively in the OSV format for us to ingest (this also removes the layer of abstraction of going through the CVE Program to publish changes and then waiting for us to ingest those vulns).

)

func main() {
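The `-cnas` flag above takes a comma-separated list of CNA short names. As a rough illustration of how such a value could be turned into a filter, here is a minimal sketch; `parseCNAs` is a hypothetical helper written for this example and is not taken from the PR, which may consume the flag differently.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// parseCNAs (hypothetical helper) turns a comma-separated flag value such as
// "Linux,GitHub_M" into a set, trimming whitespace and skipping empty entries.
func parseCNAs(list string) map[string]bool {
	set := make(map[string]bool)
	for _, name := range strings.Split(list, ",") {
		name = strings.TrimSpace(name)
		if name != "" {
			set[name] = true
		}
	}
	return set
}

func main() {
	// Same flag name and default as in the diff above.
	cnas := flag.String("cnas", "Linux,GitHub_M", "A comma-separated list of CNAs to process.")
	flag.Parse()

	allowed := parseCNAs(*cnas)
	for _, assigner := range []string{"Linux", "GitHub_M", "EEF"} {
		fmt.Printf("%s processed: %v\n", assigner, allowed[assigner])
	}
}
```

With the default value, only records assigned by `Linux` or `GitHub_M` would pass such a filter, which matches the reply above that EEF records are not yet handled.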
25 changes: 12 additions & 13 deletions vulnfeeds/cmd/cvelist2osv/converter.go
@@ -79,21 +79,25 @@ func FromCVE5(cve cves.CVE5, refs []cves.Reference, metrics *ConversionMetrics)

published, err := cves.ParseCVE5Timestamp(cve.Metadata.DatePublished)
if err != nil {
metrics.Notes = append(metrics.Notes, "Published date failed to parse, setting time to now")
metrics.Notes = append(metrics.Notes, fmt.Sprintf("[%s]: Published date failed to parse, setting time to now", cve.Metadata.CVEID))
published = time.Now()
}
v.Published = published

modified, err := cves.ParseCVE5Timestamp(cve.Metadata.DateUpdated)
if err != nil {
metrics.Notes = append(metrics.Notes, "Modified date failed to parse, setting time to now")
metrics.Notes = append(metrics.Notes, fmt.Sprintf("[%s]: Modified date failed to parse, setting time to now", cve.Metadata.CVEID))
modified = time.Now()
}
v.Modified = modified

// Add affected version information.
AddVersionInfo(cve, &v, metrics)
// Try to extract repository URLs from references.
repos, repoNotes := cves.ReposFromReferencesCVEList(string(cve.Metadata.CVEID), refs, RefTagDenyList)
metrics.Notes = append(metrics.Notes, repoNotes...)
metrics.Repos = repos

// Add affected version information.
AddVersionInfo(cve, &v, metrics, repos)
// TODO(jesslowe@): Add CWEs.

// Combine severity metrics from both CNA and ADP containers.
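The hunk above prefixes each timestamp note with the CVE ID, which makes notes attributable once many CVEs are processed concurrently. The following is a minimal sketch of that pattern. `parseOrNow` is a hypothetical helper, and it assumes RFC 3339 timestamps; the real `cves.ParseCVE5Timestamp` may accept other layouts.

```go
package main

import (
	"fmt"
	"time"
)

// parseOrNow (hypothetical) parses an RFC 3339 timestamp. On failure it
// returns the current time plus a note prefixed with the CVE ID; on success
// the note is empty.
func parseOrNow(cveID, field, raw string) (time.Time, string) {
	t, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		note := fmt.Sprintf("[%s]: %s date failed to parse, setting time to now", cveID, field)
		return time.Now(), note
	}
	return t, ""
}

func main() {
	// "CVE-YYYY-NNNN" is a placeholder ID, not a real record.
	var notes []string
	for _, in := range []struct{ field, raw string }{
		{"Published", "2025-09-10T12:00:00Z"},
		{"Modified", "not-a-date"},
	} {
		ts, note := parseOrNow("CVE-YYYY-NNNN", in.field, in.raw)
		if note != "" {
			notes = append(notes, note)
		}
		fmt.Println(in.field, ts.UTC().Format(time.RFC3339))
	}
	fmt.Println(notes)
}
```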
@@ -166,17 +170,12 @@ func ConvertAndExportCVEToOSV(cve cves.CVE5, directory string) error {
cveID := cve.Metadata.CVEID
cnaAssigner := cve.Metadata.AssignerShortName
references := identifyPossibleURLs(cve)
metrics := &ConversionMetrics{}
metrics := ConversionMetrics{}
// Create a base OSV record from the CVE.
v := FromCVE5(cve, references, metrics)
v := FromCVE5(cve, references, &metrics)

// Collect metrics about the conversion.
extractConversionMetrics(cve, v.References, metrics)

// Try to extract repository URLs from references.
repos, repoNotes := cves.ReposFromReferencesCVEList(string(cveID), references, RefTagDenyList)
metrics.Notes = append(metrics.Notes, repoNotes...)
metrics.Repos = repos
extractConversionMetrics(cve, v.References, &metrics)

vulnDir := filepath.Join(directory, cnaAssigner)

@@ -186,7 +185,7 @@ func ConvertAndExportCVEToOSV(cve cves.CVE5, directory string) error {
}

// Save the conversion metrics to a file.
if err := writeMetricToFile(cveID, vulnDir, metrics); err != nil {
if err := writeMetricToFile(cveID, vulnDir, &metrics); err != nil {
return err
}

35 changes: 25 additions & 10 deletions vulnfeeds/cmd/cvelist2osv/converter_test.go
@@ -191,13 +191,18 @@ func TestFromCVE5(t *testing.T) {
},
},
Affected: []osvschema.Affected{{
// DatabaseSpecific: map[string]interface{}{
// "CPE": []string{"cpe:2.3:a:gitlab:gitlab:*:*:*:*:*:*:*:*"},
// },

Ranges: []osvschema.Range{{Type: "ECOSYSTEM",
Events: []osvschema.Event{{Introduced: "18.0"}, {Fixed: "18.0.1"}},
}}}},
Ranges: []osvschema.Range{{
Type: "GIT",
Repo: "https://gitlab.com/gitlab-org/gitlab",
Events: []osvschema.Event{
{Introduced: "504fd9e5236e13d674e051c6b8a1e9892b371c58"},
{Fixed: "3426be1b93852c5358240c5df40970c0ddfbdb2a"},
},
DatabaseSpecific: map[string]any{
"versions": []osvschema.Event{{Introduced: "18.0"}, {Fixed: "18.0.1"}},
},
}},
}},
},
},
},
@@ -226,9 +231,19 @@ func TestFromCVE5(t *testing.T) {
Score: "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
},
},
Affected: []osvschema.Affected{{Ranges: []osvschema.Range{{
Type: "ECOSYSTEM",
Events: []osvschema.Event{{Introduced: "0"}, {Fixed: "1.10.5"}}}}}},
Affected: []osvschema.Affected{{
Ranges: []osvschema.Range{{
Type: "GIT",
Repo: "https://github.com/amazon-ion/ion-java",
Events: []osvschema.Event{
{Introduced: "0"},
{Fixed: "019a6117fb99131f74f92ecf462169613234abbf"},
},
DatabaseSpecific: map[string]any{
"versions": []osvschema.Event{{Introduced: "0"}, {Fixed: "1.10.5"}},
},
}},
}},
DatabaseSpecific: nil,
},
},
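The updated expectations above replace `ECOSYSTEM` ranges with `GIT` ranges whose events are commits, while the original version events are kept under `database_specific.versions`. To show what that shape looks like when serialized, here is a small sketch. The `Event` and `Range` types are local stand-ins with OSV-style JSON field names, not the real `osvschema` types, and `gitRangeWithVersions` and `marshalRange` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event and Range are simplified local stand-ins for the osvschema types.
type Event struct {
	Introduced   string `json:"introduced,omitempty"`
	Fixed        string `json:"fixed,omitempty"`
	LastAffected string `json:"last_affected,omitempty"`
}

type Range struct {
	Type             string         `json:"type"`
	Repo             string         `json:"repo,omitempty"`
	Events           []Event        `json:"events"`
	DatabaseSpecific map[string]any `json:"database_specific,omitempty"`
}

// gitRangeWithVersions (hypothetical) builds a GIT range from resolved commit
// events and keeps the original version events alongside it.
func gitRangeWithVersions(repo string, commits, versions []Event) Range {
	return Range{
		Type:             "GIT",
		Repo:             repo,
		Events:           commits,
		DatabaseSpecific: map[string]any{"versions": versions},
	}
}

// marshalRange returns the compact JSON encoding of a range.
func marshalRange(r Range) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Values copied from the second test expectation above.
	r := gitRangeWithVersions(
		"https://github.com/amazon-ion/ion-java",
		[]Event{{Introduced: "0"}, {Fixed: "019a6117fb99131f74f92ecf462169613234abbf"}},
		[]Event{{Introduced: "0"}, {Fixed: "1.10.5"}},
	)
	fmt.Println(marshalRange(r))
}
```

Keeping the version events next to the commits lets a reader of the record see which tags the commits were derived from.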
148 changes: 126 additions & 22 deletions vulnfeeds/cmd/cvelist2osv/version_extraction.go
@@ -4,13 +4,15 @@ import (
"cmp"
"errors"
"fmt"
"log/slog"
"strconv"
"strings"

"slices"

"github.com/google/osv/vulnfeeds/cves"
"github.com/google/osv/vulnfeeds/git"
"github.com/google/osv/vulnfeeds/utility/logger"
"github.com/google/osv/vulnfeeds/vulns"
"github.com/ossf/osv-schema/bindings/go/osvschema"
)
@@ -72,7 +74,7 @@ func toVersionRangeType(s string) VersionRangeType {
// 3. If no versions are found, it falls back to searching for CPEs in the CNA container.
// 4. As a last resort, it attempts to extract version information from the description text (currently not saved).
// It returns the source of the version information and a slice of notes detailing the extraction process.
func AddVersionInfo(cve cves.CVE5, v *vulns.Vulnerability, metrics *ConversionMetrics) {
func AddVersionInfo(cve cves.CVE5, v *vulns.Vulnerability, metrics *ConversionMetrics, repos []string) {
gotVersions := false

// Combine 'affected' entries from both CNA and ADP containers.
@@ -104,31 +106,36 @@ func AddVersionInfo(cve cves.CVE5, v *vulns.Vulnerability, metrics *ConversionMe
hasGit = true
}

aff := osvschema.Affected{}
for _, vr := range versionRanges {
if versionType == VersionRangeTypeGit {
vr.Type = osvschema.RangeGit
vr.Repo = cveAff.Repo
} else {
vr.Type = osvschema.RangeEcosystem
}
aff.Ranges = append(aff.Ranges, vr)
}

var aff osvschema.Affected
// Special handling for Linux kernel CVEs.
if cve.Metadata.AssignerShortName == "Linux" && versionType != VersionRangeTypeGit {
aff.Package = osvschema.Package{
Ecosystem: string(osvschema.EcosystemLinux),
Name: "Kernel",
if cve.Metadata.AssignerShortName == "Linux" {
for _, vr := range versionRanges {
if versionType == VersionRangeTypeGit {
vr.Type = osvschema.RangeGit
vr.Repo = cveAff.Repo
} else {
vr.Type = osvschema.RangeEcosystem
}
aff.Ranges = append(aff.Ranges, vr)
}
if versionType != VersionRangeTypeGit {
aff.Package = osvschema.Package{
Ecosystem: string(osvschema.EcosystemLinux),
Name: "Kernel",
}
}
} else {
var err error
aff, err = gitVersionsToCommits(cve.Metadata.CVEID, versionRanges, repos, make(git.RepoTagsCache))
if err != nil {
logger.Error("Failed to convert git versions to commits", slog.Any("err", err))
} else {
hasGit = true
}
}

v.Affected = append(v.Affected, aff)
if hasGit {
metrics.VersionSources = append(metrics.VersionSources, VersionSourceGit)
} else {
metrics.VersionSources = append(metrics.VersionSources, VersionSourceAffected)
}
metrics.VersionSources = append(metrics.VersionSources, VersionSourceAffected)
}

// If no versions were found so far, fall back to CPEs.
@@ -164,6 +171,104 @@ func AddVersionInfo(cve cves.CVE5, v *vulns.Vulnerability, metrics *ConversionMe
}
}

// Examines repos and tries to convert versions to commits by treating them as Git tags.
// Takes a CVE ID string (for logging), VersionInfo with AffectedVersions and
// typically no AffectedCommits and attempts to add AffectedCommits (including Fixed commits) where there aren't any.
// Refuses to add the same commit to AffectedCommits more than once.
func gitVersionsToCommits(cveID cves.CVEID, versionRanges []osvschema.Range, repos []string, cache git.RepoTagsCache) (osvschema.Affected, error) {
var newAff osvschema.Affected
var newVersionRanges []osvschema.Range
unresolvedRanges := versionRanges

for _, repo := range repos {
if len(unresolvedRanges) == 0 {
break // All ranges have been resolved.
}

normalizedTags, err := git.NormalizeRepoTags(repo, cache)
if err != nil {
logger.Warn("Failed to normalize tags", slog.String("cve", string(cveID)), slog.String("repo", repo), slog.Any("err", err))
continue
}

var stillUnresolvedRanges []osvschema.Range
for _, vr := range unresolvedRanges {
var introducedCommit, fixedCommit, lastAffectedCommit string
var resolutionErr error

for _, ev := range vr.Events {
logger.Info("Attempting version resolution", slog.String("cve", string(cveID)), slog.Any("event", ev), slog.String("repo", repo))
if ev.Introduced != "" {
if ev.Introduced == "0" {
introducedCommit = "0"
} else {
introducedCommit, resolutionErr = git.VersionToCommit(ev.Introduced, normalizedTags)
if resolutionErr != nil {
logger.Warn("Failed to get Git commit for introduced version", slog.String("cve", string(cveID)), slog.String("version", ev.Introduced), slog.String("repo", repo), slog.Any("err", resolutionErr))
} else {
logger.Info("Successfully derived commit for introduced version", slog.String("cve", string(cveID)), slog.String("commit", introducedCommit), slog.String("version", ev.Introduced))
}
}
}
if ev.Fixed != "" {
fixedCommit, resolutionErr = git.VersionToCommit(ev.Fixed, normalizedTags)
if resolutionErr != nil {
logger.Warn("Failed to get Git commit for fixed version", slog.String("cve", string(cveID)), slog.String("version", ev.Fixed), slog.String("repo", repo), slog.Any("err", resolutionErr))
} else {
logger.Info("Successfully derived commit for fixed version", slog.String("cve", string(cveID)), slog.String("commit", fixedCommit), slog.String("version", ev.Fixed))
}
}
if ev.LastAffected != "" {
lastAffectedCommit, resolutionErr = git.VersionToCommit(ev.LastAffected, normalizedTags)
if resolutionErr != nil {
logger.Warn("Failed to get Git commit for last affected version", slog.String("cve", string(cveID)), slog.String("version", ev.LastAffected), slog.String("repo", repo), slog.Any("err", resolutionErr))
} else {
logger.Info("Successfully derived commit for last affected version", slog.String("cve", string(cveID)), slog.String("commit", lastAffectedCommit), slog.String("version", ev.LastAffected))
}
}
}

resolved := false
if fixedCommit != "" && introducedCommit != "" {
newVR := buildVersionRange(introducedCommit, "", fixedCommit)
newVR.Repo = repo
newVR.Type = osvschema.RangeGit
newVR.DatabaseSpecific = make(map[string]any)
newVR.DatabaseSpecific["versions"] = vr.Events
newVersionRanges = append(newVersionRanges, newVR)
resolved = true
} else if lastAffectedCommit != "" && introducedCommit != "" {
newVR := buildVersionRange(introducedCommit, lastAffectedCommit, "")
newVR.Repo = repo
newVR.Type = osvschema.RangeGit
newVR.DatabaseSpecific = make(map[string]any)
newVR.DatabaseSpecific["versions"] = vr.Events
newVersionRanges = append(newVersionRanges, newVR)
resolved = true
}

if !resolved {
stillUnresolvedRanges = append(stillUnresolvedRanges, vr)
}
}
unresolvedRanges = stillUnresolvedRanges
}

var err error
if len(unresolvedRanges) > 0 {
newAff.DatabaseSpecific = make(map[string]any)
newAff.DatabaseSpecific["unresolved_versions"] = unresolvedRanges
}

if len(newVersionRanges) > 0 {
newAff.Ranges = newVersionRanges
} else if len(unresolvedRanges) > 0 { // Only error if there were ranges to resolve but none were.
err = errors.New("was not able to get git version ranges")
}

return newAff, err
}

// findCPEVersionRanges extracts version ranges and CPE strings from the CNA's
// CPE applicability statements in a CVE record.
func findCPEVersionRanges(cve cves.CVE5) (versionRanges []osvschema.Range, cpes []string, err error) {
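The loop in `gitVersionsToCommits` above tries each candidate repo in turn and carries any ranges it could not resolve over to the next repo. A stripped-down sketch of that control flow follows. It assumes a plain tag-to-commit map per repo in place of `git.NormalizeRepoTags` and `git.VersionToCommit`, uses hypothetical local types, and leaves out the `last_affected` branch for brevity.

```go
package main

import "fmt"

// versionRange and commitRange are simplified, hypothetical stand-ins.
type versionRange struct{ introduced, fixed string }
type commitRange struct{ repo, introduced, fixed string }

// lookup maps a version to a commit; "0" passes through unchanged, as in the diff.
func lookup(version string, repoTags map[string]string) (string, bool) {
	if version == "0" {
		return "0", true
	}
	commit, ok := repoTags[version]
	return commit, ok
}

// resolve tries each repo in order. Ranges whose introduced and fixed versions
// both map to commits are emitted; the rest are retried against the next repo.
func resolve(ranges []versionRange, repos []string, tags map[string]map[string]string) (resolved []commitRange, unresolved []versionRange) {
	unresolved = ranges
	for _, repo := range repos {
		if len(unresolved) == 0 {
			break // All ranges have been resolved.
		}
		var still []versionRange
		for _, vr := range unresolved {
			intro, okIntro := lookup(vr.introduced, tags[repo])
			fixed, okFixed := lookup(vr.fixed, tags[repo])
			if okIntro && okFixed {
				resolved = append(resolved, commitRange{repo, intro, fixed})
			} else {
				still = append(still, vr)
			}
		}
		unresolved = still
	}
	return resolved, unresolved
}

func main() {
	// Repo names and commit IDs are made up for illustration.
	tags := map[string]map[string]string{
		"repoA": {"1.0": "c1"},
		"repoB": {"1.0": "c1b", "2.0": "c2b"},
	}
	ranges := []versionRange{{"0", "1.0"}, {"1.0", "2.0"}, {"3.0", "4.0"}}
	resolved, unresolved := resolve(ranges, []string{"repoA", "repoB"}, tags)
	fmt.Println(resolved, unresolved)
}
```

Because each range leaves the pending list as soon as one repo resolves it, a later repo cannot produce a second range for the same versions.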
@@ -343,7 +448,6 @@ func findNormalAffectedRanges(affected cves.Affected, metrics *ConversionMetrics
// affected, but more likely, it affects up to that version. It could also mean that the range is given
// in one line instead - like "< 1.5.3" or "< 2.45.4, >= 2.0 " or just "before 1.4.7", so check for that.
metrics.Notes = append(metrics.Notes, "Only version exists")
// GitHub often encodes the range directly in the version string.

av, err := git.ParseVersionRange(vers.Version)
if err == nil {
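The comment in this hunk mentions version strings that encode a whole range in one line, such as "< 1.5.3", "< 2.45.4, >= 2.0 " or "before 1.4.7". As an illustration only, a parser for those three shapes could look like the sketch below. `parseOneLineRange` is a hypothetical function; it is not `git.ParseVersionRange`, whose signature and accepted syntax are not shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// parseOneLineRange (hypothetical) extracts an introduced and a fixed version
// from strings like "< 1.5.3", "< 2.45.4, >= 2.0 " or "before 1.4.7".
// It reports ok=false when no upper bound is found. "<=" bounds are ignored
// in this sketch, since they describe a last-affected version, not a fix.
func parseOneLineRange(s string) (introduced, fixed string, ok bool) {
	for _, part := range strings.Split(s, ",") {
		part = strings.TrimSpace(part)
		switch {
		case strings.HasPrefix(part, ">="):
			introduced = strings.TrimSpace(strings.TrimPrefix(part, ">="))
		case strings.HasPrefix(part, "<") && !strings.HasPrefix(part, "<="):
			fixed = strings.TrimSpace(strings.TrimPrefix(part, "<"))
		case strings.HasPrefix(part, "before "):
			fixed = strings.TrimSpace(strings.TrimPrefix(part, "before "))
		}
	}
	if fixed == "" {
		return "", "", false
	}
	if introduced == "" {
		introduced = "0" // No lower bound given: assume affected from the start.
	}
	return introduced, fixed, true
}

func main() {
	for _, s := range []string{"< 1.5.3", "< 2.45.4, >= 2.0 ", "before 1.4.7", "1.2.3"} {
		introduced, fixed, ok := parseOneLineRange(s)
		fmt.Printf("%q -> introduced=%q fixed=%q ok=%v\n", s, introduced, fixed, ok)
	}
}
```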