Merged
1 change: 1 addition & 0 deletions go.mod
@@ -10,6 +10,7 @@ require (
github.com/google/osv-scalibr v0.3.5-0.20251002191929-de9496dc5aa2
github.com/tidwall/jsonc v0.3.2
golang.org/x/mod v0.30.0
golang.org/x/sync v0.16.0
gopkg.in/yaml.v3 v3.0.1
)

2 changes: 2 additions & 0 deletions go.sum
@@ -44,6 +44,8 @@ github.com/tidwall/sjson v1.2.5 h1:kLy8mja+1c9jlljvWTlSazM7cKDRfJuR/bOJhcY5NcY=
github.com/tidwall/sjson v1.2.5/go.mod h1:Fvgq9kS/6ociJEDnK0Fk1cpYF4FIW6ZF7LAe+6jwd28=
golang.org/x/mod v0.30.0 h1:fDEXFVZ/fmCKProc/yAXXUijritrDzahmwwefnjoPFk=
golang.org/x/mod v0.30.0/go.mod h1:lAsf5O2EvJeSFMiBxXDki7sCgAxEUcZHXoXMKT4GJKc=
golang.org/x/sync v0.16.0 h1:ycBJEhp9p4vXvUZNszeOq0kGTPghopOL8q0fq3vstxw=
golang.org/x/sync v0.16.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.33.0 h1:q3i8TbbEz+JRD9ywIRlyRAQbM0qF7hu24q3teo2hbuw=
36 changes: 30 additions & 6 deletions pkg/database/api-check.go
@@ -12,6 +12,7 @@ import (
"path"

"github.com/g-rath/osv-detector/internal"
"golang.org/x/sync/errgroup"
)

func (db APIDB) buildAPIPayload(pkg internal.PackageDetails) apiQuery {
@@ -171,15 +172,38 @@ func findOrDefault(vulns Vulnerabilities, def OSV) OSV {
func (db APIDB) Check(pkgs []internal.PackageDetails) ([]Vulnerabilities, error) {
batches := batchPkgs(pkgs, db.BatchSize)

vulnerabilities := make([]Vulnerabilities, 0, len(pkgs))
var eg errgroup.Group
Copilot AI Nov 27, 2025

Missing concurrency limit for errgroup

Unlike FetchAll() which sets a limit of 200 concurrent operations with eg.SetLimit(200) (line 12 in api-fetch-all.go), this errgroup has no limit. This could potentially spawn a large number of goroutines if there are many batches, which may overwhelm system resources or the API server.

Suggestion: Consider adding a concurrency limit consistent with the existing pattern:

var eg errgroup.Group
eg.SetLimit(200)  // or another appropriate limit
Suggested change
var eg errgroup.Group
var eg errgroup.Group
eg.SetLimit(200)
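
For context, a minimal sketch of what a concurrency limit buys here. It is a standard-library stand-in for errgroup's SetLimit (a buffered channel used as a semaphore), not the PR's code, and runLimited, tasks and limit are hypothetical names chosen for illustration. It assumes Go 1.22 or later, since the closure uses the loop variable directly.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs every task with at most limit tasks in flight and returns
// the highest number of tasks that were observed running at the same time.
// Hypothetical helper: a stdlib stand-in for errgroup's SetLimit.
func runLimited(tasks []func(), limit int) int32 {
	var wg sync.WaitGroup
	var running, peak atomic.Int32

	slots := make(chan struct{}, limit) // one token per allowed goroutine

	for _, task := range tasks {
		slots <- struct{}{} // blocks while limit tasks are already running
		wg.Add(1)

		go func() {
			defer wg.Done()
			defer func() { <-slots }() // release the slot once finished

			n := running.Add(1)

			// record the highest concurrency seen so far
			for {
				p := peak.Load()
				if n <= p || peak.CompareAndSwap(p, n) {
					break
				}
			}

			task() // assumes Go 1.22+ per-iteration loop variables
			running.Add(-1)
		}()
	}

	wg.Wait()

	return peak.Load()
}

func main() {
	tasks := make([]func(), 50)
	for i := range tasks {
		tasks[i] = func() {}
	}

	// the observed peak never exceeds the limit
	fmt.Println(runLimited(tasks, 4) <= 4) // prints "true"
}
```

With errgroup the same effect comes from calling SetLimit once before the first call to Go, which then blocks until a slot is free.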


for _, batch := range batches {
results, err := db.checkBatch(batch)
// use a sensible upper limit so it's not possible to have inf. operations going
// even though it's very unlikely there will be more than a couple of batches
eg.SetLimit(100)

if err != nil {
return nil, err
}
batchResults := make([][][]ObjectWithID, len(batches))

for i, batch := range batches {
eg.Go(func() error {
results, err := db.checkBatch(batch)

if err != nil {
return err
}

batchResults[i] = results
Comment on lines +184 to +191
Copilot AI Nov 27, 2025

Race condition: closure captures loop variable incorrectly

The closure passed to eg.Go() captures the loop variables i and batch by reference, not by value. Since multiple goroutines are created in rapid succession, they will likely all reference the final values of these variables from the loop.

This will cause multiple goroutines to process the same batch and write to the same index in batchResults, while other batches are never processed.

Solution: Capture the loop variables before the closure:

for i, batch := range batches {
    i, batch := i, batch  // capture loop variables
    eg.Go(func() error {
        results, err := db.checkBatch(batch)
        if err != nil {
            return err
        }
        batchResults[i] = results
        return nil
    })
}


return nil
})
}

err := eg.Wait()

if err != nil {
return nil, err
}

vulnerabilities := make([]Vulnerabilities, 0, len(pkgs))

// todo: pretty sure some of these loops and slices can be merged and simplified
for _, results := range batchResults {
for _, withIDs := range results {
vulns := make(Vulnerabilities, 0, len(withIDs))

12 changes: 7 additions & 5 deletions pkg/database/api-check_test.go
@@ -553,11 +553,15 @@ func TestAPIDB_Check_Batches(t *testing.T) {
mux.HandleFunc("/querybatch", func(w http.ResponseWriter, r *http.Request) {
requestCount++
Copilot AI Nov 27, 2025

Race condition in test: concurrent writes to requestCount

With parallel batch execution, multiple goroutines may increment requestCount concurrently without synchronization, leading to a race condition. This can cause incorrect request counting and test flakiness.

Solution: Use sync/atomic to safely increment the counter:

var requestCount atomic.Int32

mux.HandleFunc("/querybatch", func(w http.ResponseWriter, r *http.Request) {
    count := requestCount.Add(1)
    
    if count > 2 {
        t.Errorf("unexpected number of requests (%d)", count)
    }
    // ... rest of the handler

And at line 611:

if requestCount.Load() != 2 {
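
A self-contained sketch of the atomic counter approach, for illustration only. It drives a handler concurrently through httptest recorders rather than a real server, and countRequests is a hypothetical helper, not part of the test file under review.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// countRequests sends n concurrent requests to an in-memory handler and
// returns how many of them the handler counted. Hypothetical helper.
func countRequests(n int) int32 {
	var requestCount atomic.Int32

	mux := http.NewServeMux()
	mux.HandleFunc("/querybatch", func(w http.ResponseWriter, _ *http.Request) {
		requestCount.Add(1) // safe even when handlers run in parallel
		w.WriteHeader(http.StatusOK)
	})

	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)

		go func() {
			defer wg.Done()

			// no network involved: the recorder captures the response in memory
			req := httptest.NewRequest(http.MethodPost, "/querybatch", nil)
			mux.ServeHTTP(httptest.NewRecorder(), req)
		}()
	}

	wg.Wait()

	return requestCount.Load()
}

func main() {
	fmt.Println(countRequests(10)) // prints 10
}
```

Running the real test with the race detector enabled (go test -race) is one way to check whether the plain int counter is actually written concurrently.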

Owner Author

I'm fine with this as it should be very low risk (it's not been an issue so far).


if requestCount > 2 {
t.Errorf("unexpected number of requests (%d)", requestCount)
}

var expectedPayload []apiQuery
var batchResponse []objectsWithIDs

switch requestCount {
case 1:
// strictly speaking not the best of checks, but it should be good enough
if r.ContentLength > 100 {
expectedPayload = []apiQuery{
{
Version: "1.0.0",
@@ -569,16 +573,14 @@ func TestAPIDB_Check_Batches(t *testing.T) {
},
}
batchResponse = []objectsWithIDs{{}, {}}
case 2:
} else if r.ContentLength > 50 {
expectedPayload = []apiQuery{
{
Version: "2.3.1",
Package: apiPackage{Name: "their-package", Ecosystem: lockfile.NpmEcosystem},
},
}
batchResponse = []objectsWithIDs{{}}
default:
t.Errorf("unexpected number of requests (%d)", requestCount)
}

expectRequestPayload(t, r, expectedPayload)
55 changes: 11 additions & 44 deletions pkg/database/api-fetch-all.go
@@ -2,65 +2,32 @@ package database

import (
"sort"
)

// a struct to hold the result from each request including an index
// which will be used for sorting the results after they come in
type result struct {
index int
res OSV
err error
}
"golang.org/x/sync/errgroup"
)

func (db APIDB) FetchAll(ids []string) Vulnerabilities {
conLimit := 200
var eg errgroup.Group

var osvs Vulnerabilities

if len(ids) == 0 {
return osvs
}
eg.SetLimit(200)

// buffered channel which controls the number of concurrent operations
semaphoreChan := make(chan struct{}, conLimit)
resultsChan := make(chan *result)

defer func() {
close(semaphoreChan)
close(resultsChan)
}()
osvs := make(Vulnerabilities, len(ids))

for i, id := range ids {
go func(i int, id string) {
// read from the buffered semaphore channel, which will block if we're
// already got as many goroutines as our concurrency limit allows
//
// when one of those routines finish they'll read from this channel,
// freeing up a slot to unblock this send
semaphoreChan <- struct{}{}

eg.Go(func() error {
// if we error, still report the vulnerability as hopefully the ID should be
// enough to manually look up the details - in future we should ideally warn
// the user too, but for now we just silently eat the error
osv, _ := db.Fetch(id)
result := &result{i, osv, nil}

resultsChan <- result
osvs[i] = osv
Comment on lines +17 to +23
Copilot AI Nov 27, 2025

Race condition: closure captures loop variable incorrectly

The closure passed to eg.Go() captures the loop variables i and id by reference, not by value. Since multiple goroutines are created in rapid succession, they will likely all reference the final values of i and id from the loop, causing incorrect behavior and potential race conditions.

This will result in multiple goroutines writing to the same index in the osvs slice, and potentially fetching the same ID multiple times while missing others.

Solution: Capture the loop variables by passing them as parameters to an immediately invoked function, or assign them to local variables before the closure:

for i, id := range ids {
    i, id := i, id  // capture loop variables
    eg.Go(func() error {
        osv, _ := db.Fetch(id)
        osvs[i] = osv
        return nil
    })
}

Owner Author

I'm pretty sure this is incorrect since Go 1.22


// read from the buffered semaphore to free up a slot to allow
// another goroutine to start, since this one is wrapping up
<-semaphoreChan
}(i, id)
return nil
})
}

for {
result := <-resultsChan
osvs = append(osvs, result.res)

if len(osvs) == len(ids) {
break
}
}
// errors are handled within the go routines
_ = eg.Wait()

sort.Slice(osvs, func(i, j int) bool {
return osvs[i].ID < osvs[j].ID