fix(db): chunk unresolved reference queries during sync #558

Open
msnandhis wants to merge 1 commit into colbymchenry:main from msnandhis:fix/chunk-unresolved-ref-file-filter

Conversation

@msnandhis
Contributor

Summary

  • chunk unresolved-reference lookups by changed file path
  • chunk resolved-reference cleanup by source node id
  • add regression coverage for large unresolved-reference batches
  • document the fix in the unreleased changelog

Problem

Issue #540 reports codegraph sync failing on a very large checkout with a "too many SQL variables" error near the end of parsing.

getUnresolvedReferencesByFiles() built one IN (...) query containing every changed file path. Large sync batches can exceed SQLite's bind parameter limit, even though nearby node lookup code already chunks at 500 parameters for portability across SQLite backends.

deleteResolvedReferences() had the same large IN (...) shape for source node IDs, so this patch applies the same batching rule there too.
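To make the failure mode concrete, here is a minimal sketch of the unchunked query shape described above. The helper `buildInQuery`, the table name `unresolved_refs`, and the column `file_path` are hypothetical stand-ins, not the project's actual code or schema. The point is only that the number of bind parameters grows one-for-one with the number of changed files.

```typescript
// Hypothetical helper that mirrors the single-query shape described above.
// Table and column names are assumptions for illustration only.
function buildInQuery(
  column: string,
  values: readonly string[],
): { sql: string; params: string[] } {
  // One "?" placeholder per input value.
  const placeholders = values.map(() => '?').join(', ');
  return {
    sql: `SELECT * FROM unresolved_refs WHERE ${column} IN (${placeholders})`,
    params: Array.from(values),
  };
}

// A large sync batch: 40,000 changed file paths (made-up data).
const changedFiles = Array.from({ length: 40_000 }, (_, i) => `src/file-${i}.ts`);
const unchunked = buildInQuery('file_path', changedFiles);

// One bind parameter per changed file, all in a single statement.
console.log(unchunked.params.length); // 40000
```

For reference, SQLite's default `SQLITE_MAX_VARIABLE_NUMBER` is 999 in versions before 3.32.0 and 32766 from 3.32.0 onward, so a statement of this size would exceed either default.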

Fix

This updates both query paths, getUnresolvedReferencesByFiles() and deleteResolvedReferences(), to deduplicate their inputs and process them in chunks of SQLITE_PARAM_CHUNK_SIZE.

That keeps each prepared statement within the existing 500-parameter safety limit while preserving the old behavior for empty inputs and for duplicate file paths or node IDs.
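The deduplicate-then-chunk approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this patch: `chunkUnique`, `queryByFilesChunked`, the `RunQuery` callback type, and the table and column names are all hypothetical, and `SQLITE_PARAM_CHUNK_SIZE` is assumed to be 500 to match the limit mentioned above.

```typescript
// Assumed value, matching the 500-parameter limit mentioned above.
const SQLITE_PARAM_CHUNK_SIZE = 500;

// Deduplicate (keeping first-seen order), then split into fixed-size chunks.
function chunkUnique<T>(
  values: readonly T[],
  size: number = SQLITE_PARAM_CHUNK_SIZE,
): T[][] {
  if (size <= 0) throw new RangeError('size must be positive');
  const unique = Array.from(new Set(values));
  const chunks: T[][] = [];
  for (let i = 0; i < unique.length; i += size) {
    chunks.push(unique.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical query runner standing in for a prepared-statement call.
type RunQuery<Row> = (sql: string, params: readonly string[]) => Row[];

// Runs one IN (...) query per chunk and concatenates the results.
// Table and column names are assumptions for illustration only.
function queryByFilesChunked<Row>(
  filePaths: readonly string[],
  run: RunQuery<Row>,
): Row[] {
  const rows: Row[] = [];
  for (const chunk of chunkUnique(filePaths)) {
    const placeholders = chunk.map(() => '?').join(', ');
    const sql = `SELECT * FROM unresolved_refs WHERE file_path IN (${placeholders})`;
    for (const row of run(sql, chunk)) {
      rows.push(row);
    }
  }
  return rows;
}

// Usage: 1,200 distinct paths plus one duplicate yield three statements.
const paths = Array.from({ length: 1200 }, (_, i) => `src/file-${i}.ts`);
paths.push(paths[0]);
console.log(chunkUnique(paths).map((c) => c.length)); // [ 500, 500, 200 ]
```

An empty input produces no chunks, so the runner is never called and the result is an empty array. Deduplicating before chunking also means a duplicated path cannot appear in two different chunks and return the same rows twice.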

Closes #540.

Validation

  • npx vitest run __tests__/db-perf.test.ts
  • npx -y -p node@22 -p npm@10 npm run build
  • npx -y -p node@22 -p npm@10 npm test
  • git diff --check

Development

Successfully merging this pull request may close these issues.

[Bug] codegraph sync fails with "too many SQL variables" during code parsing