SmartExtractor should use bulkStore() to reduce lock acquisitions #666

@jlin53882

Problem

SmartExtractor.extractAndPersist() calls store.store() individually for each candidate, resulting in N lock acquisitions for N candidates.

Example with 5 candidates:
```typescript
for (const candidate of candidates) {
  store.store(candidate); // acquires lock → releases lock (repeated 5 times)
}
```

This causes unnecessary lock contention when auto-capture produces multiple memories.
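The difference can be made concrete with a small counting mock. This is only a sketch: `CountingStore`, `withLock`, and the `Entry` shape are hypothetical stand-ins, not the project's real store, and the mock is synchronous for brevity even though the real store is presumably asynchronous.

```typescript
// Hypothetical in-memory stand-in for the real store; only lock counting matters here.
type Entry = { text: string };

class CountingStore {
  lockAcquisitions = 0;
  readonly rows: Entry[] = [];

  // Simulates one acquire/release cycle around a critical section.
  private withLock<T>(fn: () => T): T {
    this.lockAcquisitions += 1;
    return fn();
  }

  // One lock per entry.
  store(entry: Entry): void {
    this.withLock(() => {
      this.rows.push(entry);
    });
  }

  // One lock for the whole batch.
  bulkStore(entries: Entry[]): void {
    this.withLock(() => {
      this.rows.push(...entries);
    });
  }
}

const candidates: Entry[] = ["a", "b", "c", "d", "e"].map((text) => ({ text }));

const perEntry = new CountingStore();
for (const candidate of candidates) {
  perEntry.store(candidate);
}

const batched = new CountingStore();
batched.bulkStore(candidates);

console.log(perEntry.lockAcquisitions, batched.lockAcquisitions); // prints "5 1"
```

Both stores end up with the same five rows; only the number of lock acquisitions differs.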

Root Cause

SmartExtractor was not updated to use the bulkStore() method added in Issue #665.

Solution (Implemented)

Changes to src/smart-extractor.ts

  1. Added buildStoreEntry() helper - constructs entry object without writing
  2. Added StoreEntry type alias - cleaner type: `Omit<MemoryEntry, 'id' | 'timestamp'>`
  3. Added createEntries[] collection in extractAndPersist()
  4. Modified processCandidate() - accepts createEntries parameter, pushes instead of stores
  5. Updated handler methods to batch CREATE decisions:
    • handleProfileMerge()
    • handleSupersede()
    • handleContextualize()
    • handleContradict()
  6. Final bulkStore() call at end of extractAndPersist()
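The steps above could be sketched roughly as follows. Only the names `buildStoreEntry`, `StoreEntry`, `createEntries`, `processCandidate`, `extractAndPersist`, and `bulkStore` come from this issue; the `MemoryEntry` fields, the `Candidate` shape, the `BulkCapableStore` interface, and all signatures are assumptions made for illustration, and the real code in src/smart-extractor.ts is presumably asynchronous and handles more decision types.

```typescript
// Assumed shape; the real MemoryEntry likely has more fields.
interface MemoryEntry {
  id: string;
  timestamp: number;
  text: string;
  scope: string;
}

// Type alias quoted in this issue.
type StoreEntry = Omit<MemoryEntry, "id" | "timestamp">;

// Hypothetical candidate shape.
interface Candidate {
  text: string;
  scope: string;
}

// Minimal store surface assumed for this sketch.
interface BulkCapableStore {
  bulkStore(entries: StoreEntry[]): MemoryEntry[];
}

// Constructs the entry object without writing it.
function buildStoreEntry(candidate: Candidate): StoreEntry {
  return { text: candidate.text, scope: candidate.scope };
}

// Pushes onto createEntries instead of calling store.store().
function processCandidate(candidate: Candidate, createEntries: StoreEntry[]): void {
  createEntries.push(buildStoreEntry(candidate));
}

function extractAndPersist(candidates: Candidate[], store: BulkCapableStore): MemoryEntry[] {
  const createEntries: StoreEntry[] = [];
  for (const candidate of candidates) {
    processCandidate(candidate, createEntries);
  }
  // Single write for everything collected above.
  return store.bulkStore(createEntries);
}

// Usage with a fake store that counts bulkStore() calls.
let bulkCalls = 0;
const fakeStore: BulkCapableStore = {
  bulkStore(entries) {
    bulkCalls += 1;
    return entries.map((entry, i) => ({ ...entry, id: `id-${i}`, timestamp: Date.now() }));
  },
};

const saved = extractAndPersist(
  [
    { text: "likes tea", scope: "user" },
    { text: "works remotely", scope: "user" },
  ],
  fakeStore,
);
console.log(bulkCalls, saved.length); // prints "1 2"
```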

Key Design Decision

  • Solution A selected: Batch CREATE/CONTEXTUALIZE/MERGE-fallback decisions
  • SUPERSEDE cannot be batched - needs new entry ID to update old entry metadata
  • Each SUPERSEDE = 2 locks (store new + update old), but this is by design
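A minimal sketch of why SUPERSEDE stays at two locks, assuming the new entry's ID is only assigned when the entry is written. `TinyStore`, `supersede`, and the `supersededBy` metadata key are hypothetical names used for illustration, not the project's API.

```typescript
interface Row {
  id: string;
  text: string;
  metadata: Record<string, string>;
}

// Hypothetical synchronous store; each public method counts as one lock acquisition.
class TinyStore {
  locks = 0;
  private nextId = 1;
  readonly rows = new Map<string, Row>();

  store(text: string): Row {
    this.locks += 1;
    const row: Row = { id: `m${this.nextId++}`, text, metadata: {} };
    this.rows.set(row.id, row);
    return row;
  }

  update(id: string, metadata: Record<string, string>): void {
    this.locks += 1;
    const row = this.rows.get(id);
    if (!row) throw new Error(`unknown id ${id}`);
    row.metadata = { ...row.metadata, ...metadata };
  }
}

function supersede(store: TinyStore, oldId: string, newText: string): Row {
  const created = store.store(newText); // lock 1: the new ID is assigned here
  store.update(oldId, { supersededBy: created.id }); // lock 2: needs that ID
  return created;
}

const tiny = new TinyStore();
const oldRow = tiny.store("lives in Paris");
tiny.locks = 0; // count only the SUPERSEDE itself

const newRow = supersede(tiny, oldRow.id, "lives in Berlin");
console.log(tiny.locks); // prints "2"
console.log(tiny.rows.get(oldRow.id)?.metadata["supersededBy"] === newRow.id); // prints "true"
```

Because the update depends on the result of the store, the two writes cannot be folded into one deferred batch without changing how IDs are generated.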

Test Results

✅ Unit Tests: 34 tests, 0 failures

smart-extractor-bulk-store.test.mjs (9 tests)

| Test | Status | Duration |
| --- | --- | --- |
| CURRENT: store() causes N lock acquisitions | ✅ PASS | 122.24ms |
| FIXED: bulkStore() uses 1 lock for N entries | ✅ PASS | 29.55ms |
| should achieve 80% lock reduction with bulkStore (5 entries) | ✅ PASS | 169.02ms |
| should achieve 90% lock reduction with bulkStore (10 entries) | ✅ PASS | 341.48ms |
| should handle empty entries array | ✅ PASS | 31.78ms |
| should handle single entry batch | ✅ PASS | 30.41ms |
| should preserve entry order in results | ✅ PASS | 15.18ms |
| should handle entries with different scopes | ✅ PASS | 15.94ms |
| should batch all LLM candidates into single lock | ✅ PASS | 155.70ms |

smart-extractor-bulk-store-edge-cases.test.mjs (17 tests)

| Test | Status |
| --- | --- |
| should have bulkStore method on store interface | ✅ PASS |
| bulkStore should accept array and return array | ✅ PASS |
| bulkStore should use single lock for multiple entries | ✅ PASS |
| CREATE decision: 1 store call = 1 lock | ✅ PASS |
| SUPERSEDE decision: 1 store + 1 update = 2 locks | ✅ PASS |
| CONTRADICT decision: 1 update + 1 store = 2 locks | ✅ PASS |
| MERGE decision: requires read then write | ✅ PASS |
| should handle very large batch (100 entries) | ✅ PASS |
| should handle entries with special characters | ✅ PASS |
| should preserve metadata from entries | ✅ PASS |
| should handle mixed scope entries in single batch | ✅ PASS |
| should generate unique IDs for each entry | ✅ PASS |
| should add timestamp to entries | ✅ PASS |
| Scenario: 3 candidates with different decisions | ✅ PASS |
| Scenario: What if all were batched with bulkStore? | ✅ PASS |
| Maximum lock reduction: N CREATE decisions | ✅ PASS |
| Minimum lock reduction: N SUPERSEDE decisions | ✅ PASS |

bulk-store.test.mjs + bulk-store-edge-cases.test.mjs (8 tests)

| Test | Status | Duration |
| --- | --- | --- |
| should store multiple entries with single lock | ✅ PASS | 182.93ms |
| should handle concurrent bulkStore calls | ✅ PASS | 934.76ms |
| should handle empty array | ✅ PASS | 15.73ms |
| should handle single entry | ✅ PASS | 19.39ms |

Lock Reduction Performance

| Scenario | Before | After | Reduction |
| --- | --- | --- | --- |
| 5 CREATE entries | 5 locks | 1 lock | 80% |
| 10 CREATE entries | 10 locks | 1 lock | 90% |
| 4 mixed candidates | 4+ locks | 1 lock | 75%+ |
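The percentages follow from (before − after) / before. A small helper reproduces them; `lockReduction` is a hypothetical name used only for this illustration.

```typescript
// Percentage of lock acquisitions saved, rounded to the nearest whole percent.
function lockReduction(locksBefore: number, locksAfter: number): number {
  if (locksBefore <= 0) {
    throw new RangeError("locksBefore must be positive");
  }
  return Math.round(((locksBefore - locksAfter) / locksBefore) * 100);
}

console.log(lockReduction(5, 1)); // prints "80"
console.log(lockReduction(10, 1)); // prints "90"
console.log(lockReduction(4, 1)); // prints "75"
```

For N entries batched into a single lock this is 1 − 1/N, so the benefit grows with batch size and is zero for a single entry.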

Files Modified

Branch

https://github.com/jlin53882/memory-lancedb-pro/tree/fix/auto-capture-batch-write

Related Issues

Claude Adversarial Review

All 3 critical bugs found during review were fixed:

  1. ✅ Bug 1: handleProfileMerge call parameter order fixed
  2. ✅ Bug 2: handleMerge missing createEntries parameter added
  3. ✅ Bug 3: scopeFilter type made optional in handleSupersede
