Skip to content

fix(store): add realpath:false to prevent ENOENT after stale lock cleanup (Issue #670)#673

Closed
jlin53882 wants to merge 11 commits intoCortexReach:masterfrom
jlin53882:fix/issue670-v2
Closed

fix(store): add realpath:false to prevent ENOENT after stale lock cleanup (Issue #670)#673
jlin53882 wants to merge 11 commits intoCortexReach:masterfrom
jlin53882:fix/issue670-v2

Conversation

@jlin53882
Copy link
Copy Markdown
Contributor

Summary

Fix Issue #670: ENOENT from proper-lockfile.realpath() after proactive stale lock cleanup.

Root Cause

The proactive stale lock cleanup (introduced in PR #626) deletes a stale .memory-write.lock file. When proper-lockfile subsequently tries to acquire the lock, it calls fs.realpath() on the now-deleted file, resulting in ENOENT.

Fix

Add realpath: false option to lockfile.lock() call in src/store.ts:

\\ s
const release = await lockfile.lock(lockPath, {
realpath: false, // Fix #670: skip realpath() to avoid ENOENT
retries: {...},
stale: 10000,
});
\\

Why This Works

  • The lock path is always an absolute resolved path from config (config.dataDir/.memory-write.lock)
  • realpath resolution is unnecessary for absolute paths
  • proper-lockfile will create the lock file itself when acquiring the lock
  • This avoids the race condition between stale lock deletion and realpath() call

Related

…ortexReach#415)

對齊 upstream/master(包含 PR CortexReach#626 proactive cleanup),並保留 James 針對
Issue CortexReach#415 的修復:

- 從 PR CortexReach#626 引入 proactive cleanup(age > 5 分鐘的 stale lock 自動清除)
- 【修復 CortexReach#415】保守 retries 設定:
  - minTimeout: 1000ms(避免高負載下過度密集重試)
  - maxTimeout: 30000ms(支撐更久的 event loop 阻塞)
  - stale: 10000ms
- 【修復 CortexReach#415】onCompromised flag:lock compromised 時不立即崩潰,
  由 finally block 統一處理 fn() 錯誤 vs compromisedErr 的抛出邏輯
- 新增 lock-stress-test.mjs:驗證並發寫入、重試行為、stress test

PR CortexReach#517: CortexReach/memory-lancedb-pro
Issue CortexReach#415: ECOMPROMISED crash under event-loop pressure
Tests verify:
1. Current implementation: lock acquired once per entry (not per batch)
2. Two-phase approach: lock acquired once per batch (N -> 1)
3. Concurrent writes: read-modify-write has data consistency boundaries
4. Plugin vs Upgrader: updates different fields, no direct overwriting

Ref: Issue CortexReach#632
Two-Phase Processing:
- Phase 1: LLM enrichment (no lock)
- Phase 2: Single lock per batch for DB writes

Reduces lock contention from N locks (one per entry) to 1 lock per batch.

Tests:
- Lock count verification: 10 entries = 1 lock (was 10)
- LLM failure graceful degradation
- Batch boundary handling
- 100 entries stress test
- Added REFACTORING NOTE at class level explaining why upgradeEntry was split
- Documented the key difference: OLD = lock per entry, NEW = 1 lock per batch
- Added comments in prepareEntry explaining it contains the SAME logic as old upgradeEntry
- Added comments in writeEnrichedBatch explaining the single lock approach
- Added detailed comments in upgrade() showing before/after example
- Added inline comments in the batch loop explaining why Phase 1 doesn't hold lock
The old test was designed to verify the BUGGY behavior (1 lock per entry).
This update changes it to verify the FIXED behavior (1 lock per batch).

This aligns with Issue CortexReach#632 fix - the test now confirms:
- 3 entries = 1 lock (was 3 locks before)
…ortexReach#640)

- Changed abstract from imperative to descriptive format
- Old: 'LanceDB BigInt error -> Use Number() coercion before arithmetic'
- New: 'LanceDB BigInt numeric handling issue'
- Added unit test to verify prompt format detection
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@jlin53882
Copy link
Copy Markdown
Contributor Author

透過 #674 取代 #673 PR 關閉

@jlin53882 jlin53882 closed this Apr 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] ENOENT from proper-lockfile realpath() after proactive stale lock cleanup

1 participant