-
Notifications
You must be signed in to change notification settings - Fork 701
fix: continue PR #430 - hook dedup + resource singleton + backup timer guard #612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| ## 問題背景 | ||
|
|
||
| **PR #430**(已關閉)嘗試解決 Hook handler 累積問題,但 scope 太大(+207/-372 行)被關閉。現有 WeakSet guard(`index.ts:1855`)只能阻擋相同 API 實例,無法防止新 API 實例重複註冊。 | ||
|
|
||
| **Issue #610** 是基於 PR #430 設計稿的接續追蹤 issue。 | ||
|
|
||
| --- | ||
|
|
||
| ## 本 PR 處理的內容(基於 PR #603) | ||
|
|
||
| 本 PR 從官方 `CortexReach/memory-lancedb-pro` 的 PR #603 cherry-pick 而來,實作了以下 5 個 memory leak 修復: | ||
|
|
||
| ### 1. Store.ts:Promise Chain 無限增長(CRITICAL) | ||
| - **問題**:`updateQueue` promise chain 無限增長,寫入速度快於完成時 heap 飆升 | ||
| - **修復**:廢除 promise chain,改用 `_updating` boolean flag + FIFO `_waitQueue` | ||
| - **效果**:Tail-reset semaphore,記憶體恆定 | ||
|
|
||
| ### 2. AccessTracker:Failed ID 累積(HIGH) | ||
| - **問題**:寫入失敗的 ID 每次 flush 都累積 delta,map 無限增長 | ||
| - **修復**:分離 `_retryCount` map,設 `_maxRetries=5` 上限,超過後 drop | ||
| - **效果**:失敗 ID 不會無限重試 | ||
|
|
||
| ### 3. Embedder.ts:TTL 只在 access 清理(MEDIUM) | ||
| - **問題**:過期 entry 只在 access 時時刪除,閒置 entry 佔用記憶體 | ||
| - **修復**:每次 `set()` 時若 near capacity 就呼叫 `_evictExpired()` 清理過期 entry | ||
| - **效果**:快取容量有上限 | ||
|
|
||
| ### 4. RetrievalStats.ts:O(n) shift(MEDIUM) | ||
| - **問題**:`Array.shift()` 是 O(n),1000 筆資料時每次搬遷造成 GC 壓力 | ||
| - **修復**:改用 Ring Buffer,O(1) 寫入 | ||
| - **效果**:無 GC 壓力 | ||
|
|
||
| ### 5. NoisePrototypeBank:DEDUP_THRESHOLD 0.95→0.90(MEDIUM) | ||
| - **問題**:0.95 threshold 太寬鬆,near-duplicate noise 持續累積 | ||
| - **修復**:降低至 0.90,更接近實際 `isNoise()` threshold 0.82 | ||
| - **效果**:noise bank 不會被 near-duplicate 填滿 | ||
|
|
||
| --- | ||
|
|
||
| ## 驗證 | ||
|
|
||
| - `test/issue598_smoke.mjs` 煙霧測試已加入 | ||
| - 原始 PR #603 的所有 commit 已 cherry-pick:`cd695ba` → `30c6dc9` → `810adf9` | ||
|
|
||
| --- | ||
|
|
||
| ## 相關連結 | ||
|
|
||
| - Issue #598(原始):https://github.com/CortexReach/memory-lancedb-pro/issues/598 | ||
| - Issue #610(新設計追蹤):https://github.com/CortexReach/memory-lancedb-pro/issues/610 | ||
| - PR #430(已關閉):https://github.com/CortexReach/memory-lancedb-pro/pull/430 | ||
| - PR #603(官方):https://github.com/CortexReach/memory-lancedb-pro/pull/603 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -213,6 +213,9 @@ export function computeEffectiveHalfLife( | |
| */ | ||
| export class AccessTracker { | ||
| private readonly pending: Map<string, number> = new Map(); | ||
| // Tracks retry count per ID so that delta is never amplified across failures. | ||
| private readonly _retryCount = new Map<string, number>(); | ||
| private readonly _maxRetries = 5; | ||
| private debounceTimer: ReturnType<typeof setTimeout> | null = null; | ||
| private flushPromise: Promise<void> | null = null; | ||
| private readonly debounceMs: number; | ||
|
|
@@ -291,10 +294,22 @@ export class AccessTracker { | |
| this.clearTimer(); | ||
| if (this.pending.size > 0) { | ||
| this.logger.warn( | ||
| `access-tracker: destroying with ${this.pending.size} pending writes`, | ||
| `access-tracker: destroying with ${this.pending.size} pending writes — attempting final flush (3s timeout)`, | ||
| ); | ||
| // Fire-and-forget final flush with a hard 3s timeout. Uses Promise.race | ||
| // to guarantee we always clear pending/_retryCount even if flush hangs. | ||
| const flushWithTimeout = Promise.race([ | ||
| this.doFlush(), | ||
| new Promise<void>((resolve) => setTimeout(resolve, 3_000)), | ||
| ]); | ||
| void flushWithTimeout.finally(() => { | ||
| this.pending.clear(); | ||
| this._retryCount.clear(); | ||
| }); | ||
| } else { | ||
| this.pending.clear(); | ||
| this._retryCount.clear(); | ||
| } | ||
| this.pending.clear(); | ||
| } | ||
|
|
||
| // -------------------------------------------------------------------------- | ||
|
|
@@ -308,18 +323,33 @@ export class AccessTracker { | |
| for (const [id, delta] of batch) { | ||
| try { | ||
| const current = await this.store.getById(id); | ||
| if (!current) continue; | ||
| if (!current) { | ||
| // ID not found — memory was deleted or outside current scope. | ||
| // Do NOT retry or warn; just drop silently and clear any retry counter. | ||
| this._retryCount.delete(id); | ||
| continue; | ||
| } | ||
|
|
||
| const updatedMeta = buildUpdatedMetadata(current.metadata, delta); | ||
| await this.store.update(id, { metadata: updatedMeta }); | ||
| this._retryCount.delete(id); // success — clear retry counter | ||
| } catch (err) { | ||
| // Requeue failed delta for retry on next flush | ||
| const existing = this.pending.get(id) ?? 0; | ||
| this.pending.set(id, existing + delta); | ||
| this.logger.warn( | ||
| `access-tracker: write-back failed for ${id.slice(0, 8)}:`, | ||
| err, | ||
| ); | ||
| const retryCount = (this._retryCount.get(id) ?? 0) + 1; | ||
| if (retryCount > this._maxRetries) { | ||
| // Exceeded max retries — drop and log error. | ||
| this._retryCount.delete(id); | ||
| this.logger.error( | ||
| `access-tracker: dropping ${id.slice(0, 8)} after ${retryCount} failed retries`, | ||
| ); | ||
| } else { | ||
| this._retryCount.set(id, retryCount); | ||
| // Requeue with the original delta only (NOT accumulated) for next flush. | ||
| this.pending.set(id, delta); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
When a flush is in progress, Useful? React with 👍 / 👎. |
||
| this.logger.warn( | ||
| `access-tracker: write-back failed for ${id.slice(0, 8)} (attempt ${retryCount}/${this._maxRetries}):`, | ||
| err, | ||
| ); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,40 @@ | ||
| /** | ||
| * Smoke test for: skip before_prompt_build hooks for subagent sessions | ||
| * Bug: sub-agent sessions cause gateway blocking — hooks without subagent skip | ||
| * run LanceDB I/O sequentially, blocking all other user sessions. | ||
| * | ||
| * Run: node test/issue598_smoke.mjs | ||
| * Expected: all 3 hooks PASS | ||
| */ | ||
|
|
||
| import { readFileSync } from "fs"; | ||
|
|
||
| const FILE = "C:\\Users\\admin\\.openclaw\\extensions\\memory-lancedb-pro\\index.ts"; | ||
| const content = readFileSync(FILE, "utf-8"); | ||
| const lines = content.split("\n"); | ||
|
|
||
| // [hook_opens_line, guard_line, name] | ||
| const checks = [ | ||
| [2223, 2226, "auto-recall before_prompt_build"], | ||
| [3084, 3087, "reflection-injector inheritance"], | ||
| [3113, 3116, "reflection-injector derived"], | ||
| ]; | ||
|
|
||
| let pass = 0, fail = 0; | ||
| for (const [hookLine, guardLine, name] of checks) { | ||
| const hookContent = (lines[hookLine - 1] || "").trim(); | ||
| const guardContent = (lines[guardLine - 1] || "").trim(); | ||
| if (hookContent.includes("before_prompt_build") && guardContent.includes(":subagent:")) { | ||
| console.log(`PASS ${name.padEnd(40)} hook@${hookLine} guard@${guardLine}`); | ||
| pass++; | ||
| } else { | ||
| console.log(`FAIL ${name}`); | ||
| console.log(` hook@${hookLine}: ${hookContent}`); | ||
| console.log(` guard@${guardLine}: ${guardContent}`); | ||
| fail++; | ||
| } | ||
| } | ||
|
|
||
| console.log(`\n${pass}/${pass + fail} checks passed`); | ||
| if (fail > 0) process.exit(1); | ||
| else console.log("ALL PASSED — subagent sessions skipped before async work"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AccessTrackerOptionsonly requireswarn(with optionalinfo), but this branch unconditionally callslogger.error; integrations that provide a warn-only logger (which the current type allows) will throwTypeErrorafter max retries are exceeded, aborting flush handling exactly on persistent failures. Use an optional call/fallback towarnto keep retry exhaustion non-fatal.Useful? React with 👍 / 👎.