Fix storage growth from oversized conversations and recording artifacts#1061
Fix storage growth from oversized conversations and recording artifacts#1061
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
🤖 Augment PR SummarySummary: This PR adds disk-growth guardrails for desktop conversation and recording storage by truncating large payloads, enforcing retention limits, and proactively cleaning up stale artifacts. Changes:
Technical Notes: Retention/cleanup is driven by new config keys (history max items/days, file days, orphan grace period) and runs during history reconciliation to keep storage bounded over time. 🤖 Was this summary useful? React with 👍 or 👎 |
apps/desktop/src/main/llm.ts
Outdated
| // Avoid persisting duplicate payloads for tool messages. | ||
| // Tool message content already contains the rendered result text. | ||
| // Persisting full toolResults again can double disk usage for large outputs. | ||
| const shouldPersistToolResults = role !== "tool" |
There was a problem hiding this comment.
shouldPersistToolResults = role !== "tool" means tool result messages will be persisted without toolResults, which may break downstream consumers that expect structured toolResults for tool messages (e.g., pairing tool calls/results when reconstructing prior history in tipc.ts / remote-server.ts). If the stored tool message content is later truncated, this also becomes the only persisted copy of tool output, potentially losing more information than intended.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| continue | ||
| } | ||
| try { | ||
| fs.unlinkSync(conversationPath) |
There was a problem hiding this comment.
applyConversationRetention() deletes conversation files as part of index normalization/updates; since deletion happens before the updated index is guaranteed persisted, an index write failure could leave the index referencing conversations that were already deleted (data loss/inconsistency). This side effect also occurs on history reads via normalizeIndex(), which may be surprising if callers expect getConversationHistory() to be read-only.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| return fallback | ||
| } | ||
| const rounded = Math.floor(value) | ||
| return Math.min(Math.max(rounded, 1_000), MAX_ALLOWED_TOOL_OUTPUT_CHARS) |
There was a problem hiding this comment.
clampMaxOutputChars() enforces a minimum of 1,000 chars, but the tool schema description only documents the default and max cap; callers trying to aggressively limit output (e.g., for storage control) won’t be able to set values below 1,000 as implied. Consider documenting the minimum behavior to avoid an API-contract mismatch.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
- Always persist toolResults for all message roles (including tool) so downstream consumers (tipc.ts, remote-server.ts) can reconstruct structured tool call/result pairs. Storage truncation in conversation-service handles size limits. - Move conversation file deletion in applyConversationRetention to after successful index write, preventing index/file inconsistency on write failure. Extracted deleteRetainedOverflowFiles() helper and updated both updateConversationIndex() and getConversationHistory() to follow write-then-delete ordering. - Document minimum 1000-char bound for maxOutputChars in builtin tool schema description to avoid API-contract mismatch for callers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
augment review |
| const historyItem = historyById.get(recordingId) | ||
|
|
||
| // Remove orphan files (no matching history entry). | ||
| if (!historyItem) { |
There was a problem hiding this comment.
cleanupRecordingFiles() deletes any recording file that has no matching history.json entry, but createMcpRecording writes ${recordingId}.webm before it appends the history item (and calls getRecordingHistory() before history.push). This can delete a just-created recording as an “orphan” and leave history entries pointing at missing audio files.
Severity: high
Other Locations
apps/desktop/src/main/tipc.ts:1988apps/desktop/src/main/tipc.ts:1752
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| tokens = estimateTokensFromMessages(messages) | ||
| if (tokens <= targetTokens) { | ||
| if (isDebugLLM()) logLLM("ContextBudget: after aggressive_truncate", { estTokens: tokens }) | ||
| return { messages, appliedStrategies: applied, estTokensBefore: tokens, estTokensAfter: tokens, maxTokens } |
There was a problem hiding this comment.
In the aggressive_truncate early-return, estTokensBefore and estTokensAfter are both set to the post-truncation tokens, so any reporting can’t reflect how much was actually reduced.
Severity: low
Other Locations
apps/desktop/src/main/context-budget.ts:703apps/desktop/src/main/context-budget.ts:771apps/desktop/src/main/context-budget.ts:791
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
…reporting - tipc.ts: add 5-minute grace period before deleting orphan recording files in cleanupRecordingFiles() to prevent race with async history persistence (recording file is written before history entry is saved) - context-budget.ts: fix estTokensBefore in aggressive_truncate early-return to reflect pre-truncation token count for accurate reduction reporting Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
augment review |
|
|
||
| const omitted = originalLength - maxChars | ||
| return { | ||
| text: `${input.slice(0, maxChars)}\n\n[truncated ${omitted} characters]`, |
There was a problem hiding this comment.
| } | ||
|
|
||
| // Heuristic: aggressively truncate oversized tool payloads first. | ||
| const looksLikeToolPayload = |
There was a problem hiding this comment.
looksLikeToolPayload can match large user messages that happen to contain JSON keys like "id": / "url":, causing aggressive truncation of the user’s prompt even though this tier is described as “tool responses”. Consider tightening the predicate to reduce accidental truncation of non-tool content.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| } | ||
| }) | ||
|
|
||
| const { kept: retained, removed } = this.applyConversationRetention(normalized) |
There was a problem hiding this comment.
In normalizeIndex(), retention (applyConversationRetention) runs before any sort; if index.json is out of order, this can drop (and then delete) non-oldest conversations. Consider sorting by updatedAt desc before slicing so retention deterministically keeps the most recent entries.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
apps/desktop/src/main/tipc.ts
Outdated
| // to disk before the history entry is saved (async processing window), so | ||
| // a recently-created file may not yet appear in history. | ||
| if (!historyItem) { | ||
| const ORPHAN_GRACE_PERIOD_MS = 5 * 60 * 1000 // 5 minutes |
There was a problem hiding this comment.
The orphan grace period is hard-coded to 5 minutes; if transcription/agent processing takes longer and getRecordingHistory() runs in the meantime, a valid in-flight recording file could still be deleted as an orphan. Consider making the grace window configurable or keyed off active recording/session state.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
…ention ordering - builtin-tools: fix truncateToolText so content+footer never exceeds maxChars - context-budget: capture initialTokens before any tier so estTokensBefore reflects the pre-shrink count in all return paths - context-budget: restrict looksLikeToolPayload JSON-key heuristics to non-user messages to avoid aggressively truncating user prompts - conversation-service: sort index by updatedAt desc before applying retention in normalizeIndex so the most recent conversations are always kept - tipc: introduce inFlightRecordingIds set so cleanupRecordingFiles never treats an actively-processed recording as an orphan; retain the 5-minute time-based grace period as a fallback for post-crash recovery Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
augment review |
| } | ||
|
|
||
| const omitted = text.length - maxChars | ||
| return `${text.slice(0, maxChars)}\n\n[truncated ${omitted} characters]` |
There was a problem hiding this comment.
truncateText() appends a footer after slicing to maxChars, so the stored string can exceed the intended cap. If these caps are meant to be hard limits to control on-disk growth, this can still allow payloads larger than configured.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
|
|
||
| // Add to beginning of array (most recent first) | ||
| index.unshift(indexItem) | ||
| const { kept, removed } = this.applyConversationRetention(index) |
There was a problem hiding this comment.
Retention in updateConversationIndex() is applied to the current index ordering without sorting by updatedAt. If index.json is ever out of order (e.g., older versions or corruption), this can delete non-oldest conversation files as “overflow”.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| const entry = item as Partial<RecordingHistoryItem> | ||
| return ( | ||
| typeof entry.id === "string" && | ||
| typeof entry.createdAt === "number" && |
There was a problem hiding this comment.
isValidRecordingHistoryItem() allows createdAt/duration values like NaN/Infinity because it only checks typeof === "number". Those values can break sorting and retention comparisons in reconcileRecordingHistory() / cleanupRecordingFiles().
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
… validation - conversation-service: fix truncateText() footer overshoot (content+footer now capped at maxChars) - conversation-service: sort index by updatedAt desc before applying retention in updateConversationIndex - tipc: add Number.isFinite() checks for createdAt/duration in isValidRecordingHistoryItem - tipc: protect queued-path recording writes with inFlightRecordingIds and persist history entry - tipc: make orphan grace period configurable via recordingOrphanGracePeriodMinutes setting (default 5 min) - types/config: add recordingOrphanGracePeriodMinutes config field with default Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
augment review |
Combines async index I/O + debounced writes from main with storage sanitization/truncation/retention logic from this branch. Key merge decisions: - saveConversation now sanitizes via sanitizeConversationForStorage before the async fsPromises.writeFile (both features preserved) - updateConversationIndex applies truncateIndexText, sort, and retention inside the enqueueIndexMutation queue before updating in-memory cache - scheduleDiskWrite accepts optional removedItems to delete overflow files after the write completes (preserving write-before-delete ordering) - deleteRetainedOverflowFiles converted to async using fsPromises.unlink - getConversationHistory uses ensureIndexLoaded (in-memory cache) instead of reading from disk on every call Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| if (text.length <= MAX_INDEX_LAST_MESSAGE_CHARS) { | ||
| return text | ||
| } | ||
| return `${text.slice(0, MAX_INDEX_LAST_MESSAGE_CHARS)}...` |
There was a problem hiding this comment.
truncateIndexText() slices to MAX_INDEX_LAST_MESSAGE_CHARS but then appends ..., so the returned string can exceed the intended cap. If this value is meant to be a hard storage bound, callers may still end up persisting >500 chars here.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| fs.readFileSync(getRecordingsHistoryPath(), "utf8"), | ||
| ) as unknown | ||
| const history = Array.isArray(raw) ? raw.filter(isValidRecordingHistoryItem) : [] | ||
| return reconcileRecordingHistory(history, true) |
There was a problem hiding this comment.
getRecordingHistory() calls reconcileRecordingHistory(history, true), so reads will synchronously write history.json and run cleanup. If the write fails (e.g., permissions/disk full), the surrounding try/catch returns [], which can mask otherwise-readable history from callers.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Summary
lastMessageentries and enforcemaxConversationsToKeepretention with file pruningexecute_commandoutput sizetoolmessage content +toolResults).webmand.wavartifacts for recording item removal and cleanup failed MCP recording writesValidation
pnpm --filter @speakmcp/desktop typecheckpnpm --filter @speakmcp/desktop typecheck:nodepnpm --filter @speakmcp/desktop test:runNotes