Fix storage growth from oversized conversations and recording artifacts by aj47 · Pull Request #1061 · aj47/SpeakMCP

aj47 · 2026-02-18T19:10:10Z

Summary

cap and sanitize persisted conversation payloads to prevent runaway file growth from large tool outputs
truncate index lastMessage entries and enforce maxConversationsToKeep retention with file pruning
process built-in tool outputs through large-response filtering/summarization and cap execute_command output size
avoid persisting duplicate tool payloads (tool message content + toolResults)
add recording cleanup settings/defaults and implement history/file retention + orphan cleanup
delete both .webm and .wav artifacts for recording item removal and cleanup failed MCP recording writes

Validation

pnpm --filter @speakmcp/desktop typecheck
pnpm --filter @speakmcp/desktop typecheck:node
pnpm --filter @speakmcp/desktop test:run

Notes

intentionally did not remove local models or skills content
local cleanup performed (outside repo): removed known oversized conversation file, pruned orphan recording files, and cleared Electron Cache/Code Cache

chatgpt-codex-connector · 2026-02-18T19:10:16Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

augmentcode · 2026-02-18T19:18:35Z

🤖 Augment PR Summary

Summary: This PR adds disk-growth guardrails for desktop conversation and recording storage by truncating large payloads, enforcing retention limits, and proactively cleaning up stale artifacts.

Changes:

Adds maxOutputChars to execute_command and truncates stdout/stderr with truncation metadata.
Sanitizes persisted conversation messages/tool results with hard caps to prevent oversized JSON files.
Truncates conversation index lastMessage and enforces maxConversationsToKeep retention with overflow file pruning.
Compacts large-on-disk conversations more aggressively (byte threshold + oversized payload detection).
Normalizes/filters MCP + built-in tool results through existing large-response filtering/summarization.
Adds recording cleanup defaults and implements history retention, file retention, and orphan cleanup (with an in-flight safeguard).

Technical Notes: Retention/cleanup is driven by new config keys (history max items/days, file days, orphan grace period) and runs during history reconciliation to keep storage bounded over time.

_{🤖 Was this summary useful? React with 👍 or 👎}

augmentcode

Review completed. 3 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-02-18T19:18:35Z

apps/desktop/src/main/llm.ts

+      // Avoid persisting duplicate payloads for tool messages.
+      // Tool message content already contains the rendered result text.
+      // Persisting full toolResults again can double disk usage for large outputs.
+      const shouldPersistToolResults = role !== "tool"


shouldPersistToolResults = role !== "tool" means tool result messages will be persisted without toolResults, which may break downstream consumers that expect structured toolResults for tool messages (e.g., pairing tool calls/results when reconstructing prior history in tipc.ts / remote-server.ts). If the stored tool message content is later truncated, this also becomes the only persisted copy of tool output, potentially losing more information than intended.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T19:18:36Z

apps/desktop/src/main/conversation-service.ts

+        continue
+      }
+      try {
+        fs.unlinkSync(conversationPath)


applyConversationRetention() deletes conversation files as part of index normalization/updates; since deletion happens before the updated index is guaranteed persisted, an index write failure could leave the index referencing conversations that were already deleted (data loss/inconsistency). This side effect also occurs on history reads via normalizeIndex(), which may be surprising if callers expect getConversationHistory() to be read-only.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T19:18:36Z

apps/desktop/src/main/builtin-tools.ts

+    return fallback
+  }
+  const rounded = Math.floor(value)
+  return Math.min(Math.max(rounded, 1_000), MAX_ALLOWED_TOOL_OUTPUT_CHARS)


clampMaxOutputChars() enforces a minimum of 1,000 chars, but the tool schema description only documents the default and max cap; callers trying to aggressively limit output (e.g., for storage control) won’t be able to set values below 1,000 as implied. Consider documenting the minimum behavior to avoid an API-contract mismatch.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

- Always persist toolResults for all message roles (including tool) so downstream consumers (tipc.ts, remote-server.ts) can reconstruct structured tool call/result pairs. Storage truncation in conversation-service handles size limits. - Move conversation file deletion in applyConversationRetention to after successful index write, preventing index/file inconsistency on write failure. Extracted deleteRetainedOverflowFiles() helper and updated both updateConversationIndex() and getConversationHistory() to follow write-then-delete ordering. - Document minimum 1000-char bound for maxOutputChars in builtin tool schema description to avoid API-contract mismatch for callers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

aj47 · 2026-02-18T21:05:46Z

augment review

augmentcode

Review completed. 2 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-02-18T21:10:57Z

apps/desktop/src/main/tipc.ts

+      const historyItem = historyById.get(recordingId)
+
+      // Remove orphan files (no matching history entry).
+      if (!historyItem) {


cleanupRecordingFiles() deletes any recording file that has no matching history.json entry, but createMcpRecording writes ${recordingId}.webm before it appends the history item (and calls getRecordingHistory() before history.push). This can delete a just-created recording as an “orphan” and leave history entries pointing at missing audio files.

Severity: high

Other Locations

apps/desktop/src/main/tipc.ts:1988

apps/desktop/src/main/tipc.ts:1752

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T21:10:57Z

apps/desktop/src/main/context-budget.ts

+      tokens = estimateTokensFromMessages(messages)
+      if (tokens <= targetTokens) {
+        if (isDebugLLM()) logLLM("ContextBudget: after aggressive_truncate", { estTokens: tokens })
+        return { messages, appliedStrategies: applied, estTokensBefore: tokens, estTokensAfter: tokens, maxTokens }


In the aggressive_truncate early-return, estTokensBefore and estTokensAfter are both set to the post-truncation tokens, so any reporting can’t reflect how much was actually reduced.

Severity: low

Other Locations

apps/desktop/src/main/context-budget.ts:703

apps/desktop/src/main/context-budget.ts:771

apps/desktop/src/main/context-budget.ts:791

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

…reporting - tipc.ts: add 5-minute grace period before deleting orphan recording files in cleanupRecordingFiles() to prevent race with async history persistence (recording file is written before history entry is saved) - context-budget.ts: fix estTokensBefore in aggressive_truncate early-return to reflect pre-truncation token count for accurate reduction reporting Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

aj47 · 2026-02-18T21:14:16Z

augment review

augmentcode

Review completed. 4 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-02-18T21:18:18Z

apps/desktop/src/main/builtin-tools.ts

+
+  const omitted = originalLength - maxChars
+  return {
+    text: `${input.slice(0, maxChars)}\n\n[truncated ${omitted} characters]`,


truncateToolText() slices to maxChars and then appends a truncation footer, so the returned text can exceed maxOutputChars. If callers rely on a hard cap for storage/LLM context, this may still overshoot.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T21:18:18Z

apps/desktop/src/main/context-budget.ts

+    }
+
+    // Heuristic: aggressively truncate oversized tool payloads first.
+    const looksLikeToolPayload =


looksLikeToolPayload can match large user messages that happen to contain JSON keys like "id": / "url":, causing aggressive truncation of the user’s prompt even though this tier is described as “tool responses”. Consider tightening the predicate to reduce accidental truncation of non-tool content.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T21:18:18Z

apps/desktop/src/main/conversation-service.ts

+      }
+    })
+
+    const { kept: retained, removed } = this.applyConversationRetention(normalized)


In normalizeIndex(), retention (applyConversationRetention) runs before any sort; if index.json is out of order, this can drop (and then delete) non-oldest conversations. Consider sorting by updatedAt desc before slicing so retention deterministically keeps the most recent entries.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T21:18:18Z

apps/desktop/src/main/tipc.ts

+      // to disk before the history entry is saved (async processing window), so
+      // a recently-created file may not yet appear in history.
+      if (!historyItem) {
+        const ORPHAN_GRACE_PERIOD_MS = 5 * 60 * 1000 // 5 minutes


The orphan grace period is hard-coded to 5 minutes; if transcription/agent processing takes longer and getRecordingHistory() runs in the meantime, a valid in-flight recording file could still be deleted as an orphan. Consider making the grace window configurable or keyed off active recording/session state.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

…ention ordering - builtin-tools: fix truncateToolText so content+footer never exceeds maxChars - context-budget: capture initialTokens before any tier so estTokensBefore reflects the pre-shrink count in all return paths - context-budget: restrict looksLikeToolPayload JSON-key heuristics to non-user messages to avoid aggressively truncating user prompts - conversation-service: sort index by updatedAt desc before applying retention in normalizeIndex so the most recent conversations are always kept - tipc: introduce inFlightRecordingIds set so cleanupRecordingFiles never treats an actively-processed recording as an orphan; retain the 5-minute time-based grace period as a fallback for post-crash recovery Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

aj47 · 2026-02-18T21:23:46Z

augment review

augmentcode

Review completed. 3 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-02-18T21:30:04Z

apps/desktop/src/main/conversation-service.ts

+    }
+
+    const omitted = text.length - maxChars
+    return `${text.slice(0, maxChars)}\n\n[truncated ${omitted} characters]`


truncateText() appends a footer after slicing to maxChars, so the stored string can exceed the intended cap. If these caps are meant to be hard limits to control on-disk growth, this can still allow payloads larger than configured.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T21:30:04Z

apps/desktop/src/main/conversation-service.ts


      // Add to beginning of array (most recent first)
      index.unshift(indexItem)
+      const { kept, removed } = this.applyConversationRetention(index)


Retention in updateConversationIndex() is applied to the current index ordering without sorting by updatedAt. If index.json is ever out of order (e.g., older versions or corruption), this can delete non-oldest conversation files as “overflow”.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T21:30:04Z

apps/desktop/src/main/tipc.ts

+  const entry = item as Partial<RecordingHistoryItem>
+  return (
+    typeof entry.id === "string" &&
+    typeof entry.createdAt === "number" &&


isValidRecordingHistoryItem() allows createdAt/duration values like NaN/Infinity because it only checks typeof === "number". Those values can break sorting and retention comparisons in reconcileRecordingHistory() / cleanupRecordingFiles().

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

… validation - conversation-service: fix truncateText() footer overshoot (content+footer now capped at maxChars) - conversation-service: sort index by updatedAt desc before applying retention in updateConversationIndex - tipc: add Number.isFinite() checks for createdAt/duration in isValidRecordingHistoryItem - tipc: protect queued-path recording writes with inFlightRecordingIds and persist history entry - tipc: make orphan grace period configurable via recordingOrphanGracePeriodMinutes setting (default 5 min) - types/config: add recordingOrphanGracePeriodMinutes config field with default Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

aj47 · 2026-02-18T21:37:16Z

augment review

Combines async index I/O + debounced writes from main with storage sanitization/truncation/retention logic from this branch. Key merge decisions: - saveConversation now sanitizes via sanitizeConversationForStorage before the async fsPromises.writeFile (both features preserved) - updateConversationIndex applies truncateIndexText, sort, and retention inside the enqueueIndexMutation queue before updating in-memory cache - scheduleDiskWrite accepts optional removedItems to delete overflow files after the write completes (preserving write-before-delete ordering) - deleteRetainedOverflowFiles converted to async using fsPromises.unlink - getConversationHistory uses ensureIndexLoaded (in-memory cache) instead of reading from disk on every call Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

augmentcode

Review completed. 2 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-02-18T21:42:27Z

apps/desktop/src/main/conversation-service.ts

+    if (text.length <= MAX_INDEX_LAST_MESSAGE_CHARS) {
+      return text
+    }
+    return `${text.slice(0, MAX_INDEX_LAST_MESSAGE_CHARS)}...`


truncateIndexText() slices to MAX_INDEX_LAST_MESSAGE_CHARS but then appends ..., so the returned string can exceed the intended cap. If this value is meant to be a hard storage bound, callers may still end up persisting >500 chars here.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-18T21:42:27Z

apps/desktop/src/main/tipc.ts

+      fs.readFileSync(getRecordingsHistoryPath(), "utf8"),
+    ) as unknown
+    const history = Array.isArray(raw) ? raw.filter(isValidRecordingHistoryItem) : []
+    return reconcileRecordingHistory(history, true)


getRecordingHistory() calls reconcileRecordingHistory(history, true), so reads will synchronously write history.json and run cleanup. If the write fails (e.g., permissions/disk full), the surrounding try/catch returns [], which can mask otherwise-readable history from callers.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

Fix storage growth from oversized conversation and recording artifacts

a7b2e5d

augmentcode bot reviewed Feb 18, 2026

View reviewed changes

Conversation

aj47 commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Notes

Uh oh!

chatgpt-codex-connector bot commented Feb 18, 2026

Uh oh!

augmentcode bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

aj47 commented Feb 18, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

aj47 commented Feb 18, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

aj47 commented Feb 18, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

aj47 commented Feb 18, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

aj47 commented Feb 18, 2026 •

edited

Loading

augmentcode bot commented Feb 18, 2026 •

edited

Loading