-
Notifications
You must be signed in to change notification settings - Fork 2.5k
fix: reduce task file size by optimizing message saves during streaming #7852
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Skip saving partial messages during streaming to avoid excessive disk writes - Add debounced save mechanism with configurable delay and maxWait - Flush pending saves when streaming ends to ensure data persistence - Add comprehensive tests for DebouncedSave utility This fix addresses issue #7851 where task files were growing to absurd sizes (95GB) due to the entire message array being saved on every partial message update during streaming operations.
| private timer: NodeJS.Timeout | null = null | ||
| private maxTimer: NodeJS.Timeout | null = null | ||
| private pendingSave: (() => Promise<void>) | null = null | ||
| private maxWaitStartTime: number = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'maxWaitStartTime' property is assigned and reset in the code but never actually used in any decision-making. Consider removing it if it’s not needed to simplify the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewing my own code is like debugging in a mirror - everything looks backwards but the bugs are still mine.
| try { | ||
| this.debouncedSave.flush().catch(console.error) | ||
| this.debouncedSave.dispose() | ||
| } catch (error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The debounced save should be flushed in abortTask before disposal to ensure any pending saves complete. Currently it's only handled in dispose(). Consider adding a flush before the dispose call to prevent data loss.
| console.error("Failed to save Roo messages:", error) | ||
| // If immediate save is requested or we're not streaming, save immediately | ||
| if (immediate || !this.isStreaming) { | ||
| await saveFunction() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic here could be clearer. When immediate is false but isStreaming is also false, it saves immediately anyway. Is this intentional? Consider adding a comment to clarify this behavior.
| // Initialize debounced save for streaming operations | ||
| this.debouncedSave = new DebouncedSave({ | ||
| delay: 500, // 500ms debounce for streaming updates | ||
| maxWait: 2000, // Force save after 2 seconds max |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be beneficial to make these delays configurable for different use cases or testing scenarios? The hardcoded values might not be optimal for all situations.
| if (this.pendingSave) { | ||
| const saveFunction = this.pendingSave | ||
| this.pendingSave = null | ||
| this.maxWaitStartTime = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The maxWaitStartTime property is reset here and elsewhere but never actually used for calculations. Is this intentional or leftover from a previous implementation? If it's not needed, consider removing it to simplify the code.
|
Closing, see parent issue |
Summary
Fixes #7851 - Task files growing to excessive sizes (95GB instead of expected <300MB)
Problem
During streaming operations, the entire message array was being saved to disk on every partial message update. This resulted in:
Solution
Implemented a debounced save mechanism that:
Changes
New Files
src/utils/debouncedSave.ts- Utility class for debounced save operations with configurable delay and maxWait timerssrc/utils/__tests__/debouncedSave.test.ts- Comprehensive test suite (15 test cases, all passing)Modified Files
src/core/task/Task.ts- Integrated debounced saving during streaming operationsTechnical Details
The
DebouncedSaveclass implements:Testing
Performance Impact
Expected improvements:
Verification
To verify the fix:
Important
Introduces
DebouncedSaveto optimize message saving during streaming, reducing disk writes and file sizes significantly.DebouncedSaveclass indebouncedSave.tsto optimize message saving during streaming.Task.tsto handle message saving efficiently.debouncedSave.test.tswith 15 test cases covering debouncing, timer resets, maxWait enforcement, error handling, and disposal.This description was created by
for 26518cc. You can customize this summary. It will automatically update as commits are pushed.