-
Notifications
You must be signed in to change notification settings - Fork 2.6k
feat: implement read_file history deduplication #6280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add READ_FILE_DEDUPLICATION experimental feature flag - Implement deduplicateReadFileHistory method in Task class - Integrate deduplication into readFileTool after successful reads - Add comprehensive unit tests for deduplication functionality - Preserve cache window (5 minutes) to avoid modifying recent messages - Handle both single-file and multi-file read operations - Support legacy read_file format for backward compatibility Fixes #6279
|
@roomote-agent does this only trigger the deduplication on reads |
|
Hi @hannesrudolph, I see your question about the deduplication trigger. Let me investigate the implementation to confirm when exactly the deduplication is triggered and get back to you with a detailed answer. |
…ss path - Deduplication now only triggers after successful read_file operations - Fixed test mocks to include deduplicateReadFileHistory method - All tests passing
|
Hi @hannesrudolph, Ive investigated and fixed the issue. You were correct - the deduplication was being called in the error handler instead of only on successful reads. Ive moved the The deduplication now only triggers after successful The fix has been pushed to the PR branch and CI checks are currently running. |
|
|
||
| mockCline.recordToolUsage = vi.fn().mockReturnValue(undefined) | ||
| mockCline.recordToolError = vi.fn().mockReturnValue(undefined) | ||
| mockCline.deduplicateReadFileHistory = vi.fn().mockResolvedValue(undefined) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good: The new deduplicateReadFileHistory mock is added to cline. Consider adding an assertion in one of the tests to ensure this function is called exactly once during a successful read_file operation, to cover that behavior explicitly.
|
@roomote-agent Run pr-reviewer on this |
|
Hi @hannesrudolph, I see your request to run pr-reviewer on this PR. Let me switch to PR Reviewer mode and perform a comprehensive review of the read_file history deduplication implementation. |
PR Review: feat: implement read_file history deduplication (#6280)Executive SummaryThis PR implements a read_file history deduplication feature that removes older duplicate file reads from the conversation history while preserving recent ones. The implementation is well-structured with comprehensive test coverage and follows established patterns in the codebase. ✅ Strengths
🔍 Code Quality ObservationsPattern ConsistencyThe implementation follows established patterns:
Implementation DetailsThe
📝 Minor Suggestions
✅ Verification
ConclusionThis is a well-implemented feature that addresses a real need for preventing duplicate file reads in conversation history. The code is clean, well-tested, and follows established patterns in the codebase. The experimental flag approach allows for safe rollout and testing. Recommendation: ✅ Approve - This PR is ready for merge. |
Update: Implemented Proactive File CachingAfter further testing, I discovered that the initial reactive approach wasn't working as expected. The deduplication was happening after files were already read and added to the context, which didn't reduce the context size. Changes in this update:
Benefits:
All tests are passing and the feature now works as intended. The cache time window remains configurable through the settings UI. |
Related GitHub Issue
Closes: #6279
Roo Code Task Context (Optional)
Description
This PR implements a read_file history deduplication feature to prevent duplicate file reads from accumulating in the conversation history. The implementation:
READ_FILE_DEDUPLICATIONthat defaults to disableddeduplicateReadFileHistorymethod in the Task class that removes older duplicate read_file results while preserving the most recent onesTest Procedure
readFileDeduplicationexperimental feature in settingscd src && npx vitest run core/task/__tests__/Task.spec.tsAll unit tests have been added to cover:
Pre-Submission Checklist
Screenshots / Videos
Not applicable - this is a backend feature with no UI changes.
Documentation Updates
The feature is experimental and disabled by default. When it's ready for general use, documentation should be updated to explain the feature.
Additional Notes
This implementation carefully preserves message structure and only removes duplicate read_file content blocks. It includes comprehensive error handling and maintains backward compatibility with older message formats.
Get in Touch
Important
Introduces read_file history deduplication feature with a new feature flag, implementation in
Task.ts, and comprehensive test coverage.READ_FILE_DEDUPLICATIONfeature flag inexperiment.ts.deduplicateReadFileHistory()inTask.tsto remove older duplicateread_fileresults, preserving recent ones.Task.spec.ts.deduplicateReadFileHistoryinreadFileTool.spec.ts.experiments.spec.tsto includeREAD_FILE_DEDUPLICATION.deduplicateReadFileHistory()inreadFileTool.tsafter successful reads.This description was created by
for 0fe3571. You can customize this summary. It will automatically update as commits are pushed.