feat: Add file read caching to prevent redundant reads in conversation history #4501
Conversation
attempt 2.
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
…ve error handling
src/core/tools/readFileTool.ts
Outdated
```typescript
// Track file read
await cline.fileContextTracker.trackFileContext(relPath, "read_tool" as RecordSource)

const stats = fs.statSync(fullPath)
```
Avoid using synchronous fs.statSync inside an async function to prevent blocking the event loop. Consider using fs.promises.stat for a non-blocking alternative.
Suggested change:
```diff
- const stats = fs.statSync(fullPath)
+ const stats = await fs.promises.stat(fullPath)
```
- Implement MemoryAwareCache with 100MB limit and LRU eviction
- Fix syntax error in processAndFilterReadRequest function
- Add proper error handling for file permissions (EACCES, EPERM)
- Handle file deletion scenarios by removing from cache
- Add logging for cache evictions and errors
- Update imports to use fs/promises for test compatibility

All tests passing (12/12)
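For context, a minimal sketch of what a memory-aware LRU cache along these lines could look like; the class shape, method names, and byte accounting are assumptions based on this commit message, not the PR's actual code:

```typescript
// Sketch only: a Map preserves insertion order, so re-inserting on access
// keeps the least recently used entry at the front for eviction.
class MemoryAwareCache<V> {
	private map = new Map<string, { value: V; bytes: number }>()
	private totalBytes = 0

	constructor(private maxBytes = 100 * 1024 * 1024) {} // 100MB limit

	get(key: string): V | undefined {
		const entry = this.map.get(key)
		if (!entry) return undefined
		// Re-insert to mark this key as most recently used.
		this.map.delete(key)
		this.map.set(key, entry)
		return entry.value
	}

	set(key: string, value: V, bytes: number): void {
		this.delete(key)
		this.map.set(key, { value, bytes })
		this.totalBytes += bytes
		// Evict least recently used entries until back under the memory limit.
		while (this.totalBytes > this.maxBytes && this.map.size > 1) {
			const oldest = this.map.keys().next().value as string
			console.warn(`Cache eviction: ${oldest}`)
			this.delete(oldest)
		}
	}

	delete(key: string): void {
		const entry = this.map.get(key)
		if (entry) {
			this.totalBytes -= entry.bytes
			this.map.delete(key)
		}
	}
}
```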
daniel-lxs left a comment
Hey @Mnehmos, thank you for taking this issue.
Overall, I think it's worth figuring out whether this issue actually needs a complex implementation like this to prevent reads of files that were recently read.
Keeping the content of the files in a cache just to determine whether a file read should be rejected might be a bit of overkill.
I'd like to hear your thoughts on this.
```typescript
const isMultipleReadsEnabled = maxConcurrentReads > 1

return `## read_file
Description: Request to read the contents of ${isMultipleReadsEnabled ? "one or more files" : "a file"}. The tool outputs line-numbered content (e.g. "1 | const x = 1") for easy reference when creating diffs or discussing code.${args.partialReadsEnabled ? " Use line ranges to efficiently read specific portions of large files." : ""} Supports text extraction from PDF and DOCX files, but may not handle other binary files properly.
```
I couldn't find any mention of why partial reads are being permanently enabled. Is this change intentional, given that partial reads can be disabled in the settings?
This change is intentional and addresses Issue #4009. Let me clarify what's actually happening.

The Problem Being Solved:
The "Always read entire file" setting (maxReadFileLine = -1) prohibited line-range reads entirely, forcing users to read complete files even when they had specific line numbers from:
- git grep -n results
- Compiler/linter error messages
- search_files output
- Manual diffs with line references

What This Change Does:
- Preserves existing behavior: when no <line_range> is specified, entire files are still read
- Adds intelligent choice: the model can now choose line ranges when contextually appropriate
- Maintains the setting's intent: "Always read entire file" becomes the default, not an absolute restriction

Technical Detail (sketched below):
- Previously, partialReadsEnabled = maxReadFileLine !== -1 meant unlimited readers couldn't see line-range options
- Now, line ranges are always available in the tool interface, letting the model make smart decisions based on context

This turns a rigid limitation into flexible behavior: the model gets entire files by default but can target specific lines when it has line numbers to work with.
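A rough illustration of the flag change described above (the flag and setting names come from this thread; the surrounding code is assumed, not taken from the PR):

```typescript
// Before (as described in this thread): line ranges were hidden whenever
// "Always read entire file" (maxReadFileLine = -1) was active:
//   const partialReadsEnabled = maxReadFileLine !== -1
//
// After: line ranges are always offered in the tool description; reading the
// entire file remains the default when no <line_range> is given:
const partialReadsEnabled = true
```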
Is there a simpler way of checking that a recently read file hasn't changed? In codebase indexing we use hashes to verify that the content of a file hasn't changed. If the file read is being rejected, do we need to keep a cache of the whole file, or would keeping a hash be a better option?
After reviewing the code, I need to clarify the current implementation.

Current Implementation Reality (sketched below):
- The cache does NOT store full file content
- It only stores metadata: { mtime: string, size: number }
- Cache decisions are based on conversation history analysis plus mtime comparison
- Memory tracking is for metadata size limits, not content storage
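A minimal sketch of that metadata check; isCacheHit and CacheEntry are hypothetical names for illustration, not the PR's actual API:

```typescript
import { stat } from "node:fs/promises"

interface CacheEntry {
	mtime: string // file mtime recorded when it was last read
	size: number // file size in bytes at that time
}

// Hypothetical helper: a repeated read is only filtered out when the file's
// current metadata still matches the cached entry.
async function isCacheHit(cache: Map<string, CacheEntry>, fullPath: string): Promise<boolean> {
	const entry = cache.get(fullPath)
	if (!entry) return false
	try {
		const stats = await stat(fullPath)
		return stats.mtime.toISOString() === entry.mtime && stats.size === entry.size
	} catch {
		// File deleted or unreadable (e.g. EACCES/EPERM): drop the stale entry.
		cache.delete(fullPath)
		return false
	}
}
```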
Your Hash Suggestion Benefits:
- More reliable: hashes detect actual content changes, whereas mtime can be manipulated
- Already proven: works well in your codebase indexing
- Potentially simpler: could replace the mtime + conversation history analysis

Current Approach Issues:
- mtime comparison can miss cases where file content changes but the timestamp is preserved
- Conversation history parsing is complex
- It still requires file stat calls
Hash-Based Alternative:

```typescript
interface HashCacheEntry {
	hash: string // Content hash
	lastRead: number // Timestamp of last read
	lineRanges: LineRange[] // Ranges read at this hash
}
```
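Generating the hash could be as simple as the following sketch (Node's built-in crypto is assumed, as is the choice of SHA-256; hashFile is a hypothetical helper):

```typescript
import { createHash } from "node:crypto"
import { readFile } from "node:fs/promises"

// Hypothetical helper: hash a file's contents so a later read can be compared
// against the cached HashCacheEntry.hash instead of relying on mtime.
async function hashFile(fullPath: string): Promise<string> {
	const data = await readFile(fullPath)
	return createHash("sha256").update(data).digest("hex")
}
```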
Trade-off Question:
The hash approach requires reading files to generate hashes, which adds I/O cost. However, it provides stronger content change detection than mtime alone.
Would the hash generation cost be acceptable given the improved reliability and potential to simplify the conversation history analysis logic?
PR for feature/4009-pr2-file-read-caching
Related GitHub Issue
Closes: #4009
Description
This PR introduces a file read caching service to improve performance by caching the content of files that have been read. This helps to avoid re-reading the same file multiple times during a conversation.
This PR specifically addresses the following feedback from the original implementation:
Test Procedure
Run pnpm test to execute all tests and ensure that the new tests for the caching service pass and that there are no regressions.
Type of Change
Pre-Submission Checklist
Screenshots / Videos
N/A
Documentation Updates
None required.
Additional Notes
This is the second of three PRs to address the work in issue #4009.
Important
Introduces a file read caching service with LRU cache, integrates it with existing tools, and adds tests for improved performance and error handling.
- Adds fileReadCacheService.ts for caching file contents with an LRU cache.
- Adds processAndFilterReadRequest() to manage cache hits and misses.
- Integrates fileReadCacheService.ts into readFileTool.ts, applyDiffTool.ts, and writeToFileTool.ts.
- Updates readFileTool.ts to use cache results for reading files.
- Adds fileReadCacheService.spec.ts and readFileTool.test.ts for the caching logic.
- Adds an lruCache.ts utility for cache management.
- Updates esbuild.mjs to clean the assets directory.

This description was created by Ellipsis for 2071612. You can customize this summary. It will automatically update as commits are pushed.