Skip to content

Conversation

@samhvw8
Copy link
Contributor

@samhvw8 samhvw8 commented Apr 23, 2025

Context

Implementation

Screenshots

before after

How to Test

Get in Touch


Important

Enhances read_file tool with multi-file support, adds maxConcurrentFileReads setting, and updates UI components and tests accordingly.

  • Behavior:
    • read_file tool now supports reading multiple files with line ranges, improving efficiency.
    • Introduces maxConcurrentFileReads setting in global-settings.ts to limit concurrent file reads.
  • UI Components:
    • Adds BatchFilePermission component for handling multi-file read permissions in ChatView.
    • Updates ContextManagementSettings and SettingsView to include maxConcurrentFileReads slider.
  • Tests:
    • Adds tests for BatchFilePermission in BatchFilePermission.test.tsx.
    • Updates existing tests to handle new multi-file read logic in readFileTool.test.ts.
  • Localization:
    • Updates settings.json for English and Japanese locales to include new settings descriptions.

This description was created by Ellipsis for 62330ebfae61e7c2c1c4d669ab5cef93e46a00d0. You can customize this summary. It will automatically update as commits are pushed.

@changeset-bot
Copy link

changeset-bot bot commented Apr 23, 2025

⚠️ No Changeset found

Latest commit: 940c543

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 23, 2025
@ellipsis-dev
Copy link
Contributor

ellipsis-dev bot commented Apr 23, 2025

The changes in this pull request seem to be related to enhancing the file reading capabilities and updating the tool usage format. The changes across different files appear to be interconnected, focusing on a single feature enhancement. Therefore, it doesn't seem necessary to split the pull request into smaller ones. If you have any questions or need further clarification, feel free to reach out!

@dosubot dosubot bot added the enhancement New feature or request label Apr 23, 2025
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Range read is handled only when both start_line and end_line are provided. Confirm if this is the intended behavior. If a user provides only one of these parameters (e.g., just start_line), consider whether it should read from start_line to the end of the file.

@hannesrudolph hannesrudolph moved this from New to PR [Pre Approval Review] in Roo Code Roadmap Apr 23, 2025
@KJ7LNW
Copy link
Contributor

KJ7LNW commented Apr 24, 2025

  • please convert this PR to a draft so it does not get merged accidentally
  • @cte this will need run through an eval before merging, but I think it has great promise to reduce token usage (ie, the n^2/2 for sequential file reads), and increase focus because all the information is provided through a single request.

@samhvw8 samhvw8 marked this pull request as draft April 25, 2025 00:23
@samhvw8
Copy link
Contributor Author

samhvw8 commented Apr 25, 2025

@KJ7LNW converted, @mrubens @cte we need an threshold of maximum file reading, should we based on model information for threshold ?

@KJ7LNW
Copy link
Contributor

KJ7LNW commented Apr 25, 2025

@KJ7LNW converted

Excellent please force push so I can look when you get a chance.

we need an threshold of maximum file reading, should we based on model information for threshold ?

I think first pass should respect existing line limit handling per file to keep the implementation simple . After we have been using it for a while we may discover different ways to handle length issues.

Please make sure that "list code definitions" triggers when the length of the file is less than the maximum line limit as it does now

@samhvw8 samhvw8 marked this pull request as ready for review April 25, 2025 13:52
@samhvw8 samhvw8 force-pushed the feat/multi-read-file-tool branch from d73e5a7 to f868b67 Compare April 25, 2025 13:53
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Apr 25, 2025
@samhvw8
Copy link
Contributor Author

samhvw8 commented Apr 25, 2025

@KJ7LNW i just update the test, now it all pass

@KJ7LNW
Copy link
Contributor

KJ7LNW commented Apr 26, 2025

<args>
:path:src/app.ts
:start_line:1
:end_line:50
======+++======
:path:src/utils.ts
:start_line:100
:end_line:150
======+++======
:path:package.json
</args>

It looks like you have custom argument formats which do not allow us to programmatically change the parser or automate tool XML structure integration as part of the tool refactor that @bramburn is working on in #2467 . Ultimately all tools will provide standardized argument structure so that no tool has special parsing for trivial data types (int, string, etc). The long term goal is to automatically generate system instructions, parsing, and tool documentation based on per-tool JSON-looking XML schema definitions standardized in a static attribute for each tool class.

Please choose a standard XML format, so we can standardize all tool arguments as part of the Tool base class in the future and avoid much fine your code to conform. Something like this would be appropriate:

<read_file>
<file path="/path1">
<range lines="1-100"/>
<range lines="210-22"/>
</file>
<file path="/path2">
<range lines="11-111"/>
<range lines="222-333"/>
</file>
</read_file>

or if you do not like attributes:

<read_file>
<file>
<path>/path/to/file1</path>
<line_range>111-222</line_range>
<line_range>333-444</line_range>
</file>
<file>
<path>/path/to/file2</path>
<line_range>1111-2222</line_range>
<line_range>3333-4444</line_range>
</file>
</read_file>

@samhvw8
Copy link
Contributor Author

samhvw8 commented Apr 26, 2025

@KJ7LNW if we have format like that another llm can do it ?
I will make another branch for xml style

@KJ7LNW
Copy link
Contributor

KJ7LNW commented Apr 26, 2025

if we have format like that another llm can do it ?

LLMs work quite well with XML at the moment, I am not sure if I understood your question, can you elaborate?

@samhvw8
Copy link
Contributor Author

samhvw8 commented Apr 27, 2025

@KJ7LNW i make it work with all xml, fixing test then push to branch 💪🏻

@KJ7LNW
Copy link
Contributor

KJ7LNW commented Apr 27, 2025

💪🏻!!!

I'll check it out

@bramburn
Copy link
Contributor

bramburn commented Apr 27, 2025 via email

@samhvw8 samhvw8 force-pushed the feat/multi-read-file-tool branch 2 times, most recently from 69c7823 to 025f7cd Compare April 27, 2025 11:00
@samhvw8
Copy link
Contributor Author

samhvw8 commented Apr 27, 2025

@mrubens @cte @KJ7LNW i just updated to all xml

@KJ7LNW
Copy link
Contributor

KJ7LNW commented Apr 28, 2025

the code looks good, I am going to try it out in real life and see what I discover ...

@daniel-lxs daniel-lxs moved this from PR [Needs Review] to PR [Changes Requested] in Roo Code Roadmap May 30, 2025
@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Needs Review] in Roo Code Roadmap May 30, 2025
@KJ7LNW
Copy link
Contributor

KJ7LNW commented May 30, 2025

FYI, I have been running this on my desk for a couple of weeks for everything that I do, this is the feedback of my experience:

Performance issues to monitor:

* If users report slowdowns when reading 10+ files at once, we might need to consider adding concurrency limiting

I have not found any slowdown, and I have had times where 15-20 files were loaded at once. it is probably 10x faster than loading them individually.

* Watch for memory spikes - the current implementation loads all files into memory simultaneously

Fortunately, this happens on the backend, not in the web view.

* Keep an eye on VS Code becoming unresponsive during large batch reads

I have never seen that problem it has been nice and snappy.

Things that might break:

* Very large files (100MB+) being read in batches could cause OOM errors

That is an issue for single file reads, too, because a file that big would exceed available context for all existing models today; if you are using read-size limits, then this is not an issue. Read limits default to 500 lines, but it may have changed recently.

* No validation on total bytes being read - someone could accidentally read gigabytes (while this is also a potential issue with our current `read_file`, this implementation makes it easier for this to happen)

If this is going to be addressed it needs to be addressed per-file. We have line-limit constraints available to handle this already there is probably not a better way to do it.

Quick fixes to consider post-merge:

1. Add a simple concurrency limit (even just 5 concurrent file ops would help)

There already is. I keep mine at 100 and have never had a problem:

image

2. Track total bytes being read and warn/limit if it's too much

I disagree with this because we already have line limit handling that users can enforce. "too much" is too subjective, it will very by user model and use case so let's keep the existing line-limit behavior.

3. Make the error messages clearer when batch operations fail

is there a specific error message you are concerned about?

4. Add telemetry to see how people actually use this feature

Good idea.

Can this be merged and add telemetry in a separate PR?

@KJ7LNW
Copy link
Contributor

KJ7LNW commented May 30, 2025

When checked for the first time this should probably default to a higher number?

Screenshot 2025-05-30 at 11 33 07 AM

@samhvw8 can you set this to 15 by default immediately when experimental flag is enabled? Users can then adjust it from there.

@daniel-lxs
Copy link
Member

@KJ7LNW, Can you check again the default value for concurrent reads? I made a change regarding that specifically.

The value is set to 15 for me when I enable the feature.

@KJ7LNW
Copy link
Contributor

KJ7LNW commented May 30, 2025

@KJ7LNW, Can you check again the default value for concurrent reads? I made a change regarding that specifically.

The value is set to 15 for me when I enable the feature.

If that is what it does for you, then it must be correct, as I am going off slightly older information.

sorry for any confusion

Copy link
Collaborator

@mrubens mrubens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label May 30, 2025
@mrubens mrubens merged commit 9ba0cd5 into RooCodeInc:main May 30, 2025
10 checks passed
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap May 30, 2025
@github-project-automation github-project-automation bot moved this from PR [Pre Approval Review] to Done in Roo Code Roadmap May 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm This PR has been approved by a maintainer PR - Needs Review size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Add read_files Tool for Batch File Reading

9 participants