Refactor: Improve file tool context formatting and diff error guidance by hannesrudolph · Pull Request #2278 · RooCodeInc/Roo-Code

hannesrudolph · 2025-04-03T22:16:30Z

This PR includes two related improvements to tool interactions and context formatting:

Refactor read_file Tool Result Formatting (src/core/Cline.ts):
- Modifies how results from the read_file tool are formatted when added to the AI model's conversation history (user context).
- Previous Format: [read_file for 'path/to/file'] Result:\n{content}
- New Format:
```
<file>
  <path>path/to/file</path>
  <content>
{content_or_error}
  </content>
</file>
```
- This change only affects read_file results; other tool formats are unchanged.

Improve apply_diff Error Message Clarity (src/core/diff/strategies/multi-search-replace.ts - Commit 73da4d03):
- Updates the error message returned by the multi-search-replace diff strategy when no sufficiently similar match is found.
- The "Tip" within the error message now more clearly instructs the user (or AI) to use the read_file tool to get the latest file content before retrying the apply_diff tool, explicitly mentioning both tools.
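To make the format change concrete, here is a minimal sketch of the two result shapes described above. The function names are hypothetical and are not taken from `src/core/Cline.ts`; only the two output formats come from this description.

```typescript
// Hypothetical helpers for illustration; the names are assumptions,
// only the two output shapes come from the PR description.

/** Previous format: a bracketed header followed by the raw content. */
function formatReadFileResultLegacy(filePath: string, contentOrError: string): string {
  return `[read_file for '${filePath}'] Result:\n${contentOrError}`
}

/** New format: path and content wrapped in XML tags, indented as in the description. */
function formatReadFileResultXml(filePath: string, contentOrError: string): string {
  return `<file>\n  <path>${filePath}</path>\n  <content>\n${contentOrError}\n  </content>\n</file>`
}

console.log(formatReadFileResultLegacy("src/index.ts", "export const answer = 42"))
console.log(formatReadFileResultXml("src/index.ts", "export const answer = 42"))
```

Note that in this shape an error string lands in the same `<content>` slot as real file content, since the description uses a single `{content_or_error}` placeholder.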

Reasoning and Benefits:

Structured File Context: The XML formatting for read_file results provides a clearer, more structured representation of file path and content within the AI's context. This may improve the model's ability to parse and utilize file information reliably.
Enhanced Error Guidance: The updated error message in the diff strategy provides more explicit guidance when a diff fails due to potential content mismatches. By specifically mentioning both read_file and apply_diff, it helps steer the AI towards the correct recovery action (refreshing its knowledge of the file before trying the diff again).
Improved Robustness: Together, these changes aim to make interactions involving file reading and modification more robust by improving context clarity and providing better error recovery instructions.

Important

Refactor read_file tool to use XML formatting and improve apply_diff error guidance for better tool interaction.

Behavior:
- Refactor read_file tool result formatting in Cline.ts to XML format for structured representation.
- Update apply_diff error message in multi-search-replace.ts to guide using read_file before retrying.
Tests:
- Update read-file-maxReadFileLine.test.ts to expect XML formatted results.
Misc:
- Modify readFileTool.ts to push XML formatted results.

This description was created by the ellipsis-dev bot for 8cd6c20. It will automatically update as commits are pushed.

Modifies how the result of the `read_file` tool is presented in the conversation history sent to the AI model.

Previously, the format was: `[read_file for 'path/to/file'] Result: {content_or_error}`

This commit changes the format to use XML tags for better structure and potentially easier parsing by the model: `<file> <path>path/to/file</path> <content> {content_or_error} </content> </file>`

This change only affects the `read_file` tool result formatting within the user context message constructed in `src/core/Cline.ts`. Other tool result formats remain unchanged.

…strategy Refines the error message returned when no sufficiently similar match is found during the multi-search-replace operation. The message now includes a clearer instruction to use the read_file tool for obtaining the latest file content before attempting to apply the diff again.

changeset-bot · 2025-04-03T22:16:34Z

⚠️ No Changeset found

Latest commit: 8cd6c20

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


mrubens · 2025-04-04T05:01:45Z

I think we could just make this change by adjusting what we pass into pushToolResult in readFileTool.ts

hannesrudolph · 2025-04-04T05:02:27Z

Ok get it done! :) :P

Modifies the `readFileTool` function to format the output as XML, enhancing the structure of the returned file content. This change aligns with previous updates to ensure consistent result formatting across tools.

This update refines the handling of tool responses in the `Cline` class by removing the XML formatting for `read_file` results and consolidating the logic for pushing results to the user message content. The changes ensure that all tool results are processed uniformly, improving code clarity and maintainability.

This commit modifies the assertions in the `read_file` tool tests to check for the expected XML format in the results. The changes ensure that the output structure aligns with recent updates to the tool's response formatting, enhancing test accuracy and reliability.

src/core/__tests__/read-file-maxReadFileLine.test.ts

…able This commit updates the `read_file` tool tests to utilize a predefined variable for the expected XML output, improving readability and maintainability of the test code. The changes ensure consistency in the expected results across multiple test cases.

KJ7LNW · 2025-04-05T21:43:58Z

IMHO, this was merged prematurely. I have reiterated the most important items (from the review above) that should be addressed before merging:

1. program tail output is incorrect because of unnecessary XML indentation:

Indentation is pretty for humans, but it will confuse the model: it will think that the \n preceding </content> is actually part of the file.

I think this is better:
const xmlResult = `<file><path>${filePath}</path><content>\n${fileContent}</content></file>`

2. Errors are shown as file content which could confuse the model

We do not want the model to think that the file contains the error text, but the structured markup, as currently implemented, implies exactly that.

<content>\n${fileContent}</content> should only contain actual file content. If you look back a few lines, there are other messages that should be managed as separate optional XML tags:
  content += `\n\n[Showing only ${maxReadFileLine} of ${totalLines} total lines. Use start_line and end_line if you need to read more]${sourceCodeDef}`
should become
  xmlInfo += `<notice>Showing only ${maxReadFileLine} of ${totalLines} total lines. Use start_line and end_line if you need to read more</notice>`
  xmlInfo += `<list_code_definition_names>${sourceCodeDef}</list_code_definition_names>`
Do the same thing with any errors:
  xmlInfo += `<error>foo</error>`
and then the final result becomes:
  const xmlResult = `<file><path>${filePath}</path><content>\n${fileContent}</content>${xmlInfo}</file>`
This gives us a fully structured response that we can use as a new standard.
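Putting the suggested fragments together, a self-contained sketch might look like the following. The `ReadFileParts` interface and `buildReadFileXml` function are hypothetical names introduced only for illustration. Omitting `<content>` entirely when there is no file content is an assumption, and no XML escaping is attempted.

```typescript
// Hypothetical input shape; field names are assumptions made for this sketch.
interface ReadFileParts {
  filePath: string
  fileContent?: string // actual file content only, never notices or errors
  notice?: string // e.g. a truncation message
  sourceCodeDef?: string // code definition summary, if any
  error?: string // read failure message, if any
}

function buildReadFileXml(parts: ReadFileParts): string {
  // Optional metadata goes into dedicated tags, kept apart from the content.
  let xmlInfo = ""
  if (parts.notice) {
    xmlInfo += `<notice>${parts.notice}</notice>`
  }
  if (parts.sourceCodeDef) {
    xmlInfo += `<list_code_definition_names>${parts.sourceCodeDef}</list_code_definition_names>`
  }
  if (parts.error) {
    xmlInfo += `<error>${parts.error}</error>`
  }

  // Assumption: leave out <content> when nothing was read (for example on error).
  // There is no newline before </content>, so no extra line appears in the file body.
  const content = parts.fileContent !== undefined ? `<content>\n${parts.fileContent}</content>` : ""

  return `<file><path>${parts.filePath}</path>${content}${xmlInfo}</file>`
}

console.log(buildReadFileXml({ filePath: "a.ts", fileContent: "const x = 1\n", notice: "Showing only 1 of 10 total lines." }))
console.log(buildReadFileXml({ filePath: "missing.ts", error: "File not found" }))
```

With this split, a failed read yields a `<file>` element holding only `<path>` and `<error>`, so the model has no reason to treat the error text as file content.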

mrubens · 2025-04-05T21:48:44Z

Yeah that's good feedback. I like your suggestions.

KJ7LNW · 2025-04-05T23:06:25Z

I am going to make a pr for this stay tuned ...

* fix: addLineNumbers handling of empty content

  Empty files should not have line numbers, but non-empty files with empty content at a specific line offset should.
  - If content is empty, return empty string for empty files
  - If content is empty but startLine > 1, return line number for empty content at that offset

  This ensures that the model does not think the file contains a single empty line.

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* refactor: improve readFileTool XML output format
  - Remove unnecessary XML indentation that could confuse the model
  - Separate file content from notices and errors using dedicated tags
  - Add line range information to content tags
  - Handle empty files properly with self-closing tags
  - Add comprehensive test coverage

  Fixes #2278

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* fix: always show line numbers in read_file XML output
  - Always display line numbers in non-range reads
  - Improve XML formatting with consistent newlines for better readability

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* test: update tests to match new XML format with line numbers
  - Update test expectations to match the new XML format with newlines
  - Update tests to expect line numbers attribute in content tags
  - Modify test assertions to check for the correct line range values

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* fix: consistent blank line handling in addLineNumbers
  - Add newline to all output
  - Handle trailing newlines and empty lines consistently
  - Add test cases for blank lines:
    - Multiple blank lines within content
    - Multiple trailing blank lines
    - Only blank lines with offset
    - Trailing newlines

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* test: use actual addLineNumbers in read-file-xml tests
  - Modified extract-text mock to preserve actual addLineNumbers implementation
  - Removed mock implementation of addLineNumbers
  - Updated test data to account for trailing newline
  - Removed unnecessary mock verification

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* test: ensure actual addLineNumbers function is called in tests
  - Replace direct mocking of addLineNumbers with spy on actual implementation
  - Add verification to ensure the real function is called when appropriate
  - Add skipAddLineNumbersCheck option for cases where function should not be called
  - Update test cases to use appropriate verification options
  - Fix numberedFileContent to include trailing newline for consistency

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* fix: modify readLines to process data directly instead of line by line
  - Direct data processing provides more accurate results by preserving exact content with carriage returns
  - Improved performance through minimal buffering and efficient string operations
  - Use string indexes to find newlines while maintaining their original format
  - Handle all edge cases correctly with preserved line endings
  - Add tests for various edge cases including empty files, single lines, and different line endings

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

* test: remove unused mockInputContent variable

  Remove unused variable declaration to appease ellipsis-dev linter requirements.

  Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>

---------

Signed-off-by: Eric Wheeler <roo-code@z.ewheeler.org>
Co-authored-by: Eric Wheeler <roo-code@z.ewheeler.org>
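The empty-content and blank-line rules for `addLineNumbers` described in these commit messages can be sketched roughly as follows. This is not the repository's implementation: the function name `addLineNumbersSketch`, the `N | text` prefix format, and the right-aligned padding are all assumptions made for illustration.

```typescript
// Hypothetical helper: left-pad a string with spaces to a given width.
function padLeft(text: string, width: number): string {
  let out = text
  while (out.length < width) {
    out = " " + out
  }
  return out
}

// Sketch of the rules from the commit messages. The "N | " prefix is assumed.
function addLineNumbersSketch(content: string, startLine: number = 1): string {
  if (content === "") {
    // An empty file gets no line numbers at all. Empty content at an offset
    // greater than 1 still gets one numbered line, because the file itself
    // is known not to be empty.
    return startLine === 1 ? "" : `${startLine} | \n`
  }

  const lines = content.split("\n")
  // A trailing newline yields a final empty element; drop it so the last
  // real line is not followed by a phantom numbered blank line.
  if (lines[lines.length - 1] === "") {
    lines.pop()
  }

  const width = String(startLine + lines.length - 1).length
  const numbered = lines.map((line, index) => `${padLeft(String(startLine + index), width)} | ${line}`)

  // Every non-empty result ends with a newline.
  return numbered.join("\n") + "\n"
}

console.log(addLineNumbersSketch("first\n\nthird\n"))
```

Under these assumptions, blank lines inside the content keep their own numbers, and an empty file produces an empty string, so the model is not led to believe the file holds a single empty line.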

hannesrudolph added 2 commits April 3, 2025 15:59

github-project-automation bot added this to Roo Code Roadmap Apr 3, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Apr 3, 2025

hannesrudolph changed the title ~~Apply diif fix 1~~ fix(multi-search-replace): Improve error message & refactor(read_file): Change result format to XML Apr 3, 2025

hannesrudolph changed the title ~~fix(multi-search-replace): Improve error message & refactor(read_file): Change result format to XML~~ Refactor: Improve file tool context formatting and diff error guidance Apr 3, 2025

hannesrudolph marked this pull request as ready for review April 3, 2025 23:24

hannesrudolph requested review from cte and mrubens as code owners April 3, 2025 23:24

dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.) and Enhancement (New feature or request) labels Apr 3, 2025

hannesrudolph added 2 commits April 4, 2025 12:28

refactor: Update readFileTool to return results in XML format

92815f8


dosubot bot added the size:XS (This PR changes 0-9 lines, ignoring generated files.) label and removed the size:M (This PR changes 30-99 lines, ignoring generated files.) label Apr 4, 2025

dosubot bot added the size:S (This PR changes 10-29 lines, ignoring generated files.) label and removed the size:XS (This PR changes 0-9 lines, ignoring generated files.) label Apr 4, 2025

ellipsis-dev bot reviewed Apr 4, 2025

src/core/__tests__/read-file-maxReadFileLine.test.ts (outdated, resolved)

mrubens approved these changes Apr 4, 2025


dosubot bot added the lgtm (This PR has been approved by a maintainer) label Apr 4, 2025

mrubens merged commit 951cefc into main Apr 4, 2025
14 checks passed

github-project-automation bot moved this from New to Done in Roo Code Roadmap Apr 4, 2025

mrubens deleted the apply_diif_fix_1 branch April 4, 2025 22:26

KJ7LNW mentioned this pull request Apr 6, 2025

refactor: improve readFileTool XML output format #2340

Merged

mrubens merged 6 commits into main from apply_diif_fix_1
