feat: UI-safe request formatting to avoid embedding file contents #8697

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Closed

roomote wants to merge 1 commit into main from feature/no-ui-file-content-in-ui-messages

Contributor

roomote bot commented Oct 17, 2025 •

edited by ellipsis-dev bot

Loading

Implements a cleaner approach for UI messages to prevent bloated ui_messages.json:

Add formatContentBlockForUi() to summarize content blocks safely
- Redacts diffs and large bodies
- Summarizes XML to file paths/errors/notices only
- Truncates long text (2000 chars default)
- Summarizes tool_use parameters (paths/regex/file_pattern), not bodies
Replace request preview composition in Task to use the UI-safe formatter for:
- api_req_started initial preview
- api_req_started final update before streaming

Functional scope:

UI message cleanliness only; apiConversationHistory/export stays full-fidelity for model use.

Follow-ups (optional):

Consider making truncation limit configurable
Add unit tests around diff redaction, XML summarization (escaped/unescaped), and tool_use summaries.

Important

Introduces formatContentBlockForUi() to safely format content blocks for UI messages, applied in Task.ts for cleaner api_req_started messages.

Behavior:
- Adds formatContentBlockForUi() in formatContentBlockForUi.ts to summarize content blocks for UI messages.
- Replaces formatContentBlockToMarkdown() with formatContentBlockForUi() in Task.ts for api_req_started messages.
- Redacts diffs, summarizes XML to paths/errors, truncates long text, and summarizes tool_use parameters.
Functional Scope:
- Affects UI message formatting only; full data retained for apiConversationHistory and exports.
Follow-ups:
- Consider configurable truncation limit.
- Add unit tests for diff redaction, XML summarization, and tool_use summaries.

^{This description was created by}^{for 598d7a3. You can customize this summary. It will automatically update as commits are pushed.}


          feat(webview): prevent UI bloat by not embedding file contents in UI …

598d7a3

…messages; add formatContentBlockForUi() and use for api_req_started

roomote bot requested review from cte, jr and mrubens as code owners

October 17, 2025 06:08

github-project-automation bot moved this to Triage in Roo Code Roadmap

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap

github-project-automation bot moved this to New in Roo Code Roadmap

dosubot bot added the size:L label

Contributor Author

roomote bot commented Oct 17, 2025 •

edited

Loading

Review Summary

I've identified 3 issues that need to be addressed:

Line 91: Fix duplicate condition in sanitizeText - should check for both <files and <files to detect escaped XML
Lines 112-114: Fix duplicate diff marker checks in looksLikeDiff - should check for both unescaped and escaped versions
Line 126 (and 161): Fix HTML entity decoding order in decode function - & must be decoded last to properly handle double-encoded entities

dosubot bot added the UI/UX label

ellipsis-dev bot reviewed

View reviewed changes

src/shared/formatContentBlockForUi.ts

+              	if (!text) return ""
+              	// If this looks like a files XML, summarize paths/errors/notices and drop content bodies
+              	if (text.includes("<files") || text.includes("<files")) {

Contributor

ellipsis-dev bot Oct 17, 2025

Duplicate condition found in sanitizeText: checking text.includes('<files') twice. Remove the redundant check.

Suggested change

      
            	if (text.includes("<files") || text.includes("<files")) {
          
            	if (text.includes("<files")) {

src/shared/formatContentBlockForUi.ts

+              function looksLikeDiff(s: string): boolean {
+              	return (
+              		s.includes("<<<<<<< SEARCH") ||

Contributor

ellipsis-dev bot Oct 17, 2025

Redundant diff marker checks in looksLikeDiff: the same strings are checked twice. Consider removing the duplicate includes for '<<<<<<< SEARCH' and '>>>>>>> REPLACE'.

src/shared/formatContentBlockForUi.ts

+               */
+              function summarizeFilesXml(xmlLike: string): string {
+              	// Support both escaped and unescaped tags
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Contributor

ellipsis-dev bot Oct 17, 2025

The decode function does not actually decode HTML entities. It replaces '<' with '<', etc. Consider using proper replacement (e.g. replacing '<' with '<', '>' with '>', '&' with '&').

Suggested change

      
            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
          
            	const decode = (s: string) => s.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&amp;/g, '&')

github-advanced-security bot found potential problems

View reviewed changes

src/shared/formatContentBlockForUi.ts

+               */
+              function summarizeFilesXml(xmlLike: string): string {
+              	// Support both escaped and unescaped tags
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Check warning

Code scanning / CodeQL

Replacement of a substring with itself Medium

This replaces '<' with itself.

Copilot Autofix

AI 3 months ago

To resolve this, we should properly decode XML/HTML entities in the string: replace < with <, > with >, and & with & (and possibly more, but these cover the standard cases given the context of parsing XML-like content). This can be done by chaining .replace() calls, or (for broader correctness) by using a well-known library such as he for HTML entity decoding. Since only standard entities are mentioned and only the shown code can be changed, the best fix is to replace the identity replacements with proper entity replacements in the decode helper function, on line 126.

Suggested changeset 1

src/shared/formatContentBlockForUi.ts

@@ -123,7 +123,11 @@
              */
             function summarizeFilesXml(xmlLike: string): string {
             	// Support both escaped and unescaped tags
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) =>
+            		s
+            			.replace(/&lt;/g, "<")
+            			.replace(/&gt;/g, ">")
+            			.replace(/&amp;/g, "&");
             	const raw = decode(xmlLike)
             	const fileRegex = /<file>([\s\S]*?)<\/file>/g

Copilot is powered by AI and may make mistakes. Always verify output.

Unable to commit as this autofix suggestion is now outdated

src/shared/formatContentBlockForUi.ts

+               */
+              function summarizeFilesXml(xmlLike: string): string {
+              	// Support both escaped and unescaped tags
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Check warning

Code scanning / CodeQL

Replacement of a substring with itself Medium

This replaces '>' with itself.

Copilot Autofix

AI 3 months ago

To fix this problem, update the decode function (on line 126) to properly decode common XML entities. Specifically, replace all occurrences of < with <, > with >, and & with &. This ensures that both escaped and unescaped XML tags are handled as described in the comment, and the decoder correctly converts XML entity-encoded tags to their character equivalents for further processing. The changes are localized to the decode function, so edit line 126 in the file src/shared/formatContentBlockForUi.ts to swap the current replacement calls for the correct ones.

No additional imports or dependencies are required, and the fix is self-contained.

Suggested changeset 1

src/shared/formatContentBlockForUi.ts

@@ -123,7 +123,10 @@
              */
             function summarizeFilesXml(xmlLike: string): string {
             	// Support both escaped and unescaped tags
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) =>
+            		s.replace(/&lt;/g, "<")
+            		 .replace(/&gt;/g, ">")
+            		 .replace(/&amp;/g, "&")
             	const raw = decode(xmlLike)
             	const fileRegex = /<file>([\s\S]*?)<\/file>/g

Copilot is powered by AI and may make mistakes. Always verify output.

Unable to commit as this autofix suggestion is now outdated

src/shared/formatContentBlockForUi.ts

+               */
+              function summarizeFilesXml(xmlLike: string): string {
+              	// Support both escaped and unescaped tags
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Check warning

Code scanning / CodeQL

Replacement of a substring with itself Medium

This replaces '&' with itself.

Copilot Autofix

AI 3 months ago

The best fix is to replace the decode function with one that properly decodes XML/HTML entities, converting sequences like < to <, > to >, and & to &, as well as any other encountered entities. Since TypeScript/JavaScript does not provide a standard, reliable method for entity decoding, but the well-known library he (html-entities is also popular) can accomplish this robustly. In the context of this file, we only need to decode a string, so importing and using a decoder from a widely used library is the best way to meet the requirement. Therefore, replace the ad-hoc decode function with an implementation using he.decode. This change is limited to src/shared/formatContentBlockForUi.ts. If importing is not possible, you may use a minimum viable manual decoding for the three most common entities (<, >, &).

Suggested changeset 2

src/shared/formatContentBlockForUi.ts

@@ -123,7 +123,8 @@
              */
             function summarizeFilesXml(xmlLike: string): string {
             	// Support both escaped and unescaped tags
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	import { decode as heDecode } from "he";
+            	const decode = (s: string) => heDecode(s);
             	const raw = decode(xmlLike)
             	const fileRegex = /<file>([\s\S]*?)<\/file>/g

src/package.json

Outside changed files

@@ -521,7 +521,8 @@
             		"web-tree-sitter": "^0.25.6",
             		"workerpool": "^9.2.0",
             		"yaml": "^2.8.0",
-            		"zod": "^3.25.61"
+            		"zod": "^3.25.61",
+            		"he": "^1.2.0"
             	},
             	"devDependencies": {
             		"@roo-code/build": "workspace:^",

This fix introduces these dependencies

Package	Version	Security advisories
he (npm)	1.2.0	None

Copilot is powered by AI and may make mistakes. Always verify output.

Unable to commit as this autofix suggestion is now outdated

src/shared/formatContentBlockForUi.ts

+              }
+              function extractPathsFromXml(xml: string): string[] {
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Check warning

Code scanning / CodeQL

Replacement of a substring with itself Medium

This replaces '<' with itself.

Copilot Autofix

AI 3 months ago

The main issue is that the decode function is defined as follows:

const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

This does nothing, because it replaces "<" with "<", and so on. The function should decode XML/HTML entity escapes such as <, >, and & to their respective literal characters, so that subsequent regex matches against tags like <file> work whether the tags are escaped or not.

General repair:
Replace the decode function definition with one that actually decodes <, >, and & to <, >, and &, respectively.

Best way (detailed):

Update the decode function in summarizeFilesXml (line 126) and in extractPathsFromXml (line 161) so they replace escaped XML/HTML entities (<, >, &) with the corresponding literal characters (<, >, &).
Do not change other code or functionality, only fix the decoding.
No new import is needed; simple string replace suffices for these cases.

Suggested changeset 1

src/shared/formatContentBlockForUi.ts

@@ -123,7 +123,8 @@
              */
             function summarizeFilesXml(xmlLike: string): string {
             	// Support both escaped and unescaped tags
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) =>
+            		s.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&")
             	const raw = decode(xmlLike)
             	const fileRegex = /<file>([\s\S]*?)<\/file>/g
@@ -158,7 +159,8 @@
             }
             function extractPathsFromXml(xml: string): string[] {
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) =>
+            		s.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&")
             	const raw = decode(xml)
             	const pathRegex = /<path>([\s\S]*?)<\/path>/g
             	const paths: string[] = []

Copilot is powered by AI and may make mistakes. Always verify output.

Unable to commit as this autofix suggestion is now outdated

src/shared/formatContentBlockForUi.ts

+              }
+              function extractPathsFromXml(xml: string): string[] {
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Check warning

Code scanning / CodeQL

Replacement of a substring with itself Medium

This replaces '>' with itself.

Copilot Autofix

AI 3 months ago

The bug exists in the decode functions (lines 126 and 161) where .replace(/</g, "<").replace(/>/g, ">") replaces characters with themselves — a useless operation. The intention appears to be decoding the XML-escaped forms (<, >, and &), converting XML character entities back to their original characters. The correct replacements should therefore be:

.replace(/</g, "<")
.replace(/>/g, ">")
.replace(/&/g, "&")

To fix this:

On lines 126 and 161, update both decode functions to replace <, >, and & with <, >, and & respectively, instead of attempting to replace literal <, >, and & with themselves.
No imports are required.
Only change the decode lambda body in both places, not other code.

Suggested changeset 1

src/shared/formatContentBlockForUi.ts

@@ -123,7 +123,10 @@
              */
             function summarizeFilesXml(xmlLike: string): string {
             	// Support both escaped and unescaped tags
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) => s
+            		.replace(/&lt;/g, "<")
+            		.replace(/&gt;/g, ">")
+            		.replace(/&amp;/g, "&")
             	const raw = decode(xmlLike)
             	const fileRegex = /<file>([\s\S]*?)<\/file>/g
@@ -158,7 +161,10 @@
             }
             function extractPathsFromXml(xml: string): string[] {
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) => s
+            		.replace(/&lt;/g, "<")
+            		.replace(/&gt;/g, ">")
+            		.replace(/&amp;/g, "&")
             	const raw = decode(xml)
             	const pathRegex = /<path>([\s\S]*?)<\/path>/g
             	const paths: string[] = []

Copilot is powered by AI and may make mistakes. Always verify output.

Unable to commit as this autofix suggestion is now outdated

src/shared/formatContentBlockForUi.ts

+              }
+              function extractPathsFromXml(xml: string): string[] {
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Check warning

Code scanning / CodeQL

Replacement of a substring with itself Medium

This replaces '&' with itself.

Copilot Autofix

AI 3 months ago

To fix the problem, the decode function should replace entity-encoded representations (such as &, <, >) with their corresponding normal characters (&, <, >). Therefore, within the decode function, update the replacement logic to convert both < and > entities, as well as & to &. Specifically, change the replacements so that .replace(/&/g, "&") becomes .replace(/&/g, "&"), and (optionally) update < and > handling for XML entities if necessary. Make changes only in the code region defining the decode function within summarizeFilesXml and extractPathsFromXml (lines 126 and 161).

Suggested changeset 1

src/shared/formatContentBlockForUi.ts

@@ -123,7 +123,7 @@
              */
             function summarizeFilesXml(xmlLike: string): string {
             	// Support both escaped and unescaped tags
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) => s.replace(/&amp;/g, "&").replace(/</g, "<").replace(/>/g, ">")
             	const raw = decode(xmlLike)
             	const fileRegex = /<file>([\s\S]*?)<\/file>/g
@@ -158,7 +158,7 @@
             }
             function extractPathsFromXml(xml: string): string[] {
-            	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")
+            	const decode = (s: string) => s.replace(/&amp;/g, "&").replace(/</g, "<").replace(/>/g, ">")
             	const raw = decode(xml)
             	const pathRegex = /<path>([\s\S]*?)<\/path>/g
             	const paths: string[] = []

Copilot is powered by AI and may make mistakes. Always verify output.

Unable to commit as this autofix suggestion is now outdated

roomote bot commented

View reviewed changes

src/shared/formatContentBlockForUi.ts

+              	if (!text) return ""
+              	// If this looks like a files XML, summarize paths/errors/notices and drop content bodies
+              	if (text.includes("<files") || text.includes("<files")) {

Contributor Author

roomote bot Oct 17, 2025

This condition checks for the unescaped <files tag twice. The second check should be for the escaped version <files to properly detect escaped XML in text content. Without this, escaped XML tags won't be summarized and could bloat the UI messages.

roomote bot commented

View reviewed changes

src/shared/formatContentBlockForUi.ts

Comment on lines +112 to +114

+              		s.includes("<<<<<<< SEARCH") ||
+              		s.includes(">>>>>>> REPLACE") ||
+              		s.includes("<<<<<<< SEARCH") ||

Contributor Author

roomote bot Oct 17, 2025

These lines check for the same unescaped diff markers twice. To properly handle both escaped and unescaped content, the second set should check for escaped versions (e.g., <<<<<<< SEARCH and >>>>>>> REPLACE) to detect diff content that may have been HTML-escaped in the text.

roomote bot commented

View reviewed changes

src/shared/formatContentBlockForUi.ts

+               */
+              function summarizeFilesXml(xmlLike: string): string {
+              	// Support both escaped and unescaped tags
+              	const decode = (s: string) => s.replace(/</g, "<").replace(/>/g, ">").replace(/&/g, "&")

Contributor Author

roomote bot Oct 17, 2025

The entity decoding order is incorrect. When decoding HTML entities, & must be decoded LAST, otherwise double-encoded entities won't decode properly. For example, with the current order, &lt; becomes < (still escaped) instead of < (correct). The correct order should be: .replace(/</g, '<').replace(/>/g, '>').replace(/&/g, '&'). This same issue exists in the decode function at line 161.

hannesrudolph added the Issue/PR - Triage label

hannesrudolph closed this

github-project-automation bot moved this from Triage to Done in Roo Code Roadmap

github-project-automation bot moved this from New to Done in Roo Code Roadmap

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

ellipsis-dev[bot] ellipsis-dev[bot] left review comments

mrubens Awaiting requested review from mrubens mrubens is a code owner

cte Awaiting requested review from cte cte is a code owner

jr Awaiting requested review from jr jr is a code owner

Labels

Issue/PR - Triage size:L UI/UX