@roomote roomote bot commented Jul 30, 2025

Fixes #6348

Problem

When the browser tool is used continuously in a single chat session, screenshots accumulate in the conversation history. AWS Bedrock has a 20-image limit, so the 21st screenshot triggers a "too many images" error that halts the workflow and forces the user to start a new chat.

Solution

This PR implements @daniel-lxs's recommendation to only send the most recent screenshot instead of accumulating all screenshots in conversation history.

Changes Made:

  • Added addBrowserActionToApiHistory() method to Task class that removes previous browser screenshots before adding new ones
  • Modified browserActionTool to use the new method instead of direct addToApiConversationHistory
  • Added resetBrowserScreenshotTracking() method to clear tracking when browser is closed
  • Added proper ESLint-compliant block scoping for variable declarations

How it works:

  1. When a browser action produces a screenshot, the new method removes images from the previous browser action message in the conversation history
  2. Only the most recent screenshot is kept in the conversation history sent to the AI model
  3. Screenshots remain visible in the UI for user reference
  4. When the browser is closed, screenshot tracking is reset
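
The steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `BrowserScreenshotHistory`, `ApiMessage`, `ContentBlock`, `addBrowserActionResult`, and `imageCount` are hypothetical names, and the types are simplified stand-ins for the real message types. Step 3 (screenshots staying visible in the UI) is not modeled; the sketch assumes the UI keeps its own copy of the messages.

```typescript
// Simplified, hypothetical stand-ins for the real message types.
type ContentBlock = { type: "text"; text: string } | { type: "image"; data: string }

interface ApiMessage {
    role: "user" | "assistant"
    content: ContentBlock[]
    ts: number
}

class BrowserScreenshotHistory {
    readonly messages: ApiMessage[] = []
    // Timestamp of the last message that carried a screenshot, if any.
    private lastScreenshotTs: number | undefined

    addBrowserActionResult(content: ContentBlock[], ts: number): void {
        // Step 1: strip images from the previously tracked browser action message.
        if (this.lastScreenshotTs !== undefined) {
            const previous = this.messages.find((m) => m.ts === this.lastScreenshotTs)
            if (previous) {
                previous.content = previous.content.filter((block) => block.type !== "image")
            }
        }
        // Step 2: append the new result, so only its screenshot remains in history.
        this.messages.push({ role: "user", content, ts })
        if (content.some((block) => block.type === "image")) {
            this.lastScreenshotTs = ts
        }
    }

    // Step 4: called when the browser is closed.
    resetTracking(): void {
        this.lastScreenshotTs = undefined
    }

    // Total number of image blocks currently held in the history.
    imageCount(): number {
        return this.messages.reduce(
            (total, m) => total + m.content.filter((block) => block.type === "image").length,
            0,
        )
    }
}
```

Under these assumptions, adding 25 screenshot results in a row leaves 25 messages in the history but only one image block, which stays well below a 20-image limit.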

Testing:

  • All existing tests pass
  • TypeScript compilation successful
  • ESLint checks pass

This change prevents hitting provider image limits while maintaining the browser tool's functionality and user experience.


Important

Introduces methods to manage browser screenshots in Task.ts and updates browserActionTool.ts to prevent exceeding AWS Bedrock's 20-image limit.

  • Behavior:
    • addBrowserActionToApiHistory() in Task.ts removes previous screenshots before adding new ones to avoid AWS Bedrock's 20-image limit.
    • resetBrowserScreenshotTracking() in Task.ts clears screenshot tracking when the browser is closed.
    • browserActionTool() in browserActionTool.ts uses addBrowserActionToApiHistory() to manage screenshot history.
  • Misc:
    • Adds block scoping for variable declarations to comply with ESLint.

This description was created by Ellipsis for 008e833.

…image limit

- Add addBrowserActionToApiHistory method to Task class that removes previous browser screenshots before adding new ones
- Modify browserActionTool to use new method instead of direct addToApiConversationHistory
- Add resetBrowserScreenshotTracking method to clear tracking when browser is closed
- Fixes issue #6348 where continuous browser tool usage would accumulate screenshots and hit AWS Bedrock's 20-image limit

This implements @daniel-lxs's recommendation to only send the most recent screenshot instead of accumulating all screenshots in conversation history.
@roomote roomote bot requested review from cte, jr and mrubens as code owners July 30, 2025 22:26
@dosubot dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.) and bug (Something isn't working) labels Jul 30, 2025

@roomote roomote bot left a comment

Thank you for implementing this solution to avoid hitting AWS Bedrock's 20-image limit! The approach of keeping only the most recent screenshot is sound and addresses the core issue. However, I've identified several areas that need attention before this can be merged safely.

* Add a browser action result to conversation history, removing previous browser screenshots
* to prevent hitting provider image limits (e.g., AWS Bedrock's 20-image limit).
*/
async addBrowserActionToApiHistory(

Missing test coverage for this critical functionality. Could we add unit tests to verify that:

  1. Previous images are properly removed from conversation history
  2. The tracking ID is correctly updated
  3. Edge cases like malformed content are handled gracefully

This is essential since this prevents API errors that halt workflows.

// Remove previous browser screenshot from conversation history
if (this.lastBrowserScreenshotMessageId) {
    // Find and remove images from the last browser action message
    for (let i = this.apiConversationHistory.length - 1; i >= 0; i--) {

Potential race condition: If multiple browser actions happen rapidly, this search could find the wrong message or become inconsistent. Consider adding a more robust identification mechanism or ensuring browser actions are properly serialized.
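
One possible way to serialize browser actions is a small promise chain, sketched below. This is only an illustration under stated assumptions: `SerialQueue`, `sleep`, and `demo` are hypothetical names and are not part of this PR or the codebase. The idea is that each history update waits for the previous one to finish, so two rapid browser actions cannot interleave.

```typescript
// Hypothetical helper: runs async tasks one at a time, in the order they were queued.
class SerialQueue {
    private tail: Promise<void> = Promise.resolve()

    run<T>(task: () => Promise<T>): Promise<T> {
        // Start the task only after everything queued before it has settled.
        const result = this.tail.then(task)
        // Swallow the outcome on the internal chain so one failure does not block later tasks.
        this.tail = result.then(
            () => undefined,
            () => undefined,
        )
        return result
    }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Queues a slow action followed by a fast one and records the order of events.
async function demo(): Promise<string[]> {
    const queue = new SerialQueue()
    const log: string[] = []
    const action = (name: string, ms: number) => async () => {
        log.push(`${name}:start`)
        await sleep(ms)
        log.push(`${name}:end`)
    }
    await Promise.all([queue.run(action("first", 20)), queue.run(action("second", 1))])
    return log
}
```

In this sketch the second action does not start until the first has ended, even though it is much faster. Whether the real call sites can actually overlap would need to be confirmed before adding this kind of queue.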

// Remove previous browser screenshot from conversation history
if (this.lastBrowserScreenshotMessageId) {
    // Find and remove images from the last browser action message
    for (let i = this.apiConversationHistory.length - 1; i >= 0; i--) {

Performance concern: This searches backwards through the entire conversation history on every browser action. For long conversations, this could become slow. Could we optimize this by storing a direct reference to the message instead of searching, using a more efficient lookup mechanism, or limiting the search scope?
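
As a rough illustration of the "direct reference" option, the sketch below keeps a pointer to the last message that carried a screenshot, so no search is needed. `ScreenshotTracker`, `Message`, `Block`, and `track` are hypothetical names with simplified types. It assumes the history array holds the same object instances for the lifetime of the task; if messages are cloned or reloaded from disk, the reference would go stale and an ID-based lookup would still be needed.

```typescript
// Simplified, hypothetical message types.
type Block = { type: "text"; text: string } | { type: "image"; data: string }

interface Message {
    role: "user" | "assistant"
    content: Block[]
}

class ScreenshotTracker {
    // Direct reference to the last message that still holds a screenshot.
    private lastScreenshotMessage: Message | undefined

    // O(1): strips images from the remembered message, then remembers the new one.
    track(next: Message): void {
        if (this.lastScreenshotMessage) {
            this.lastScreenshotMessage.content = this.lastScreenshotMessage.content.filter(
                (block) => block.type !== "image",
            )
        }
        this.lastScreenshotMessage = next.content.some((block) => block.type === "image") ? next : undefined
    }

    reset(): void {
        this.lastScreenshotMessage = undefined
    }
}
```

The cost per browser action no longer depends on the length of the conversation, at the price of relying on object identity.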

)
if (hasToolResult) {
    // Remove image blocks from this message, keep only text blocks
    message.content = message.content.filter((block) => block.type === "text")

Missing error handling: What happens if message.content is malformed or doesn't contain the expected structure? Consider adding defensive checks to prevent runtime errors.
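
A defensive version could look something like the sketch below. `isImageBlock` and `stripImageBlocks` are hypothetical helpers, not existing functions in the codebase. Note one deliberate difference from the snippet above: it removes only image blocks and leaves everything else in place, rather than keeping only text blocks, so unexpected block types are not silently dropped.

```typescript
// True only for object values whose `type` field is exactly "image".
function isImageBlock(block: unknown): boolean {
    return typeof block === "object" && block !== null && (block as { type?: unknown }).type === "image"
}

// Returns content without image blocks. Anything that is not an array
// (a plain string, undefined, etc.) is returned unchanged instead of throwing.
function stripImageBlocks(content: unknown): unknown {
    if (!Array.isArray(content)) {
        return content
    }
    return content.filter((block) => !isImageBlock(block))
}
```

With this shape, a string-only message, a missing `content` field, or an array containing `null` entries passes through without a runtime error.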

)
if (hasToolResult) {
    // Remove image blocks from this message, keep only text blocks
    message.content = message.content.filter((block) => block.type === "text")

Consider adding debug logging here to help troubleshoot issues. Something like console.debug would be helpful for debugging when the feature doesn't work as expected.

// Track this message if it contains images
const hasImages = Array.isArray(toolResult) && toolResult.some((block) => block.type === "image")
if (hasImages) {
    this.lastBrowserScreenshotMessageId = messageWithTs.ts.toString()

Could this be optimized? Storing the timestamp directly instead of converting to string might be more efficient for comparisons and memory usage.

* Add a browser action result to conversation history, removing previous browser screenshots
* to prevent hitting provider image limits (e.g., AWS Bedrock's 20-image limit).
*/
async addBrowserActionToApiHistory(

Consider adding more detailed JSDoc documentation explaining the AWS Bedrock 20-image limitation, how this method differs from addToApiConversationHistory, and when this should be used vs the standard method.

if (message.role === "user" && Array.isArray(message.content)) {
    // Check if this message contains the last browser screenshot
    const hasToolResult = message.content.some(
        (block) => block.type === "text" && block.text.includes("[browser_action Result]"),

Consider extracting this magic string as a constant to improve maintainability and reduce the risk of typos.
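
For example, the marker could live in a single named constant that both the writer and the detection check share. The names `BROWSER_ACTION_RESULT_MARKER` and `isBrowserActionResult` below are hypothetical, and the block type is a simplified stand-in; only the string literal itself comes from the snippet above.

```typescript
// Single source of truth for the marker text (constant name is hypothetical).
const BROWSER_ACTION_RESULT_MARKER = "[browser_action Result]"

// Simplified, hypothetical block type.
type MessageBlock = { type: "text"; text: string } | { type: "image"; data: string }

// True if any text block in the message carries the browser action marker.
function isBrowserActionResult(content: MessageBlock[]): boolean {
    return content.some((block) => block.type === "text" && block.text.includes(BROWSER_ACTION_RESULT_MARKER))
}
```

A typo in the marker then becomes a single-place fix instead of a silent mismatch between two string literals.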

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Jul 30, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jul 31, 2025
@hannesrudolph hannesrudolph added the PR - Needs Preliminary Review label and removed the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Jul 31, 2025
@daniel-lxs
Copy link
Member

Closing, see #6348 (comment)

@daniel-lxs daniel-lxs closed this Jul 31, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jul 31, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Prelim Review] to Done in Roo Code Roadmap Jul 31, 2025

Labels

bug (Something isn't working), PR - Needs Preliminary Review, size:M (This PR changes 30-99 lines, ignoring generated files.)

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Unhandled "too many images" error from AWS Bedrock when using browser tool

4 participants