-
Notifications
You must be signed in to change notification settings - Fork 2.6k
fix: prevent browser tool image accumulation to avoid AWS Bedrock 20-image limit #6452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -214,6 +214,7 @@ export class Task extends EventEmitter<ClineEvents> { | |
|
|
||
| // Computer User | ||
| browserSession: BrowserSession | ||
| private lastBrowserScreenshotMessageId?: string // Track the last message with browser screenshot | ||
|
|
||
| // Editing | ||
| diffViewProvider: DiffViewProvider | ||
|
|
@@ -508,6 +509,57 @@ export class Task extends EventEmitter<ClineEvents> { | |
| await this.saveApiConversationHistory() | ||
| } | ||
|
|
||
| /** | ||
| * Add a browser action result to conversation history, removing previous browser screenshots | ||
| * to prevent hitting provider image limits (e.g., AWS Bedrock's 20-image limit). | ||
| */ | ||
| async addBrowserActionToApiHistory( | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consider adding more detailed JSDoc documentation explaining the AWS Bedrock 20-image limitation, how this method differs from addToApiConversationHistory, and when this should be used vs the standard method. |
||
| toolResult: string | Array<Anthropic.TextBlockParam | Anthropic.ImageBlockParam>, | ||
| ) { | ||
| // Remove previous browser screenshot from conversation history | ||
| if (this.lastBrowserScreenshotMessageId) { | ||
| // Find and remove images from the last browser action message | ||
| for (let i = this.apiConversationHistory.length - 1; i >= 0; i--) { | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Potential race condition: If multiple browser actions happen rapidly, this search could find the wrong message or become inconsistent. Consider adding a more robust identification mechanism or ensuring browser actions are properly serialized.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Performance concern: This searches backwards through the entire conversation history on every browser action. For long conversations, this could become slow. Could we optimize this by storing a direct reference to the message instead of searching, using a more efficient lookup mechanism, or limiting the search scope? |
||
| const message = this.apiConversationHistory[i] | ||
| if (message.role === "user" && Array.isArray(message.content)) { | ||
| // Check if this message contains the last browser screenshot | ||
| const hasToolResult = message.content.some( | ||
| (block) => block.type === "text" && block.text.includes("[browser_action Result]"), | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consider extracting this magic string as a constant to improve maintainability and reduce the risk of typos. |
||
| ) | ||
| if (hasToolResult) { | ||
| // Remove image blocks from this message, keep only text blocks | ||
| message.content = message.content.filter((block) => block.type === "text") | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Missing error handling: What happens if message.content is malformed or doesn't contain the expected structure? Consider adding defensive checks to prevent runtime errors.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consider adding debug logging here to help troubleshoot issues. Something like console.debug would be helpful for debugging when the feature doesn't work as expected. |
||
| break | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Add the new browser action result | ||
| const content = Array.isArray(toolResult) ? toolResult : [{ type: "text" as const, text: toolResult }] | ||
| const messageWithTs = { | ||
| role: "user" as const, | ||
| content, | ||
| ts: Date.now(), | ||
| } | ||
|
|
||
| // Track this message if it contains images | ||
| const hasImages = Array.isArray(toolResult) && toolResult.some((block) => block.type === "image") | ||
| if (hasImages) { | ||
| this.lastBrowserScreenshotMessageId = messageWithTs.ts.toString() | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could this be optimized? Storing the timestamp directly instead of converting to string might be more efficient for comparisons and memory usage. |
||
| } | ||
|
|
||
| this.apiConversationHistory.push(messageWithTs) | ||
| await this.saveApiConversationHistory() | ||
| } | ||
|
|
||
| /** | ||
| * Reset browser screenshot tracking when browser is closed | ||
| */ | ||
| resetBrowserScreenshotTracking() { | ||
| this.lastBrowserScreenshotMessageId = undefined | ||
| } | ||
|
|
||
| async overwriteApiConversationHistory(newHistory: ApiMessage[]) { | ||
| this.apiConversationHistory = newHistory | ||
| await this.saveApiConversationHistory() | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing test coverage for this critical functionality. Could we add unit tests to verify that:
This is essential since this prevents API errors that halt workflows.