Skip to content

Conversation

@hannesrudolph
Copy link
Collaborator

@hannesrudolph hannesrudolph commented Jun 16, 2025

Description

Fixes #4745

This PR improves AWS Bedrock error handling by addressing two critical issues:

  1. Better throttling error detection: The error message "Bedrock is unable to process your request" is now properly identified as a throttling error
  2. Enhanced streaming error handling: Throttling errors in streaming contexts now throw immediately (without yielding error chunks), enabling the retry mechanism to work properly

Changes Made

  • src/api/providers/bedrock.ts:

    • Added "bedrock is unable to process your request" to throttling error patterns (line 1051)
    • Enhanced streaming error handler to differentiate throttling vs. non-throttling errors (lines 560-577)
    • For throttling errors: throw immediately without yielding chunks (enables retry)
    • For non-throttling errors: maintain existing behavior (yield error chunks then throw)
  • src/api/providers/__tests__/bedrock-error-handling.spec.ts:

    • Created comprehensive test suite with 15 test cases
    • Covers all error detection scenarios including the specific "Bedrock is unable to process" message
    • Tests streaming context error handling for both throttling and non-throttling errors
    • Validates error priority and specificity rules

Testing

  • All existing tests pass
  • Added 15 new tests covering error handling scenarios:
    • Throttling error detection from various sources (HTTP status, AWS error types, message patterns)
    • Specific test for "Bedrock is unable to process your request" message
    • Streaming context error handling (throttling vs. non-throttling)
    • Error priority validation
    • Service quota, model readiness, and token limit detection
  • Manual testing completed:
    • Throttling errors now properly trigger retry mechanism
    • "Bedrock is unable to process your request" shows appropriate user guidance

Verification of Acceptance Criteria

  • Criterion 1: AWS throttling errors with message "Bedrock is unable to process your request" are now properly identified as THROTTLING errors, display appropriate guidance, and trigger auto-retry when enabled
  • Criterion 2: Throttling errors during streaming are handled gracefully by throwing immediately (without yielding error chunks), allowing the retry mechanism to work while providing proper user messaging

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Comments added for complex logic
  • No breaking changes
  • Comprehensive test coverage added
  • All tests passing
  • Linting passes
  • TypeScript checks pass

This fix significantly improves the user experience when encountering AWS Bedrock throttling errors by providing proper error identification, user guidance, and automatic retry functionality.


Important

Improves AWS Bedrock error handling by enhancing throttling detection and streaming error management, with comprehensive tests added.

  • Behavior:
    • Improved throttling error detection in bedrock.ts by recognizing "Bedrock is unable to process your request" as a throttling error.
    • Enhanced streaming error handling in bedrock.ts to immediately throw throttling errors, enabling retry mechanisms.
    • Non-throttling errors in streaming contexts yield error chunks before throwing.
  • Error Handling:
    • Updated getErrorType() in bedrock.ts to prioritize error types and include new patterns for throttling and other errors.
    • Added comprehensive error handling tests in bedrock-error-handling.spec.ts with 15 test cases covering various error scenarios.
  • Testing:
    • Added tests for throttling detection from HTTP status, AWS error types, and specific messages.
    • Validated error handling in streaming contexts for both throttling and non-throttling errors.

This description was created by Ellipsis for 88e2eae. You can customize this summary. It will automatically update as commits are pushed.


bedrock.retry.fix.mp4

@hannesrudolph
Copy link
Collaborator Author

I have tested this and it works for me.

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jun 16, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Jun 16, 2025
- Throw enhanced error messages instead of original errors in createMessage() and completePrompt()
- Users now see detailed error codes, request IDs, and troubleshooting guidance during retry countdowns
- Add comprehensive tests to verify enhanced error messages flow to retry system
- Resolves issue #4745 where retry messages only showed basic errors instead of verbose details
Copy link
Collaborator Author

@hannesrudolph hannesrudolph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR! The implementation successfully addresses both issues from #4745:

Throttling detection is improved with the new "Bedrock is unable to process your request" pattern
Streaming error handling now properly differentiates between throttling and non-throttling errors
Test coverage is comprehensive with 15 well-structured test cases
Enhanced error messages provide valuable debugging information

The code quality is solid and the approach is sound. I've left a few comments for consideration:

  1. Pattern specificity for throttling detection
  2. Error property validation for robustness
  3. Verification of streaming behavior with retry mechanism
  4. Additional edge case test coverage
  5. Configurability of error message verbosity

These are mostly suggestions for potential improvements rather than blocking issues. The PR effectively solves the reported problems and improves the user experience when encountering AWS Bedrock throttling errors.

Great work on the comprehensive test suite - it really helps validate the implementation! 🎯

@hannesrudolph hannesrudolph moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jun 16, 2025
@hannesrudolph hannesrudolph added PR - Needs Preliminary Review and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Jun 16, 2025
- Made 'unable to process your request' pattern more specific by removing generic version
- Added property validation before accessing error properties to prevent undefined access
- Added comment explaining streaming retry behavior for throttling errors
- Added comprehensive edge case tests (concurrent errors, mixed scenarios, property validation)
- Added configuration option awsBedrockVerboseErrors to control error message verbosity
- Defaults to verbose errors for backward compatibility
@hannesrudolph
Copy link
Collaborator Author

✅ All review comments have been addressed in commit 72128ff:

  1. Pattern specificity: Removed the generic "unable to process your request" pattern, keeping only the Bedrock-specific version
  2. Property validation: Added type guards to validate error properties exist before accessing them
  3. Streaming behavior verification: Added clarifying comment about retry mechanism expectations
  4. Edge case tests: Added comprehensive tests for concurrent errors, mixed scenarios, and property validation
  5. Verbose error configurability: Added new awsBedrockVerboseErrors setting (defaults to true for backward compatibility)

All tests are passing and the code has been linted successfully.

@daniel-lxs
Copy link
Member

I noticed that the awsBedrockVerboseErrors setting is being added but:

  1. No UI exists to configure it - users can't actually change this setting
  2. It defaults to true - meaning verbose errors are always enabled
  3. This could expose sensitive information - verbose errors include request IDs, extended request IDs, CloudFront IDs, and detailed error metadata

Since the main goal of this PR is to improve throttling error detection and streaming error handling, I'd suggest removing the verbose error feature entirely for now. This would:

  • Keep the PR focused on its core objective (better error detection)
  • Avoid introducing an incomplete feature that users can't control
  • Prevent potential security concerns from exposing detailed error information by default
  • Simplify the implementation

The improved error type detection and streaming error handling are valuable on their own. The verbose error feature could be added in a separate PR with proper UI controls if needed later.

What do you think about removing the verbose error code and just keeping the improved error detection logic?

Copy link
Collaborator Author

@hannesrudolph hannesrudolph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the verbose error logging as discussed. The error messages are now cleaner.

Regarding the other points:

  • The type guards for status and $metadata were already in place.
  • I've kept the "unable to process your request" pattern for now as it's a specific message from Bedrock, but we can monitor for false positives.
  • I agree that tests for concurrent/mixed errors would be valuable and can be added in a follow-up PR to keep this one focused.

@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Needs Prelim Review] in Roo Code Roadmap Jun 22, 2025
Copy link
Member

@daniel-lxs daniel-lxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left 2 minor suggestions for this PR.

Basically just a bit of cleanup.

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Jun 23, 2025
@delve-auditor
Copy link

delve-auditor bot commented Jun 24, 2025

We have finished reviewing your PR. We have found no vulnerabilities.

Reply to this PR with @delve-auditor followed by a description of what change you want and we'll auto-submit a change to this PR to implement it.

@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Needs Review] in Roo Code Roadmap Jun 24, 2025
Copy link
Member

@daniel-lxs daniel-lxs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed the requested comments

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 24, 2025
@mrubens mrubens merged commit ee751af into main Jun 24, 2025
17 checks passed
@mrubens mrubens deleted the fix/issue-4745-bedrock-error-handling branch June 24, 2025 15:37
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jun 24, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Jun 24, 2025
cte pushed a commit that referenced this pull request Jun 24, 2025
Alorse pushed a commit to Alorse/Roo-Code that referenced this pull request Jun 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working lgtm This PR has been approved by a maintainer PR - Needs Review size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

AWS Bedrock: "Unknown Error" not triggering retry mechanism and poor error identification for throttling

4 participants