-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Jbbrown/bedrock caching #2071
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jbbrown/bedrock caching #2071
Conversation
* main: Require exactly node v20.18.1 (RooCodeInc#2065) Handle null [x]ModelInfo types, recover from settings schema parse errors (RooCodeInc#2064) feat: prioritize "Add to Context" and add line number tracking (RooCodeInc#2063) Extract code for read_file from Cline (RooCodeInc#2059)
|
|
This pull request is quite large, with 20 files changed and over 4500 lines added. It includes a variety of changes such as documentation updates, caching strategy implementations, and test enhancements. To make the review process more manageable, could you consider splitting this pull request into smaller, more focused ones? For example:
This will help reviewers focus on specific areas and ensure a more thorough review process. Thank you! |
…raction from Arns. Change example of ARN use from the foundational model ARN to an inference profile ARN which is what is needed.
|
@Premshay do you have any time to look at this? I know you were looking at Bedrock caching previously. Thank you! |
| awsUsePromptCache?: boolean | ||
| setAwsUsePromptCache: (value: boolean) => void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add this to extension state, or can it just be a field in the apioptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I guess I'm confused - can you explain what the change is here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I don't know what "setAwsUsePRomptCache" is either, it seemed to be a pattern for input checkboxes. awsUsePromptCache is leveraged as a user decision if they want or don't want to use prompt caching when it's available for that model. It seemed reasonable as a user option, but I also don't have a use case specifically in mind for not using caching. So from a product design standpoint would be ok with an opinionated answer that it's simply driven off model support. Let me know what the Roo-Code team decides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after review - setAwsUsePromptCache is needed to maintain the extension state in-between views, assuming we want to keep a user choice on prompt cache enable/disabled
See: webview-ui/src/context/ExtensionStateContext.tsx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Atlogit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks very extensive and thought out. And right on time, since caching has only just went to general availibilty. Looking forward to trying it out.
|
@Smartsheet-JB-Brown well done. This is greatly appreciated. |
src/api/providers/bedrock.ts
Outdated
| * | ||
| *************************************************************************************/ | ||
|
|
||
| private static readonly REGION_INFO: Record< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this duplicative of the AWS_REGIONS constant? Could we align on adding any necessary info to that and using it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that's in the webview-UI - could we have a shared constant somewhere though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With some work we could. The one in bedrock.ts is more extensive than the UI list because it has AZs with Bedrock support and we'd need to add properties on a combined constant that would allow webview-UI to filter to the right list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry.. not AZs, but alternative region prefixes. I'll take a shot at combining though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've consolidated AWS Region constants into src/shared/aws_regions.ts and set the UI Region drop down options to be a filtered, sorted list of the full data set for bedrock ARN abbreviations.
review please @mrubens
| /** | ||
| * Check if a token count meets the minimum threshold for caching | ||
| */ | ||
| protected meetsMinTokenThreshold(tokenCount: number): boolean { | ||
| const minTokens = this.config.modelInfo.minTokensPerCachePoint | ||
| if (!minTokens) { | ||
| return false | ||
| } | ||
| return tokenCount >= minTokens | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we only using this for the system prompt? Seems unlikely that the Roo system prompt would ever dip below the minimum, but maybe I lack imagination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep you are right about the system prompt. I think I was using this method originally in other places but ended up removing those calls. Who knows about future model minimums though.
| supportsImages: true, | ||
| supportsComputerUse: false, | ||
| supportsPromptCache: false, | ||
| supportsPromptCache: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that some of these models are still in Preview, at least according to https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-models. Will this break anything for people who are not in the preview?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bedrock ignores the cachePoint nodes that are invalid. There's an extension state option to use or not use caching as well, so there shouldn't be any blocking issue even that wasn't the case. right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there really is an extension state option for this - I don't see it in the UI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh I see!
|
That's a good question. It may lead to validation exception. I don't know
for certain since I haven't had a chance to try it myself.
Better to leave prompt caching true only to 3.7 and 3.5 haiku for now.
Better yet to have prompt caching as a toggle, and then this becomes a more
simple matter.
…On Wed, Apr 2, 2025, 20:09 Matt Rubens ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/shared/api.ts
<#2071 (comment)>:
> @@ -113,11 +113,14 @@ export const bedrockModels = {
contextWindow: 300_000,
supportsImages: true,
supportsComputerUse: false,
- supportsPromptCache: false,
+ supportsPromptCache: true,
I noticed that some of these models are still in Preview, at least
according to
https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-models.
Will this break anything for people who are not in the preview?
—
Reply to this email directly, view it on GitHub
<#2071 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AGWMILCATYQFZAHMGIOF44D2XQKWBAVCNFSM6AAAAAB2AQ2SG2VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDOMZXGAYTGMBWGY>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
@Atlogit - Re: Content Blocks - Maybe in a few weeks. In my personal Roo-Code use they are not as common, and I could only get this far during private preview use of prompt caching and I have a bit of OOO time starting in 2 days, so I wanted to get this out, handle any issues that might arise, and then determine the next most valuable thing. |
mrubens
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀

-- Don't merge until AWS SDK is updated to have cacheReadTokens and cacheWriteTokens in the Bedrock response's usage object. At that point the package.json change can be reverted to a normal SDK reference.
Add AWS Bedrock Cache Strategy Implementation
Overview
This PR implements a cache strategy system for AWS Bedrock API requests, optimizing token usage and improving response times by strategically placing cache points throughout conversations.
AWS Prompt Caching is currently a private preview feature, but anticipated to be public soon. My org has had the opportunity to use it with this branch of Roo-Code for a couple weeks with good results.
Detailed Bedrock Cache Strategy Documentation.
Key Features
MultiPointStrategyfor optimal cache point placement in conversationsImplementation Details
The cache strategy system is designed to optimize the placement of cache points in AWS Bedrock API requests. Cache points allow the service to reuse previously processed parts of the prompt, reducing token usage and improving response times.
Cache Strategy Components
Placement Logic
Documentation
For detailed information about the implementation, including class relationships, sequence diagrams, and examples, please refer to the Bedrock Cache Strategy Documentation.
Testing
Performance Impact
Initial testing shows:
Next Steps
Important
Implements a caching strategy for AWS Bedrock API requests, optimizing token usage and response times with a new
MultiPointStrategyand updates toAwsBedrockHandler.MultiPointStrategyinmulti-point-strategy.tsfor optimal cache point placement in AWS Bedrock API requests.AwsBedrockHandlerinbedrock.ts.bedrockModelsinapi.tsto include caching capabilities and pricing details.minTokensPerCachePoint,maxCachePoints, andcachableFieldsto model configurations.MultiPointStrategyincache-strategy.test.ts.AwsBedrockHandlerinbedrock.test.tsandbedrock-custom-arn.test.ts.bedrock-cache-strategy-documentation.mddetailing the cache strategy implementation.package.jsonto use a local AWS SDK package for testing.ApiOptions.tsxfor enabling prompt caching.This description was created by
for 34220ee. It will automatically update as commits are pushed.