Description
AWS Bedrock has a prompt caching feature, similar to Anthropic's, that has the potential to significantly reduce the cost of using Roo-Code. Is it possible to use this feature today? If not, is it something that could be added? Here is some quick research / background (using Perplexity):
- Enable prompt caching when making API calls to Amazon Bedrock. This can be done by adding the `explicitPromptCaching='enabled'` parameter to your `invoke_model` request [2].
- Structure your prompts to take advantage of caching:
  - Ensure your prompts meet the minimum token requirement for creating cache checkpoints. For example, the Anthropic Claude 3.5 Sonnet v2 model requires at least 1,024 tokens [2].
  - Use the `cache_control` property in your request body to specify which parts of the prompt should be cached [2].
- Here's an example of how to structure your API call [2]:
```python
import json

import boto3

# Bedrock runtime client; use a region where the model supports caching
bedrock_client = boto3.client("bedrock-runtime", region_name="us-west-2")

modelId = "anthropic.claude-3-5-sonnet-20241022-v2:0"  # Claude 3.5 Sonnet v2
accept = "application/json"
contentType = "application/json"

response = bedrock_client.invoke_model(
    # invoke_model expects the body as a JSON string, not a dict
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "system": "Reply concisely",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Your main prompt here"
                    },
                    {
                        "type": "text",
                        "text": "Additional context to meet minimum token requirement",
                        # Everything up to this block becomes a cache checkpoint
                        "cache_control": {"type": "ephemeral"}
                    }
                ]
            }
        ],
        "max_tokens": 2048,
        "temperature": 0.5
    }),
    modelId=modelId,
    accept=accept,
    contentType=contentType,
    explicitPromptCaching='enabled'
)
```
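To verify that caching is actually taking effect, you can inspect the usage block of the response. A minimal sketch, assuming the Anthropic Messages API usage fields (`cache_creation_input_tokens` and `cache_read_input_tokens`) pass through Bedrock unchanged:

```python
import json

# invoke_model returns a streaming body; parse it as JSON
result = json.loads(response["body"].read())
usage = result.get("usage", {})

# Tokens written to a new cache checkpoint on this call
print("cache written:", usage.get("cache_creation_input_tokens", 0))
# Tokens served from an existing checkpoint, i.e. a cache hit
print("cache read:", usage.get("cache_read_input_tokens", 0))
```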
- Be aware that cache checkpoints have a five-minute Time To Live (TTL), which resets with each successful cache hit [2].
- For more complex implementations, consider Amazon Bedrock features such as Agents, which handle prompt caching automatically when it is enabled [2].
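Note that the same user guide [2] also covers caching through Bedrock's Converse API, where the request is marked up with `cachePoint` content blocks rather than `cache_control`. A minimal sketch of that variant, assuming the `cachePoint` block shape and the cache read/write usage fields described in the caching guide:

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-west-2")

response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    # A cachePoint block marks everything before it as cacheable
    system=[
        {"text": "Reply concisely"},
        {"cachePoint": {"type": "default"}},
    ],
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Your main prompt here"},
                {"cachePoint": {"type": "default"}},
            ],
        }
    ],
    inferenceConfig={"maxTokens": 2048, "temperature": 0.5},
)

# Converse reports cache activity directly in its usage block
usage = response["usage"]
print("cache write:", usage.get("cacheWriteInputTokens", 0))
print("cache read:", usage.get("cacheReadInputTokens", 0))
```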
By implementing these steps, Roo-Code can benefit from reduced costs (up to 90%) and improved latency (up to 85%) when using Bedrock models with prompt caching [1].
Citations:
[1] https://aws.amazon.com/bedrock/prompt-caching/
[2] https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
[3] https://github.com/langchain-ai/langchain/issues/25610
[4] https://www.reddit.com/r/ClaudeAI/comments/1esto2i/anthropic_just_released_prompt_caching_making/
[5] https://api.python.langchain.com/en/latest/aws/llms/langchain_aws.llms.bedrock.Bedrock.html
[6] https://opentools.ai/news/aws-supercharges-bedrock-llm-service-with-prompt-routing-and-caching
[7] https://www.youtube.com/watch?v=2mNXSv7cTLI
[8] https://news.ycombinator.com/item?id=41284639
Originally posted by @mdlmarkham in #482