Description
Problem Statement
Amazon Bedrock supports prompt caching to reduce inference latency and input token costs. However, manually managing cache points in messages is error-prone and requires users to understand the internal message structure.
Currently, users must manually add cachePoint objects to their messages:
```python
messages = [
    {
        "role": "user",
        "content": [
            {"text": "Some really long text!" * 1000},
            {"cachePoint": {"type": "default"}},  # Manual cache point
        ],
    }
]
```
This approach has several drawbacks:
- Users need to understand when and where to place cache points
- Cache points remain in message history, which may not be desired
- The API doesn't provide built-in automatic cache point management
Proposed Solution
Implement a PromptCachingHook class that automatically:
- Adds a cache point to the last message before model invocation (via BeforeModelCallEvent)
- Removes the cache point after model invocation completes (via AfterModelCallEvent); since Bedrock limits cache points to a maximum of 4 per request, removing the cache point after each invocation avoids exceeding this limit and keeps the message history clean
This follows the "Simplified Cache Management" approach recommended in AWS Bedrock documentation.
The hook would work transparently with minimal user effort:
```python
from strands import Agent
from strands.models import BedrockModel
from strands.hooks.bedrock import PromptCachingHook

model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")
agent = Agent(
    model=model,
    hooks=[PromptCachingHook()],  # Simple one-line activation
)

agent("Hello, how are you?")  # Cache point automatically managed
```
Use Case
Users working with large documents that need to be cached can simply add the hook and let it manage cache points automatically.
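For example (hypothetical file name and prompts; the hook's import path is the one proposed above):
```python
from strands import Agent
from strands.models import BedrockModel
from strands.hooks.bedrock import PromptCachingHook  # proposed module path

with open("contract.txt") as f:  # hypothetical large document
    document = f.read()

agent = Agent(
    model=BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0"),
    hooks=[PromptCachingHook()],
)

# First call caches the long prefix; the hook places the cache point.
agent(f"Here is the contract:\n{document}\n\nSummarize the termination clauses.")
# Follow-up calls over the same history can reuse the cached prefix.
agent("Which of those clauses mention penalties?")
```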
Alternative Solutions
No response
Additional Context
https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-simplified