feat(anthropic): add support for prompt caching #2669
Conversation
Implements Anthropic's prompt caching feature to improve token efficiency.

- Adds cache control support in AnthropicApi and AnthropicChatModel
- Creates AnthropicCacheType enum with EPHEMERAL cache type
- Extends AbstractMessage and UserMessage to support cache parameters
- Updates Usage tracking to include cache-related token metrics
- Adds an integration test to verify prompt caching functionality

This implementation follows Anthropic's prompt caching API (beta-2024-07-31), which allows more efficient token usage by caching frequently used prompts.
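For context, the request shape this feature targets can be sketched as follows. This is an illustrative mock of the JSON body Anthropic's prompt-caching beta expects (sent with the `anthropic-beta: prompt-caching-2024-07-31` header), built with plain `java.util` maps; it is not Spring AI's internal representation, and the model name is only a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CacheRequestSketch {
    public static void main(String[] args) {
        // A system content block carrying a cache_control marker; per the
        // beta API, marked blocks become cache breakpoints on Anthropic's side.
        Map<String, Object> systemBlock = new LinkedHashMap<>();
        systemBlock.put("type", "text");
        systemBlock.put("text", "You are a helpful assistant. <large reusable context>");
        systemBlock.put("cache_control", Map.of("type", "ephemeral"));

        // Assemble the overall request body (field names follow the public
        // Anthropic Messages API; the model id is a placeholder).
        Map<String, Object> request = new LinkedHashMap<>();
        request.put("model", "claude-3-5-sonnet-20240620");
        request.put("max_tokens", 1024);
        request.put("system", List.of(systemBlock));
        request.put("messages", List.of(
                Map.of("role", "user", "content", "Summarize the context.")));

        System.out.println(request.get("system"));
    }
}
```

On a cache hit, the response `usage` then reports cache-specific token counts, which is what the Usage tracking change above surfaces.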
Hello, it looks right to me. Thank you for updating the branch.
@tzolov @Claudio-code Can we merge this?
This implementation has a bug: the cache marker needs to be added only to the last message/tool in the list. That change is needed before this can be merged. See langchain4j/langchain4j#3337.
Without this fix, users will hit the cache limit faster than they should, which causes an error from the Anthropic API.
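The fix described above can be sketched as a small post-processing step: since Anthropic allows at most four cache breakpoints per request, only the last block in a sequence should carry `cache_control`. The helper below is hypothetical (plain maps standing in for content blocks, not Spring AI's actual classes), but it shows the intended behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LastBlockCache {

    // Return a copy of the blocks where ONLY the last one carries
    // cache_control, so a long prompt consumes a single cache breakpoint.
    static List<Map<String, Object>> applyCacheToLast(List<Map<String, Object>> blocks) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) {
            Map<String, Object> copy = new LinkedHashMap<>(blocks.get(i));
            if (i == blocks.size() - 1) {
                copy.put("cache_control", Map.of("type", "ephemeral"));
            } else {
                copy.remove("cache_control"); // strip markers from earlier blocks
            }
            result.add(copy);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> blocks = List.of(
                new LinkedHashMap<>(Map.of("type", "text", "text", "part 1")),
                new LinkedHashMap<>(Map.of("type", "text", "text", "part 2")),
                new LinkedHashMap<>(Map.of("type", "text", "text", "part 3")));

        List<Map<String, Object>> cached = applyCacheToLast(blocks);
        long marked = cached.stream()
                .filter(b -> b.containsKey("cache_control"))
                .count();
        System.out.println("marked=" + marked
                + " last=" + cached.get(2).containsKey("cache_control"));
    }
}
```

Marking every block instead of just the last one is what exhausts the four-breakpoint budget and triggers the API error mentioned above.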
@Claudio-code Thanks for the ping. I am reviewing the PR right now and trying to merge it. Does the bug you mentioned apply to user messages? The langchain4j code you pointed to adds the cache marker only to the last of the system messages. Thanks!
@Claudio-code Are you planning to address this fix in this PR? If so, feel free to take over this PR and rebase against the latest main.
That is because the people there did not want to add support for user messages, due to the limitation of at most 4 cached items. The bug affects all messages and tools that can carry a cache marker.
I made fix #4100. That implementation does not support cached system messages, though; adding that support requires a structural change, because currently the system message is just a String and would need to be a ContentBlock instance.
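The structural change described above can be illustrated in isolation: a plain system String has nowhere to attach `cache_control`, so it must become a list of content blocks, each of which can carry the marker. This is a minimal sketch using plain maps; the method name and representation are hypothetical, not the actual Spring AI API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SystemAsBlocks {

    // Convert a plain system String into a one-element list of content
    // blocks, so that cache_control can be attached to the block.
    static List<Map<String, Object>> toCachedSystemBlocks(String system) {
        Map<String, Object> block = new LinkedHashMap<>();
        block.put("type", "text");
        block.put("text", system);
        block.put("cache_control", Map.of("type", "ephemeral"));
        return List.of(block);
    }

    public static void main(String[] args) {
        System.out.println(toCachedSystemBlocks("You are a terse assistant."));
    }
}
```

With the block representation in place, the last-block rule from the earlier comment applies to system content just as it does to user messages and tools.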
NOTE: This is a rebased version of the original #1413 PR by @Claudio-code.
@Claudio-code is the original author of this PR. Original description:
Implements Anthropic's prompt caching feature to improve token efficiency.
This implementation follows Anthropic's prompt caching API (beta-2024-07-31), which allows more efficient token usage by caching frequently used prompts.