Assistant: Is Anthropic cache working correctly? #8616
Unanswered
andreatitolo asked this question in Q&A
Replies: 1 comment 1 reply
-
Most of those caching PRs went in after we set up the last regular monthly release. Would you be up for switching to a daily build to see if you still observe the same difference? If that's not a good fit for you, our next regular monthly release will go out around Aug 4.
-
Hi!
Really liking the assistant so far! One thing I noticed is that, although the Anthropic cache system is working (as per multiple merged PRs), I still see heavy use of input tokens instead of cache reads/writes in the Anthropic console when using Claude through the chat, mostly in long conversations.
You can see a comparison in these two pictures below.
This one is from the Continue extension, a long conversation of around 3,030,010 tokens:
This one is instead a conversation with Positron Assistant, with a little more than half the tokens of the Continue one (1,764,200), but the lower use of the cache results in almost the same cost as the previous one:
I know comparing tools is not exactly fair, as each has its own implementation, but that seems like quite a large difference.
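To make the cost reasoning concrete, here is a rough sketch of how the input-side bill depends on the cache split. It assumes Anthropic's documented prompt-caching multipliers for the default 5-minute cache (writes at 1.25x the base input price, reads at 0.1x). The base price of $3 per million tokens, the `input_cost_usd` helper, and both token splits are illustrative assumptions, not numbers taken from the console screenshots.

```python
# Multipliers relative to the base input price (assumption: default
# 5-minute cache lifetime, per Anthropic's prompt-caching pricing).
CACHE_WRITE_MULTIPLIER = 1.25
CACHE_READ_MULTIPLIER = 0.10


def input_cost_usd(uncached, cache_write, cache_read, base_price_per_mtok=3.0):
    """Input-side cost in USD for one token breakdown.

    base_price_per_mtok is a hypothetical base input price per million tokens.
    """
    weighted_tokens = (
        uncached
        + cache_write * CACHE_WRITE_MULTIPLIER
        + cache_read * CACHE_READ_MULTIPLIER
    )
    return weighted_tokens * base_price_per_mtok / 1_000_000


# Hypothetical 3.0M-token conversation where most input is read from cache.
well_cached = input_cost_usd(
    uncached=200_000, cache_write=300_000, cache_read=2_500_000
)

# Hypothetical 1.5M-token conversation where about half the input is uncached.
poorly_cached = input_cost_usd(
    uncached=700_000, cache_write=100_000, cache_read=700_000
)

print(f"well cached:   ${well_cached:.3f}")
print(f"poorly cached: ${poorly_cached:.3f}")
```

With these made-up splits, the conversation with half as many tokens ends up costing about the same on the input side (roughly $2.5 versus $2.7), which is the kind of pattern I am seeing.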
I can open an issue if appropriate.
Positron version and OS