Replies: 2 comments 2 replies
callback is your best bet, but:
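For concreteness, a minimal sketch of the callback approach, following LiteLLM's documented custom-callback signature (the model name and prompt here are placeholders): LiteLLM attaches the cost it computes to the callback kwargs as `response_cost`, and for streamed calls the success callback fires once after the final chunk.

```python
import litellm

# Success callback: LiteLLM invokes this after a call completes and
# passes its own computed cost in kwargs["response_cost"].
def track_cost_callback(kwargs, completion_response, start_time, end_time):
    response_cost = kwargs.get("response_cost", 0)
    print(f"response_cost: {response_cost}")

litellm.success_callback = [track_cost_callback]

# Works for streaming too: the callback runs once the stream finishes.
response = litellm.completion(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in response:
    pass  # consume the stream; the callback fires at the end
```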
Well OpenRouter has two ways to get accurate billing:
Happy to collaborate on this. We at Kodu have migrated our entire agent stack to be built on top of litellm, but now we are pretty much stuck: we can't accurately display token usage in our VS Code extension, nor can we accurately bill the user, because of a mismatch between the cost and token estimates (on OpenRouter, and on Gemini 2.5 when prompt caching is enabled).
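For context on pulling the actual billed amount from OpenRouter, here is a hedged sketch of one mechanism: OpenRouter's generation endpoint, which reports what a finished request actually cost. The endpoint path and the `total_cost` field follow my reading of OpenRouter's public API docs; verify the field names against the current reference.

```python
import os
import requests

def openrouter_actual_cost(generation_id: str) -> float:
    """Fetch the actual charged cost for a finished OpenRouter generation."""
    resp = requests.get(
        "https://openrouter.ai/api/v1/generation",
        params={"id": generation_id},
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()["data"]
    # "total_cost" is, per the docs, the credits actually charged.
    return data["total_cost"]
```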
I searched for information in issues, in discussions, and in the documentation. It seems that streaming requests get no cost from completion_cost at all, yet if you fetch the cost through a callback, a cost does come back. I haven't figured out how callbacks work yet. Can you tell me where the feature that makes the callback work is implemented? I want to understand how response_cost is calculated there.
According to the code documentation, completion_cost only works with regular (non-streaming) requests; I haven't found what to do for streaming (even passing each chunk in separately doesn't work).
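On the streaming point specifically: completion_cost needs a complete response object with usage information, which is why pricing individual chunks fails. A minimal sketch, assuming litellm.stream_chunk_builder behaves as documented (it reassembles collected chunks into a full ModelResponse that completion_cost can then price); as I understand it, the cost itself is computed by LiteLLM's cost-calculator module and attached to callback kwargs as response_cost by the logging layer, but treat that location as my reading of the codebase, not a confirmed pointer.

```python
import litellm

messages = [{"role": "user", "content": "Hello"}]

# Collect every chunk of the stream.
chunks = []
for chunk in litellm.completion(
    model="gpt-4o-mini",  # placeholder model
    messages=messages,
    stream=True,
):
    chunks.append(chunk)

# Rebuild a complete, non-streaming-style response from the chunks,
# then price it; lone chunks lack the full usage data this needs.
full_response = litellm.stream_chunk_builder(chunks, messages=messages)
print(litellm.completion_cost(completion_response=full_response))
```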