(Work In Progress) Exploration of how to show token usage #609
Conversation
@wonderwhy-er Which LLM are you using to test it? As far as I know, your approach applies to OpenAI models only. Also, I think that the request must contain a
Nah, it's returning NaN.
It may not work with all providers, but Google and OpenAI definitely return it. I also just tested Anthropic, OpenRouter, Cohere, and Together, and they work. I tested Groq and HuggingFace, and for them it does not work. So, initially it will work only for some providers.
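For reference, here is a minimal sketch (not code from this PR) of where those usage figures come from when calling through the Vercel AI SDK. The model name is illustrative, and the `promptTokens`/`completionTokens` field names are an assumption based on AI SDK 3.x/4.x:

```ts
// Minimal sketch (assumes Vercel AI SDK 3.x/4.x with the OpenAI provider):
// the result of generateText carries a `usage` object when the provider
// reports one; providers that do not report usage yield NaN fields,
// matching the "returning NaN" observation above.
import { generateText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function main() {
  const { text, usage } = await generateText({
    model: openai('gpt-4o-mini'), // illustrative model name
    prompt: 'Say hello.',
  });

  // Guard against providers that do not report usage.
  if (Number.isFinite(usage.promptTokens) && Number.isFinite(usage.completionTokens)) {
    console.log(`${text} (in: ${usage.promptTokens}, out: ${usage.completionTokens})`);
  }
}

main();
```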
Not on the client yet. On the server it does return it, depending on the provider.
If we move the LLM call to the client, can we still use Vercel's AI SDK? It's very easy but also restricted.
The Vercel SDK works on the client too. I tested it against LM Studio and it worked. I will start by moving the LM Studio call to the client, then Ollama.
Trying to find which line it's coming from, but it's not in the terminal or the console. Edit:
I suggest using this approach; the usage will automatically get appended to the assistant message and can be used without any regex parsing.
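The approach referred to here is presumably the AI SDK's message annotations. Below is a hedged server-route sketch, assuming the `StreamData` API from AI SDK 3.x/4.x; the route path and the annotation shape are made up for illustration:

```ts
// Server-side sketch (assumes AI SDK 3.x/4.x StreamData API): attach the
// usage report to the assistant message as an annotation, so the client
// can read it from `message.annotations` without regex-parsing the text.
import { streamText, StreamData } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { messages } = await req.json();
  const data = new StreamData();

  const result = await streamText({
    model: openai('gpt-4o-mini'), // illustrative model name
    messages,
    onFinish({ usage }) {
      // Appended as an annotation on the assistant message.
      data.appendMessageAnnotation({ type: 'usage', value: usage });
      data.close();
    },
  });

  return result.toDataStreamResponse({ data });
}
```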
Same error as above on Windows 11, except I am able to see the token usage in the terminal. (attachment: 1.mp4)




This PR adds the ability to show token usage in messages for enhanced transparency and monitoring of API consumption.
The key features include:
Token Usage Display: shows the token count for input and output in each message.
Technical implementation and exploration:
It's a bit hackish, but it was hard to get to a better variant in a reasonable amount of time; maybe someone can come along later and improve it better than I could.
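To make the display side concrete, here is a hypothetical client sketch (not the PR's actual component) that reads per-message token counts from annotations like those attached in the server sketch above, using `useChat` from AI SDK 3.x/4.x:

```tsx
// Client-side sketch (hypothetical component): render per-message token
// counts from annotations of the assumed shape { type: 'usage', value: {...} }.
import { useChat } from 'ai/react';

export function Chat() {
  const { messages } = useChat({ api: '/api/chat' });

  return (
    <div>
      {messages.map((m) => {
        // Annotations are untyped JSON values, so narrow them by hand.
        const usage = m.annotations?.find((a: any) => a?.type === 'usage') as
          | { value?: { promptTokens: number; completionTokens: number } }
          | undefined;

        return (
          <div key={m.id}>
            <p>{m.content}</p>
            {usage?.value && (
              <small>
                in: {usage.value.promptTokens} / out: {usage.value.completionTokens}
              </small>
            )}
          </div>
        );
      })}
    </div>
  );
}
```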
