@krystiancha
Contributor

These two commits add token usage info to the responses of the embedding and rerank endpoints.
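
For illustration, a client could read the new field roughly as sketched below. This assumes the response follows the OpenAI-style embeddings shape, with a `usage` object holding `prompt_tokens` and `total_tokens`. The sample payload, its values, and the helper name `read_usage` are hypothetical and are not taken from the PR.

```python
import json

# Hypothetical sample payload; the model name and all values are made up.
SAMPLE_RESPONSE = """
{
  "object": "list",
  "data": [{"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]}],
  "model": "example-model",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
"""


def read_usage(body: str) -> tuple[int, int]:
    """Return (prompt_tokens, total_tokens), or (0, 0) if usage is absent."""
    # Assumed field names: "usage", "prompt_tokens", "total_tokens".
    usage = json.loads(body).get("usage") or {}
    return int(usage.get("prompt_tokens", 0)), int(usage.get("total_tokens", 0))


prompt_tokens, total_tokens = read_usage(SAMPLE_RESPONSE)
print(f"prompt_tokens={prompt_tokens} total_tokens={total_tokens}")
```

Falling back to zeros keeps the helper usable against servers that predate this change and omit the `usage` object.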

Collaborator

@ngxson ngxson left a comment

I'd appreciate it if you could add a test case for it. See test_embedding.py.
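
A check of the kind requested might look like the sketch below. It is not the test that was added to `test_embedding.py`. The helper `check_usage` and the sample dictionary are hypothetical, and the `usage` field names are assumed to follow the OpenAI-style response shape.

```python
def check_usage(response: dict) -> None:
    """Assert that a parsed response carries plausible token usage info."""
    usage = response.get("usage")
    assert isinstance(usage, dict), "response has no usage object"
    # Assumed field names; both should be positive ints for non-empty input.
    for key in ("prompt_tokens", "total_tokens"):
        assert isinstance(usage.get(key), int), f"{key} missing or not an int"
        assert usage[key] > 0, f"{key} should be positive for non-empty input"
    assert usage["total_tokens"] >= usage["prompt_tokens"]


# Hypothetical parsed response body; passes silently when the checks hold.
check_usage({"usage": {"prompt_tokens": 5, "total_tokens": 5}})
```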

@github-actions github-actions bot added the python python script changes label Dec 17, 2024
@krystiancha
Contributor Author

Tests added

@ggerganov ggerganov merged commit 05c3a44 into ggml-org:master Dec 17, 2024
1 check passed
@krystiancha krystiancha deleted the fill-usage branch December 17, 2024 21:02
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024
…#10852)

* server : fill usage info in embeddings response

* server : fill usage info in reranking response
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
…#10852)

* server : fill usage info in embeddings response

* server : fill usage info in reranking response
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
…#10852)

* server : fill usage info in embeddings response

* server : fill usage info in reranking response

Labels

examples, python (python script changes), server

3 participants