
Conversation

mglambda (Contributor) commented Mar 7, 2025

I noticed that the /chat/completions and /v1/completions endpoints do not return the "__verbose" field in the final server response when running llama-server with -lv 10 and streaming enabled. This is inconsistent with the non-streaming behaviour of those endpoints.

This PR adds verbose output to server_task_result_cmpl_final::to_json_oaicompat_chat_stream, making it consistent with server_task_result_cmpl_final::to_json_oaicompat_chat as well as the other to_json methods.

This was motivated by my wanting to read tokens_cached in streaming mode on those endpoints. If anyone knows another way to get that, preferably without verbose mode, I'd be very interested.
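
To illustrate the intent, here's a minimal, self-contained sketch of the pattern (using nlohmann::json like the server does). The struct and field names below are hypothetical stand-ins, not the actual server types; the real change mirrors what the non-streaming to_json_oaicompat_chat already does:

```cpp
#include <iostream>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Hypothetical stand-in for server_task_result_cmpl_final; only the
// verbose-attachment pattern is meant to match the real code.
struct completion_result {
    bool verbose       = true; // set when llama-server runs with -lv 10
    int  tokens_cached = 42;   // example of a field only exposed verbosely

    // raw, non-OAI-compatible dump of the result
    json to_json_non_oaicompat() const {
        return json{{"tokens_cached", tokens_cached}};
    }

    // final chunk of a streamed OAI-compatible response: with this PR it
    // attaches "__verbose" the same way the non-streaming variant does
    json to_json_oaicompat_chat_stream() const {
        json ret = {{"object", "chat.completion.chunk"}};
        if (verbose) {
            ret["__verbose"] = to_json_non_oaicompat();
        }
        return ret;
    }
};

int main() {
    completion_result res;
    // prints a chunk containing "__verbose": {"tokens_cached": 42}
    std::cout << res.to_json_oaicompat_chat_stream().dump(2) << "\n";
}
```

With this in place, a streaming client can read tokens_cached from the "__verbose" object on the last streamed event instead of having to issue a separate non-streaming request.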

ngxson (Collaborator) commented Mar 7, 2025

I can merge only if you can fix the CI

mglambda requested a review from ggerganov as a code owner, March 8, 2025 06:30
github-actions bot added the script, devops, ggml, and Apple Metal labels, Mar 8, 2025
mglambda (Contributor, Author) commented Mar 8, 2025

Ok, let me try to figure out how to run the CI locally.

mglambda force-pushed the server-verbose-result-oaicompat-stream branch from 7a54db2 to 46cca0d, March 8, 2025 09:21
github-actions bot added the documentation, build, testing, android, Nvidia GPU, nix, Vulkan, python, SYCL, and Kompute labels, Mar 8, 2025
mglambda force-pushed the server-verbose-result-oaicompat-stream branch from 46cca0d to 35d63f1, March 8, 2025 09:41
mglambda (Contributor, Author) commented Mar 8, 2025

Ok, I saw the whitespace problem. Hope things are fine now. Baby's first PR.

Add verbose output to server_task_result_cmpl_final::to_json_oaicompat_chat_stream, making it conform with server_task_result_cmpl_final::to_json_oaicompat_chat, as well as the other to_json methods.
mglambda force-pushed the server-verbose-result-oaicompat-stream branch from 35d63f1 to d261b13, March 23, 2025 15:55
ngxson merged commit 77f9c6b into ggml-org:master on Mar 23, 2025. 47 checks passed.