server : various fixes #10704
ggml-ci
// Some idiosyncrasy in task processing logic makes several trailing calls
// with empty content; we ignore these at the callee site.
if (content.empty()) {
    return std::vector<json>({json::object()});
}
This fixes #10694
add_executable(${TARGET} ${TARGET_SRCS})
install(TARGETS ${TARGET} RUNTIME)

# clean up generated files in pre-build step
Just a note here: we should add a check in /scripts/xxd.cmake to see whether the file needs to be regenerated. I will do that in another PR.
Ok. You mentioned that the /slots endpoint is also broken. I haven't looked at it yet. Maybe we can apply any additional fixes in this PR before merging? Feel free to push directly.
Yup, I fixed it in 01da1ed.
I also fixed a problem with the cpp wrapper llama_get_chat_template: it returned a null terminator in the final JSON.
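For readers unfamiliar with this class of bug, here is a minimal sketch of how a trailing null terminator can leak into a std::string. It is not the code from this PR: fake_meta_val_str is a hypothetical stand-in for a C-style getter that follows the snprintf convention, and the template text is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a C-style getter (not a llama.cpp function).
// Like snprintf, it writes a NUL-terminated value into buf and returns the
// length of the full value, excluding the terminator.
static int fake_meta_val_str(char * buf, size_t buf_size) {
    return std::snprintf(buf, buf_size, "%s", "{{ messages }}");
}

// Buggy wrapper: builds the string from the whole buffer, so the trailing
// '\0' becomes part of the std::string (and of anything serialized from it).
static std::string get_template_buggy() {
    const int len = fake_meta_val_str(nullptr, 0); // query required length
    std::vector<char> buf(static_cast<size_t>(len) + 1, '\0');
    fake_meta_val_str(buf.data(), buf.size());
    return std::string(buf.data(), buf.size());
}

// Fixed wrapper: copies only the len payload bytes, dropping the terminator.
static std::string get_template_fixed() {
    const int len = fake_meta_val_str(nullptr, 0);
    std::vector<char> buf(static_cast<size_t>(len) + 1, '\0');
    fake_meta_val_str(buf.data(), buf.size());
    return std::string(buf.data(), static_cast<size_t>(len));
}
```

In this sketch the buggy variant returns a string one byte longer than the fixed one, and that extra byte is the '\0' that would show up in a JSON response.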

Co-authored-by: Georgi Gerganov <[email protected]>
* server : various fixes ggml-ci
* server : show current seed in slot_params ggml-ci
* fix /slots endpoint
* Update examples/server/server.cpp Co-authored-by: Georgi Gerganov <[email protected]>
* server : reflect endpoint response changes in the readme ggml-ci
---------
Co-authored-by: Xuan Son Nguyen <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>
Important
The /slots and /props responses have changed. See the updated README.
- llama-server on each make
- n_ctx from slot_params to server_slot
- server_slot.to_json()