Added --chat-template-file to llama-run #11922
Conversation
@ericcurtin PTAL
force-pushed from 36a0c4f to 2ec2107
force-pushed from 2ec2107 to 6e50443
This code LGTM, waiting on builds.
Should pass now. Didn't review the latest changes from master thoroughly. Not sure why I get that error.
examples/run/run.cpp (Outdated)

    return "";
}

FILE * file = ggml_fopen(chat_template_file.c_str(), "r");
This would be better if it used the File class from this code. It would automatically fclose the file when necessary. I would also change fopen -> ggml_fopen in that File class.
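A minimal sketch of what such a RAII wrapper could look like, assuming ggml_fopen from ggml.h; the actual File class in run.cpp may differ in detail:

```cpp
#include <cstdio>
#include <string>

#include "ggml.h"  // for ggml_fopen, the portable fopen wrapper

// Hypothetical RAII file wrapper: the destructor closes the handle, so every
// early return in the caller releases the FILE * automatically.
struct File {
    FILE * file = nullptr;

    FILE * open(const std::string & path, const char * mode) {
        file = ggml_fopen(path.c_str(), mode);
        return file;
    }

    ~File() {
        if (file) {
            fclose(file);
        }
    }
};
```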
The File class from this code is behind the LLAMA_USE_CURL macro, so I'd need to move it, since --chat-template-file should also work without that macro. What do you think?
Yeah I'd move it out of the LLAMA_USE_CURL macro. It doesn't actually do any curl stuff.
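Roughly the restructure being suggested, as a sketch only; the name of the other curl-guarded helper is illustrative:

```cpp
// Before: the generic file helper is only compiled when curl support is enabled.
#ifdef LLAMA_USE_CURL
struct File { /* ... */ };
struct HttpClient { /* ... */ };  // illustrative name for the curl-specific code
#endif

// After: File is always available; only the curl-dependent parts stay guarded.
struct File { /* ... */ };

#ifdef LLAMA_USE_CURL
struct HttpClient { /* ... */ };
#endif
```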
Done. PTAL @ericcurtin
force-pushed from 6e50443 to f01a139
examples/run/run.cpp (Outdated)

    printe("Error reading chat template file '%s': %s", chat_template_file.c_str(), strerror(errno));
    return "";
}
return std::string(data.begin(), data.end());
One could actually fread directly into a std::string and save a copy. It's an optimisation, a small one. A std::string is basically a vector of chars.
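A sketch of that version, reusing the File wrapper and the printe helper that appear in the snippets above; it assumes printe takes printf-style arguments and that the template file fits in memory, so the actual change in the PR may differ:

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: read the whole chat template file straight into a
// std::string, skipping the intermediate std::vector<char> and its copy.
static std::string read_chat_template_file(const std::string & chat_template_file) {
    File file;
    if (!file.open(chat_template_file, "r")) {
        printe("Error opening chat template file '%s': %s", chat_template_file.c_str(), strerror(errno));
        return "";
    }

    fseek(file.file, 0, SEEK_END);
    const long size = ftell(file.file);
    fseek(file.file, 0, SEEK_SET);
    if (size < 0) {
        printe("Error reading chat template file '%s': %s", chat_template_file.c_str(), strerror(errno));
        return "";
    }

    std::string out;
    out.resize(static_cast<size_t>(size));
    // std::string storage is contiguous since C++11, so &out[0] is a valid buffer.
    if (fread(&out[0], 1, out.size(), file.file) != out.size()) {
        printe("Error reading chat template file '%s': %s", chat_template_file.c_str(), strerror(errno));
        return "";
    }
    return out;
}
```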
Changed.
Never seen that before 🤷
Relates to: ggml-org#11178

Added --chat-template-file CLI option to llama-run. If specified, the file is read and its content passed to common_chat_templates_from_model to override the model's chat template.

Signed-off-by: Michael Engel <[email protected]>
force-pushed from f01a139 to 86f68a8
Hmm... closing and reopening it. Hopefully this fixes that error.
Relates to: #11178

Added --chat-template-file CLI option to llama-run. If specified, the file is read and its content passed to common_chat_templates_from_model to override the model's chat template.

This also enables running the granite-code model from Ollama, for example:
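For illustration, an invocation might look like the following; the template path here is hypothetical, and the ollama:// prefix assumes llama-run's support for pulling models from the Ollama registry:

```sh
llama-run --chat-template-file ./chat-template.jinja ollama://granite-code
```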