Replies: 3 comments
-
@fblissjr Perhaps you can try it. https://github.com/madroidmaq/mlx-omni-server
-
I would appreciate such a development!
-
Support added in #1299.
-
Been thinking a bit about an elegant way to allow modifying system messages / system instructions in the generate functions for instruct/chat models, as well as pre-fill, essentially creating a message chain of `System Role -> User Role -> Assistant Role -> n`. This is an often underused but powerful way to work with single-step instruct models (we do this often with Anthropic's pre-fill feature in the API). Here's an example of both: adding only a `--system-message` arg, and adding a messages file for pre-fill (sketches of each below).
For the system role, I've been doing this for quite some time in my local repo for mlx-lm, and I think this is probably the simplest approach: in `generate.py`, extend the `messages` building to support it.
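Roughly like this (a sketch rather than the actual diff; the model path is a placeholder and the flag wiring will differ in the real `generate.py`, but it follows the same `apply_chat_template` flow the CLI already uses):

```python
import argparse

from mlx_lm import load, generate

parser = argparse.ArgumentParser()
parser.add_argument("--prompt", required=True)
# Proposed flag: optional system instructions prepended to the message chain.
parser.add_argument("--system-message", default=None)
args = parser.parse_args()

# Placeholder model; any chat-tuned model with a template behaves the same.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

messages = []
if args.system_message:
    messages.append({"role": "system", "content": args.system_message})
messages.append({"role": "user", "content": args.prompt})

# The chat template renders the system turn whenever one is present.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt))
```

Everything stays untouched for users who don't pass the flag; `messages` just gains one optional leading entry.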
However, to support the multi-turn pre-fill, the best I can come up with is a messages JSON file, which could also work fantastically with the cache capabilities. Here's my initial proposed approach, but I'm open to simpler ones if they exist.
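Something like the following (again a sketch: the schema is a strawman that just mirrors the role/content dicts the chat templates already consume, and the JSON is inlined here where the real thing would read from a file passed on the command line). The trailing assistant turn is the pre-fill:

```python
import json

from mlx_lm import load, generate

# Example contents of a messages.json: prior turns plus a trailing assistant
# stub whose content pre-fills the next completion. Inlined for the sketch.
MESSAGES_JSON = """
[
  {"role": "system", "content": "You are a terse assistant."},
  {"role": "user", "content": "Name three prime numbers."},
  {"role": "assistant", "content": "2, 3, 5."},
  {"role": "user", "content": "Continue the sequence."},
  {"role": "assistant", "content": "The next primes are"}
]
"""
messages = json.loads(MESSAGES_JSON)

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

# continue_final_message (available in recent Hugging Face tokenizers) leaves
# the last assistant turn open, so generation continues from the pre-filled
# text instead of starting a fresh turn.
prompt = tokenizer.apply_chat_template(messages, continue_final_message=True)
print(generate(model, tokenizer, prompt=prompt))
```

Since everything up to the pre-fill is a stable prefix, this is exactly the shape the prompt cache likes.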
Welcome any thoughts here, or should I just open a PR to move the discussion there?