How could I set the stop sequence for inference like in CodeLlama 70B? #666
davideuler started this conversation in General
Replies: 1 comment 6 replies
-
This is more of an application-level implementation issue than an mlx issue. I have implemented this in one of my projects. You can take a look here.
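A minimal sketch of what such application-level stop-sequence handling can look like. This is illustrative only and not mlx-lm's actual API: `generate_with_stop` and the incoming token stream are hypothetical names standing in for whatever per-token stream your inference loop yields. The buffering handles the case where a stop sequence (e.g. `<EOT>`) arrives split across token boundaries.

```python
# Hypothetical sketch: detect stop sequences in a streaming generator.
# The token stream stands in for your inference library's per-token
# output; names here are illustrative, not a real mlx-lm API.
from typing import Iterable, Iterator, List, Optional


def generate_with_stop(
    tokens: Iterable[str], stop_sequences: List[str]
) -> Iterator[str]:
    """Yield text chunks, truncating at the first stop sequence.

    A small buffer is kept so a stop sequence split across token
    boundaries (e.g. "<EO" + "T>") is still detected.
    """
    max_len = max(len(s) for s in stop_sequences) if stop_sequences else 0
    buffer = ""
    for tok in tokens:
        buffer += tok
        # Find the earliest fully-matched stop sequence, if any.
        cut: Optional[int] = None
        for stop in stop_sequences:
            idx = buffer.find(stop)
            if idx != -1 and (cut is None or idx < cut):
                cut = idx
        if cut is not None:
            yield buffer[:cut]
            return
        # Flush text that can no longer be the start of a stop sequence.
        safe = len(buffer) - (max_len - 1)
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    if buffer:
        yield buffer
```

For example, streaming `["Hello", " wor", "ld<", "EOT", ">junk"]` with stop sequence `"<EOT>"` yields text that joins to `"Hello world"`, with everything after the stop marker dropped.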
-
When running inference with CodeLlama 70B, I need to specify the stop sequence in llama.cpp or in ollama.
When I run CodeLlama 70B 4-bit with MLX, it outputs lots of EOT tokens and does not stop. I am not sure whether this is caused by the stop-sequence settings. How can I set the stop sequence in MLX?