So we can definitely fix this one case. In general, keeping the server in a valid state when it receives invalid requests is a good idea, but it will require some vigilance and handling issues on a case-by-case basis. I'll send a fix for this case; if you notice other request sequences that put the server in an irrecoverable state, let us know and we can fix those as well.
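One way to sketch the kind of fix described above: roll cached state back to a clean "no model loaded" condition whenever a load fails, and have each request fail cleanly instead of letting a stale `None` tokenizer crash the next handler. The names here (`Server`, `ensure_model`, `load_tokenizer`, `"good-model"`) are illustrative stand-ins, not mlx-lm's actual API.

```python
# Sketch: keep the server recoverable after a failed model load.
# All classes and names here are hypothetical, for illustration only.

class StubTokenizer:
    def encode(self, text):
        return list(text.encode("utf-8"))

def load_tokenizer(name):
    # Stand-in for a real model/tokenizer loader that may fail.
    if name != "good-model":
        raise LookupError(f"model {name!r} not found")
    return StubTokenizer()

class Server:
    def __init__(self):
        self.tokenizer = None
        self.model_name = None

    def ensure_model(self, name):
        if self.tokenizer is not None and name == self.model_name:
            return
        try:
            self.tokenizer = load_tokenizer(name)
            self.model_name = name
        except Exception:
            # Roll back to a clean "no model" state so a later
            # valid request can still load successfully.
            self.tokenizer = None
            self.model_name = None
            raise

    def handle_chat(self, name, prompt):
        try:
            self.ensure_model(name)
        except LookupError as e:
            # Report the error without corrupting server state.
            return {"status": 404, "error": str(e)}
        return {"status": 200, "tokens": self.tokenizer.encode(prompt)}
```

With this pattern, an invalid request gets a 404-style error response, and a subsequent valid request loads the model and succeeds rather than hitting an `AttributeError`.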
It seems that currently, any error that occurs while the server is running forces a restart. For example, if I send an invalid request that omits the model name, I get a connection error or a NotFoundError; if I then send a request with the correct model name, I encounter the following error:
mlx_lm/server.py", line 557, in handle_chat_completions
prompt = self.tokenizer.encode(prompt)
AttributeError: 'NoneType' object has no attribute 'encode'
Currently, the only way to recover is to restart the mlx-lm server, which is rather inconvenient. It would be great if the server could continue serving valid requests even after an invalid one has been sent. I am not sure whether this behavior is intentional, but I would like to bring it up for open discussion.
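The symptom described above can be reproduced with a small sketch of the failure pattern: a lazy loader that records "load attempted" before the load actually succeeds leaves its tokenizer as `None` after one bad request, so every later request, valid or not, hits the same `AttributeError`. This is a hypothetical minimal reproduction, not mlx-lm's actual code.

```python
# Hypothetical sketch of the failure mode: a failed load is cached as if
# it had succeeded, so the server can never recover without a restart.

class StubTokenizer:
    def encode(self, text):
        return list(text.encode("utf-8"))

class BrokenServer:
    def __init__(self):
        self.tokenizer = None
        self._load_attempted = False

    def ensure_model(self, name):
        if self._load_attempted:
            return                      # bug: a past failure is treated as success
        self._load_attempted = True
        if name != "good-model":
            raise LookupError(f"model {name!r} not found")
        self.tokenizer = StubTokenizer()

    def handle_chat(self, name, prompt):
        try:
            self.ensure_model(name)
        except LookupError:
            pass                        # error reported; state is now stale
        # mirrors the failing line in handle_chat_completions:
        return self.tokenizer.encode(prompt)   # AttributeError when tokenizer is None
```

After one request with a bad model name, even a request with the correct model name raises `AttributeError: 'NoneType' object has no attribute 'encode'`, matching the traceback above.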