Description
An edge case was discovered with the completions endpoint where the chunker returns a 500 error due to an internal "StopIteration" error. Upon initial investigation, we determined that this happens in an unexpected scenario: the model generates no text for the prompt, so the first completion chunk is the stop message, i.e. it includes finish_reason and no choice text.
Because no text is sent to the chunker in this scenario, the bidirectional chunker stream is closed without ever being used. This presumably triggers the chunker's "StopIteration" error, since there are no messages to iterate over and process. The orchestrator treats this as a "standard" 500 error and passes it through to the client.
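The chunker's internals are not shown in this issue, so the following is only an illustration of how an empty input stream could surface as "StopIteration" in Python. It assumes a consumer that reads its first message with `next()` and no default; `first_message` and `first_message_safe` are hypothetical names, not chunker functions.

```python
def first_message(stream):
    """Hypothetical consumer that assumes at least one message arrives."""
    return next(stream)  # raises StopIteration if the stream is empty


def first_message_safe(stream, default=None):
    """Same read, but an empty stream yields a default instead of raising."""
    return next(stream, default)


# A stream that was closed without any message ever being sent.
try:
    first_message(iter([]))
    outcome = "ok"
except StopIteration:
    outcome = "StopIteration"
```

If the failure does come from a read like this, guarding the first read would avoid the 500 on the chunker side, but the orchestrator-side handling is still needed so that an unused stream is never opened in the first place.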
To resolve this, proper handling needs to be implemented for this scenario. If the model generates no text, i.e. the first completion chunk is the stop message, the detection pipeline tasks should be shut down and the stop message and subsequent usage message should be passed through directly to the client. This scenario also needs to be handled for chat completions.
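A minimal sketch of the proposed handling, written in Python for illustration only and not taken from the orchestrator's code. It assumes OpenAI-style chunk dictionaries (`choices[].text` for completions, `choices[].delta.content` for chat completions); `is_stop_only`, `handle_stream`, and `run_detection_pipeline` are hypothetical names. The sketch peeks at the first chunk and, if it is the stop message, never starts the detection pipeline. An implementation that has already spawned the pipeline tasks would need to shut them down at that point instead.

```python
import asyncio


def is_stop_only(chunk: dict) -> bool:
    """True when every choice has a finish_reason and no generated text."""
    choices = chunk.get("choices") or []
    if not choices:
        return False
    for choice in choices:
        # Assumed shapes: completions use "text", chat uses "delta.content".
        text = choice.get("text") or (choice.get("delta") or {}).get("content")
        if choice.get("finish_reason") is None or text:
            return False
    return True


async def handle_stream(model_stream, run_detection_pipeline):
    """Yield client-facing chunks, bypassing detection for empty generations."""
    iterator = model_stream.__aiter__()
    try:
        first = await iterator.__anext__()
    except StopAsyncIteration:
        return  # the model sent nothing at all

    if is_stop_only(first):
        # No text was generated: pass the stop message and any following
        # usage message straight through, without touching the chunker.
        yield first
        async for chunk in iterator:
            yield chunk
        return

    async def replay():
        # Put the peeked chunk back in front of the remaining stream.
        yield first
        async for chunk in iterator:
            yield chunk

    async for chunk in run_detection_pipeline(replay()):
        yield chunk
```

Because the check accepts both `text` and `delta.content`, the same wrapper could cover completions and chat completions, provided the real chunk types expose those fields.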