No text generation scenario not properly handled for completions streaming

An edge case was discovered with the completions endpoint where the chunker returns a 500 error due to an internal "StopIteration" error. Initial investigation showed that this happens in an unexpected scenario: the model generates _no_ text for the prompt, so the first completion chunk is the stop message, i.e. it includes `finish_reason` and no choice text.

Because no text is sent to the chunker in this scenario, the bidirectional chunker stream is closed without ever being used, which presumably triggers the chunker “StopIteration” error (there were no messages to iterate over and process). The orchestrator treats this as a “standard” 500 error and passes it through to the client.
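The suspected failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual chunker code: `chunk_stream` and `call_chunker` are hypothetical names, and the assumption is that the chunker reads its first message with `next()` on the request stream, which raises `StopIteration` when the stream is closed without any messages.

```python
from typing import Iterator, List


def chunk_stream(requests: Iterator[str]) -> List[str]:
    """Hypothetical chunker handler (assumption, not the real implementation)."""
    # Reading the first message eagerly raises StopIteration
    # if the stream was closed before any message was sent.
    first = next(requests)
    chunks = [first]
    chunks.extend(requests)
    return chunks


def call_chunker(messages: List[str]) -> str:
    """Report whether the hypothetical chunker survives the given stream."""
    try:
        chunk_stream(iter(messages))
        return "ok"
    except StopIteration:
        return "StopIteration"
```

Calling `call_chunker([])` models the stream that is opened and closed without use, while `call_chunker(["some text"])` models the normal case.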

To resolve this, the orchestrator needs to handle this scenario explicitly. If the model generates no text, i.e. the first completion chunk is the stop message, the detection pipeline tasks should be shut down and the stop message and subsequent usage message should be passed through directly to the client. The same handling is needed for chat completions.
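The proposed handling could look roughly like the sketch below. It is illustrative only and does not reflect the orchestrator's actual code: `is_stop_without_text`, `stream_with_detection`, and the `run_detection` callback are hypothetical names, and the chunk shapes assume OpenAI-style streaming payloads (`text` for completions, `delta.content` for chat completions).

```python
from itertools import chain
from typing import Any, Callable, Dict, Iterable, Iterator

Chunk = Dict[str, Any]


def is_stop_without_text(chunk: Chunk) -> bool:
    """True if every choice carries a finish_reason and no generated text."""
    choices = chunk.get("choices") or []
    if not choices:
        return False
    for choice in choices:
        # Completions use "text"; chat completions use "delta.content".
        text = choice.get("text") or (choice.get("delta") or {}).get("content")
        if choice.get("finish_reason") is None or text:
            return False
    return True


def stream_with_detection(
    model_chunks: Iterable[Chunk],
    run_detection: Callable[[Iterator[Chunk]], Iterator[Chunk]],
) -> Iterator[Chunk]:
    """Bypass the detection pipeline when the model generated no text."""
    it = iter(model_chunks)
    first = next(it, None)
    if first is None:
        return  # the model stream produced nothing at all
    if is_stop_without_text(first):
        # No text was generated: skip detection (and the chunker) and
        # pass the stop message and any usage message straight through.
        yield first
        yield from it
        return
    # Normal path: hand the full stream, including the first chunk,
    # to the detection pipeline.
    yield from run_detection(chain([first], it))
```

Peeking at the first chunk before starting the pipeline means the chunker stream is never opened in the no-text case, so it cannot be closed unused.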


No text generation scenario not properly handled for completions streaming #518
