@matteoserva matteoserva commented Aug 1, 2025

In the /completion endpoint, the server correctly holds back a partial stop string until it can decide whether the full stop string follows.

If EOG is reached and the full stop string was not found, the held-back text should be flushed to the user. Currently it is dropped in streaming mode.

To reproduce:
query: Repeat exactly the following text: UNO DUE TRE
stop: ["TRE\nAAAAAAAA"]
stream: True

Previous result:
UNO DUE

After this PR:
UNO DUE TRE
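
For illustration, the hold-and-flush behavior can be sketched as below. This is a minimal Python sketch and not the server's actual code: the names `partial_stop_len` and `stream_with_stops` are hypothetical, and generation is modeled as a plain sequence of text pieces rather than tokens.

```python
from typing import Iterable, Iterator, List


def partial_stop_len(text: str, stops: List[str]) -> int:
    """Length of the longest suffix of `text` that is a proper prefix of a stop string."""
    best = 0
    for stop in stops:
        # Only proper prefixes are checked here; a full match is handled by the caller.
        for n in range(min(len(stop) - 1, len(text)), 0, -1):
            if text.endswith(stop[:n]):
                best = max(best, n)
                break
    return best


def stream_with_stops(pieces: Iterable[str], stops: List[str]) -> Iterator[str]:
    """Yield text chunks, holding back any tail that might be the start of a stop string."""
    pending = ""  # text generated but not yet sent to the client
    for piece in pieces:
        pending += piece
        # Full stop string found: send what precedes it and end the stream.
        hits = [i for i in (pending.find(s) for s in stops) if i != -1]
        if hits:
            if min(hits) > 0:
                yield pending[:min(hits)]
            return
        # Otherwise send everything except a possible partial match.
        hold = partial_stop_len(pending, stops)
        if len(pending) > hold:
            cut = len(pending) - hold
            yield pending[:cut]
            pending = pending[cut:]
    # End of generation with no full match: flush the held-back tail.
    if pending:
        yield pending


if __name__ == "__main__":
    chunks = stream_with_stops(["UNO", " DUE", " TRE"], ["TRE\nAAAAAAAA"])
    print("".join(chunks))
```

In this sketch, the final `if pending: yield pending` is the step that corresponds to the fix: without it, the held-back `TRE` would never reach the client when generation ends before the stop string completes.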

Note:
The OpenAI-compatible API has the opposite problem: the partial stop string is always flushed, even if the full stop string is found.
This is not addressed in this PR.

@matteoserva matteoserva requested a review from ngxson as a code owner August 1, 2025 08:29
@matteoserva matteoserva changed the title from "flush partial stop string when <EOG> is reached in /completion endpoint in streaming mode" to "Fix: flush partial stop string when <EOG> is reached in /completion endpoint in streaming mode" Aug 2, 2025