
Conversation

@prwhelan (Member)

Backport #127658

Also backported DequeUtils, since this backport depends on it.
Some of the code had to be modified because the branches have diverged since 8.19/9.1.

Moved the Chat Completion buffer into the
StreamingUnifiedChatCompletionResults so that all Chat Completion
responses can benefit from it. Chat Completions is meant to adhere to
OpenAI as much as possible, and OpenAI only sends one response chunk at
a time. All implementations of Chat Completions will now buffer.

This fixes a bug where more than two chunks in a single item would be
dropped; all chunks are now added to the buffer.

This fixes a bug where onComplete would omit trailing items in the
buffer.
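The buffering behavior described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual StreamingUnifiedChatCompletionResults implementation: the class and method names (ChunkBuffer, onNext, request, onComplete) are assumptions chosen to mirror the two bugs the PR fixes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of the described buffering: every chunk of an incoming item is
// enqueued, the downstream pulls one chunk per request (matching OpenAI's
// one-chunk-at-a-time responses), and completion drains the buffer.
class ChunkBuffer {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final List<String> downstream = new ArrayList<>();

    // onNext: enqueue every chunk in the item, not just the first ones
    // (the "more than two chunks dropped" bug).
    void onNext(List<String> chunks) {
        buffer.addAll(chunks);
    }

    // Downstream demand: emit exactly one buffered chunk per request.
    void request() {
        if (!buffer.isEmpty()) {
            downstream.add(buffer.poll());
        }
    }

    // onComplete: flush trailing items instead of discarding them
    // (the "onComplete omits trailing items" bug).
    void onComplete() {
        while (!buffer.isEmpty()) {
            downstream.add(buffer.poll());
        }
    }

    List<String> received() {
        return downstream;
    }
}
```

With this shape, a three-chunk item followed by a single downstream request and then completion still delivers all three chunks, in order.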
@prwhelan prwhelan added the >bug, :ml Machine learning, backport, Team:ML (Meta label for the ML team), auto-backport (Automatically create backport pull requests when merged), v8.18.2, and v9.0.2 labels on May 19, 2025
@prwhelan prwhelan marked this pull request as ready for review May 19, 2025 15:14
@prwhelan prwhelan merged commit 0c57671 into elastic:9.0 May 19, 2025
16 checks passed
@elasticsearchmachine (Collaborator)

💔 Backport failed

Branch: 8.18
Result: Commit could not be cherry-picked due to conflicts

You can use sqren/backport to backport manually by running: backport --upstream elastic/elasticsearch --pr 128134

prwhelan added a commit to prwhelan/elasticsearch that referenced this pull request May 19, 2025
…tic#128134)

elasticsearchmachine pushed a commit that referenced this pull request May 20, 2025
…128164)
