@prwhelan
Member

`ServerSentEvent` is now a record with `event` and `data`, rather than a record for `value` with a separate `ServerSentEventField`.

  • `value` was renamed to `data`
  • `hasValue` was renamed to `hasData`
  • Parsing was refactored to read more like the spec: lines are written to a buffer, and the buffer is flushed when we reach a blank line
  • We now support multiline data payloads
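
A minimal sketch of the buffer-and-flush approach listed above, assuming the dispatch rules in the WHATWG server-sent events spec. The names `SseSketch`, `Event`, and `parse` are hypothetical, the `"message"` default is borrowed from the spec rather than from this PR, and this is not the code that was merged. It only illustrates how consecutive `data` lines can be joined with a newline and emitted once a blank line arrives.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and details are assumptions, not the PR's code.
final class SseSketch {
    // Same shape as described above: an event name plus its data.
    record Event(String event, String data) {}

    static List<Event> parse(String payload) {
        List<Event> events = new ArrayList<>();
        String eventName = "message"; // assumed default, as in the spec
        StringBuilder data = new StringBuilder();
        boolean sawData = false;

        for (String line : payload.lines().toList()) {
            if (line.isEmpty()) {
                // Blank line: flush the buffer as one event, then reset.
                if (sawData) {
                    events.add(new Event(eventName, data.toString()));
                }
                eventName = "message";
                data.setLength(0);
                sawData = false;
                continue;
            }
            if (line.startsWith(":")) {
                continue; // comment line, ignored
            }
            int colon = line.indexOf(':');
            String field = colon < 0 ? line : line.substring(0, colon);
            String value = colon < 0 ? "" : line.substring(colon + 1);
            if (value.startsWith(" ")) {
                value = value.substring(1); // drop exactly one leading space
            }
            switch (field) {
                case "event" -> eventName = value;
                case "data" -> {
                    if (sawData) {
                        data.append('\n'); // join multiline data with LF
                    }
                    data.append(value);
                    sawData = true;
                }
                default -> { } // other fields (id, retry, unknown) are skipped here
            }
        }
        return events; // an event not terminated by a blank line is not flushed
    }
}
```

With this sketch, `SseSketch.parse("event: message\ndata: hello\ndata: there\n\n")` yields a single event whose data is `hello\nthere`.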

@prwhelan added the >refactoring, :ml, Team:ML, auto-backport, v8.19.0, and v9.1.0 labels on Mar 31, 2025

public void testInfer_StreamRequest() throws Exception {
String responseJson = """
event: message_start
Member Author

When I tested Anthropic, I noticed they had started sending `event` in the payload, so I added it to the test in case it breaks parsing (it doesn't).

@prwhelan marked this pull request as ready for review on April 1, 2025, 12:10
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Member

@davidkyle left a comment

LGTM

var fieldStr = lineWithColon.substring(0, firstColon).toLowerCase(Locale.ROOT);

var value = lineWithColon.substring(firstColon + 1);
var trimmedValue = value.length() > 0 && value.charAt(0) == ' ' ? value.substring(1) : value;
Member

Why not `String::trim()` or `String::stripLeading()`?
Is the idea to literally remove only the first space char?

Member Author

Yeah, the idea is literally to remove only the first space char. From the spec:

Collect the characters on the line after the first U+003A COLON character (:), and let value be that string. If value starts with a U+0020 SPACE character, remove it from value.

https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation

Or at least I'm interpreting that as "if there are two or more spaces, only remove one space"
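
To make the difference concrete, here is a small sketch under that interpretation. `LeadingSpaceSketch` and `stripOneLeadingSpace` are hypothetical names used only for illustration; `String::stripLeading` and `String::trim` are the standard JDK methods mentioned in the question.

```java
// Hypothetical helper illustrating the "remove at most one leading space" reading.
final class LeadingSpaceSketch {
    // Removes a single leading U+0020 if present; everything else is preserved.
    static String stripOneLeadingSpace(String value) {
        return value.startsWith(" ") ? value.substring(1) : value;
    }

    // The two alternatives from the review question, for comparison.
    static String withStripLeading(String value) {
        return value.stripLeading(); // removes all leading whitespace
    }

    static String withTrim(String value) {
        return value.trim(); // removes leading and trailing whitespace
    }
}
```

For the value `"  indented "` (two leading spaces, one trailing), `stripOneLeadingSpace` returns `" indented "`, `withStripLeading` returns `"indented "`, and `withTrim` returns `"indented"`. Only the first keeps whitespace that the sender may have intended as part of the data.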

var payload = """
event: message
data: hello
data: there
Member

👍

*/
public class ServerSentEventParser {
- private static final Pattern END_OF_LINE_REGEX = Pattern.compile("\\n|\\r|\\r\\n");
+ private static final Pattern END_OF_LINE_REGEX = Pattern.compile("\\r\\n|\\n|\\r");
Member

I haven't looked at the `String` class docs for a while and was pleased to find there is actually a method for splitting a string like this:

https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html#lines()

Member Author

Nice, I didn't know that existed. Let's use that, though it does change the logic a bit.
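
The thread does not say which logic change is meant. As an illustration only, the sketch below compares the two regex orderings from the diff with `String.lines()`. `LineSplitSketch` and its method names are hypothetical. Two behaviors of the standard JDK calls are worth noting: regex alternation is tried left to right, so a pattern that lists `\r` before `\r\n` splits a CRLF into two separators, and `Pattern.split` with the default limit drops trailing empty strings, whereas `String.lines()` keeps an empty line that is followed by a terminator.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical comparison harness; only Pattern.split and String.lines are real JDK APIs.
final class LineSplitSketch {
    // Single-character terminators listed first: "\r\n" is matched as "\r" then "\n".
    static final Pattern SINGLE_FIRST = Pattern.compile("\\n|\\r|\\r\\n");
    // CRLF listed first: "\r\n" is matched as one terminator.
    static final Pattern CRLF_FIRST = Pattern.compile("\\r\\n|\\n|\\r");

    static List<String> split(Pattern pattern, String input) {
        return List.of(pattern.split(input));
    }

    static List<String> lines(String input) {
        return input.lines().toList();
    }
}
```

For `"a\r\nb"`, `split(SINGLE_FIRST, ...)` yields `[a, , b]` (with a spurious empty element), while `split(CRLF_FIRST, ...)` and `lines(...)` both yield `[a, b]`. For `"data: x\n\n"`, `split(CRLF_FIRST, ...)` yields `[data: x]`, while `lines(...)` yields `[data: x, ]`. The blank line survives in the second case, which matters to a parser that flushes on blank lines.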

@prwhelan merged commit 69180ea into elastic:main on Apr 3, 2025
18 checks passed
prwhelan added a commit to prwhelan/elasticsearch that referenced this pull request Apr 3, 2025
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
8.x

elasticsearchmachine pushed a commit that referenced this pull request Apr 3, 2025
andreidan pushed a commit to andreidan/elasticsearch that referenced this pull request Apr 9, 2025