
Conversation

@prwhelan (Member) commented May 6, 2025

SageMaker now supports Completion and Chat Completion using the OpenAI interfaces.

Additionally:

  • Fixed a bug where the timeout could be null; it now defaults to a 30s timeout (sketched below)
  • Exposed the existing OpenAi request/response parsing logic for reuse
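(For illustration, a minimal sketch of the timeout defaulting described above; the method and constant names are assumptions, not the PR's actual code.)

    // Hedged sketch of the timeout defaulting described above, assuming org.elasticsearch.core.TimeValue.
    // resolveTimeout and DEFAULT_TIMEOUT are illustrative names, not the PR's actual code.
    private static final TimeValue DEFAULT_TIMEOUT = TimeValue.timeValueSeconds(30);

    static TimeValue resolveTimeout(@Nullable TimeValue timeout) {
        return timeout != null ? timeout : DEFAULT_TIMEOUT;
    }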

@prwhelan added the >enhancement, :ml (Machine learning), Team:ML (Meta label for the ML team), auto-backport (Automatically create backport pull requests when merged), v8.19.0, and v9.1.0 labels May 6, 2025
@elasticsearchmachine (Collaborator)

Hi @prwhelan, I've created a changelog YAML for you.

@prwhelan marked this pull request as ready for review May 6, 2025 22:50
@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@jonathan-buttner (Contributor) left a comment


Left a couple questions

    } catch (IOException e) {
        throw new ElasticsearchStatusException(
            "Failed to parse event from inference provider: {}",
            RestStatus.INTERNAL_SERVER_ERROR,
Contributor

We've talked about switching to 502s; do you think that'd be appropriate here?

Member Author

I don't think so? Because this IOException is an error with our parsing logic, which may or may not mean there is something wrong with their response. It could be that we're out of date.
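(A hedged sketch of the distinction being discussed; isUpstreamError is a hypothetical flag used only for illustration.)

    // Hedged sketch of the distinction above: a 502 (BAD_GATEWAY) would signal a bad response
    // from the upstream provider, while a failure in our own parsing keeps the 500.
    // isUpstreamError is a hypothetical flag, not code from this PR.
    RestStatus status = isUpstreamError ? RestStatus.BAD_GATEWAY : RestStatus.INTERNAL_SERVER_ERROR;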

        Map<String, Object> taskSettings,
        InputType inputType,
    -   TimeValue timeout,
    +   @Nullable TimeValue timeout,
Contributor

Hmm, I thought the timeout was defaulted in the InferenceAction.

Can it be null here?

Member Author

Yeah, I believe I was hitting an issue when I was using curl; I think it can be null through this path:

Contributor

We should be defaulting it there too I think:

    static TimeValue parseTimeout(RestRequest restRequest) {
        return restRequest.paramAsTime(InferenceAction.Request.TIMEOUT.getPreferredName(), InferenceAction.Request.DEFAULT_TIMEOUT);
    }

I think we should consider it a bug if it's null once it gets to the infer() calls. We should make sure it's defaulted prior to those calls.
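(A minimal sketch of the defaulting being suggested here, reusing InferenceAction.Request.DEFAULT_TIMEOUT from the snippet above; the local variable name is illustrative.)

    // Hedged sketch of the suggestion above: make sure the timeout is defaulted before infer()
    // is reached, reusing InferenceAction.Request.DEFAULT_TIMEOUT from the snippet above.
    TimeValue effectiveTimeout = timeout != null ? timeout : InferenceAction.Request.DEFAULT_TIMEOUT;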

    @Override
    public TransportVersion getMinimalSupportedVersion() {
        return TransportVersions.ML_INFERENCE_SAGEMAKER;
Contributor

I think we need to create a new transport version, right?

Member Author

I don't think so, but I did anyway. In theory, since the name and parsing logic hadn't changed, both node versions should be able to parse the input/output. But in practice, I couldn't create a multi-node cluster with the same version (9.1.0) and different Docker hashes, so I have no way to verify this assumption.
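(For illustration, a hedged sketch of what returning a dedicated constant could look like; the constant name below is hypothetical, not the one actually added.)

    // Hedged sketch: return a new, dedicated constant instead of reusing ML_INFERENCE_SAGEMAKER.
    // ML_INFERENCE_SAGEMAKER_CHAT_COMPLETION is a hypothetical name, not the one added in this PR.
    @Override
    public TransportVersion getMinimalSupportedVersion() {
        return TransportVersions.ML_INFERENCE_SAGEMAKER_CHAT_COMPLETION;
    }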

}
]
}
""".replaceAll("\\s+", "").replaceAll("\\n+", "") + "\n\n";
Contributor

Would XContentHelper.stripWhitespace() work here?
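(A hedged sketch of the suggested alternative, assuming XContentHelper.stripWhitespace(String) is usable in this test; the json variable is illustrative.)

    // Hedged sketch of the suggestion: strip insignificant whitespace with a JSON-aware helper
    // instead of regex replacement. Assumes XContentHelper.stripWhitespace(String) is available
    // here (it is declared to throw IOException, so the caller would handle or propagate that).
    String expected = XContentHelper.stripWhitespace(json) + "\n\n";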

@prwhelan enabled auto-merge (squash) May 27, 2025 17:41
@prwhelan removed the auto-backport (Automatically create backport pull requests when merged) label May 27, 2025
@prwhelan merged commit 2830768 into elastic:main May 27, 2025
17 of 18 checks passed
prwhelan added a commit to prwhelan/elasticsearch that referenced this pull request May 27, 2025
elasticsearchmachine pushed a commit that referenced this pull request May 28, 2025