Skip to content

Conversation

@jonathan-buttner
Copy link
Contributor

@jonathan-buttner jonathan-buttner commented Feb 5, 2025

This PR refactors the unified schema approach. Prior to this PR the to interact with the unified schema a request is made to _inference/completion/<inference endpoint id>/_unified

This PR refactors the code such that we no longer have a _unified endpoint. Instead we'll use the previous way with _stream:

PUT _inference/chat_completion/some_endpoint_id/_stream

To do this I created a new proxy action that is registered for the inference rest APIs. The reason we need a proxy action is because we need to know whether the request is the new unified schema or the input task_settings schema. The proxy action determines this by looking to see if the task type was included in the URL. If it wasn't, we retrieve the inference entity from storage and retrieve that task type that way. Once we have the task type we create a new request and route it to InferenceAction or UnifiedInferenceAction.

Other Notes

The unified action does not support non-streaming so if a request is made like:

POST http://localhost:9200/_inference/chat_completion/test2

An error will be returned

Error
{
    "error": {
        "root_cause": [
            {
                "type": "status_exception",
                "reason": "The [chat_completion] task type only supports streaming, please try again with the _stream API"
            }
        ],
        "type": "status_exception",
        "reason": "The [chat_completion] task type only supports streaming, please try again with the _stream API"
    },
    "status": 400
}

The stats/metrics aren't recorded until the internal action is handled.

Testing

Creating the endpoint

PUT http://localhost:9200/_inference/test2
{
    "service": "openai",
    "task_type": "chat_completion",
    "service_settings": {
        "api_key": <api key>,
        "model_id": "gpt-4o"
    }
}

Making a unified request

POST http://localhost:9200/_inference/chat_completion/test2/_stream
POST http://localhost:9200/_inference/test2/_stream
{
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's the price of a scarf?"
                }
            ]
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_price",
                "description": "Get the current price of a item",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "item": {
                            "id": "123"
                        }
                    }
                }
            }
        }
    ],
    "tool_choice": {
        "type": "function",
        "function": {
            "name": "get_current_price"
        }
    }
}

The response structure will be the same as it was before

@jonathan-buttner jonathan-buttner added >non-issue :ml Machine learning Team:ML Meta label for the ML team v9.0.0 v8.18.0 v8.18.1 v9.0.1 auto-backport Automatically create backport pull requests when merged labels Feb 5, 2025
@jonathan-buttner jonathan-buttner marked this pull request as ready for review February 6, 2025 13:26
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Left a comment about the internal action name everything else is great

*/
public class InferenceActionProxy extends ActionType<InferenceAction.Response> {
public static final InferenceActionProxy INSTANCE = new InferenceActionProxy();
public static final String NAME = "cluster:monitor/xpack/inference";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it is safe to use the old InferenceAction name can be used here when it has different request and response classes. I'm thinking about a mixed cluster. To be safe please give it a new name.

Get, Put and Delete are cluster:monitor/xpack/inference/[get|put|delete] this could be cluster:monitor/xpack/inference/post or cluster:monitor/xpack/inference/inference`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I'll switch to post 👍

…inference/action/InferenceActionProxy.java

Co-authored-by: David Kyle <[email protected]>
public class InferenceAction extends ActionType<InferenceAction.Response> {

public static final InferenceAction INSTANCE = new InferenceAction();
public static final String NAME = "cluster:monitor/xpack/inference";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidkyle just wanted to confirm that this is what we want to do here right? Changing it to internal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ yes internal is good


private void sendUnifiedCompletionRequest(InferenceActionProxy.Request request, ActionListener<InferenceAction.Response> listener) {
// format any validation exceptions from the rest -> transport path as UnifiedChatCompletionException
var unifiedErrorFormatListener = listener.delegateResponse((l, e) -> l.onFailure(UnifiedChatCompletionException.fromThrowable(e)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@jonathan-buttner jonathan-buttner merged commit ab48235 into elastic:main Feb 7, 2025
16 checks passed
@jonathan-buttner jonathan-buttner deleted the ml-proxy-action branch February 7, 2025 14:57
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.0 Commit could not be cherrypicked due to conflicts
8.18 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 121804

jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Feb 7, 2025
…ic#121804)

* Adding proxy action

* [CI] Auto commit changes from spotless

* Incrementing reference count for body content and fixing tests

* [CI] Auto commit changes from spotless

* Refactoring

* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/action/InferenceActionProxy.java

Co-authored-by: David Kyle <[email protected]>

* Addressing feedback

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: David Kyle <[email protected]>
(cherry picked from commit ab48235)
jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Feb 7, 2025
…ic#121804)

* Adding proxy action

* [CI] Auto commit changes from spotless

* Incrementing reference count for body content and fixing tests

* [CI] Auto commit changes from spotless

* Refactoring

* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/action/InferenceActionProxy.java

Co-authored-by: David Kyle <[email protected]>

* Addressing feedback

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: David Kyle <[email protected]>
(cherry picked from commit ab48235)

# Conflicts:
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rest/BaseInferenceActionTests.java
@jonathan-buttner
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x
9.0
8.18

Questions ?

Please refer to the Backport tool documentation

jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Feb 7, 2025
…ic#121804)

* Adding proxy action

* [CI] Auto commit changes from spotless

* Incrementing reference count for body content and fixing tests

* [CI] Auto commit changes from spotless

* Refactoring

* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/action/InferenceActionProxy.java

Co-authored-by: David Kyle <[email protected]>

* Addressing feedback

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: David Kyle <[email protected]>
(cherry picked from commit ab48235)

# Conflicts:
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rest/BaseInferenceActionTests.java
elasticsearchmachine pushed a commit that referenced this pull request Feb 7, 2025
…) (#122042)

* Adding proxy action

* [CI] Auto commit changes from spotless

* Incrementing reference count for body content and fixing tests

* [CI] Auto commit changes from spotless

* Refactoring

* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/action/InferenceActionProxy.java

Co-authored-by: David Kyle <[email protected]>

* Addressing feedback

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: David Kyle <[email protected]>
(cherry picked from commit ab48235)

# Conflicts:
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rest/BaseInferenceActionTests.java
elasticsearchmachine pushed a commit that referenced this pull request Feb 7, 2025
…) (#122040)

* Adding proxy action

* [CI] Auto commit changes from spotless

* Incrementing reference count for body content and fixing tests

* [CI] Auto commit changes from spotless

* Refactoring

* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/action/InferenceActionProxy.java

Co-authored-by: David Kyle <[email protected]>

* Addressing feedback

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: David Kyle <[email protected]>
(cherry picked from commit ab48235)
elasticsearchmachine pushed a commit that referenced this pull request Feb 7, 2025
…) (#122045)

* Adding proxy action

* [CI] Auto commit changes from spotless

* Incrementing reference count for body content and fixing tests

* [CI] Auto commit changes from spotless

* Refactoring

* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/action/InferenceActionProxy.java

Co-authored-by: David Kyle <[email protected]>

* Addressing feedback

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: David Kyle <[email protected]>
(cherry picked from commit ab48235)

# Conflicts:
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/rest/BaseInferenceActionTests.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged backport pending :ml Machine learning >non-issue Team:ML Meta label for the ML team v8.18.0 v8.18.1 v8.19.0 v9.0.0 v9.0.1 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants