[ML] Inference API _services retrieves authorization information directly from EIS #134398

jonathan-buttner · 2025-09-09T20:13:46Z

This PR modifies the _inference/_services API to call EIS directly for authorization information, which determines the configuration returned by the API.

The authorization response will dictate whether we show EIS in the response and the task types that are included for the EIS configuration.

Follow-up PRs will modify how authorization works in general, but I'm going to try to do that in small chunks.

jonathan-buttner · 2025-09-09T20:14:59Z

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java

        components.add(httpClientManager);
        components.add(inferenceStatsBinding);
+        components.add(authorizationHandler);
+        components.add(new PluginComponentBinding<>(Sender.class, elasicInferenceServiceFactory.get().createSender()));


Without this I get a binding error when running

jonathan-buttner · 2025-09-09T20:15:36Z

...gin/src/main/java/org/elasticsearch/xpack/inference/mock/TestCompletionServiceExtension.java

                    );

                    return new InferenceServiceConfiguration.Builder().setService(NAME)
+                        .setName(NAME)


These aren't really necessary, but I was running into a test failure because this field didn't exist until I switched the sorting field. Figured I'd leave the fix in though.

jonathan-buttner · 2025-09-09T20:25:20Z

...rc/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java

    @Override
    public Set<TaskType> supportedStreamingTasks() {
-        return authorizationHandler.supportedStreamingTasks();
+        return EnumSet.of(TaskType.CHAT_COMPLETION);


Previously, if the cluster wasn't authorized for chat completion, we'd return a vague error about not being able to stream. With this change we allow the request to be sent to EIS, and if it isn't authorized, EIS will return a failure.

elasticsearchmachine · 2025-09-10T15:33:07Z

Pinging @elastic/ml-core (Team:ML)

DonalEvans · 2025-09-12T21:52:15Z

...rc/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java

+        // This shouldn't be called because the configuration changes based on the authorization
+        // Instead, retrieve the authorization directly from the EIS gateway and use the static method
+        // ElasticInferenceService.Configuration#createConfiguration() to create a configuration based on the authorization response


Could this be a javadoc comment instead, so that the method you're referencing stays correct even if it's modified?

davidkyle

LGTM

Some minor comments

davidkyle · 2025-09-17T13:39:21Z

...ce-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceGetServicesIT.java

+        super.setUp();
+        // Ensure the mock EIS server has an authorized response ready before each test because each test will
+        // use the services API which makes a call to EIS
+        mockEISServer.enqueueAuthorizeAllModelsResponse();


The init() method below also calls mockEISServer.enqueueAuthorizeAllModelsResponse(). Is it equivalent to change the annotation on that method to @Before?

Yeah, the original reason for using @BeforeClass is that I ran into some weird issues locally. The original PR that added the @BeforeClass is here: #128640

For some background, when the node for the test starts up it will reach out to EIS and get the auth response. If that fails (there isn't a response queued in the mock server), then the tests will fail. What I was observing is that the base class's static logic would only be executed once, regardless of how many subclasses used the base. This resulted in the first test class succeeding but the second test class that leveraged the base failing. To get around this I added the @BeforeClass and it seemed to fix the issue. The reason we need this in @BeforeClass is that a response must be queued before Elasticsearch is started, and Elasticsearch is started only once, before all the tests run.
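The queueing constraint described above can be illustrated with a minimal sketch. Everything here is hypothetical (`QueuedMockServer`, `startNode`, `callServicesApi` are not the real test classes); it only assumes that the mock server hands out queued responses in FIFO order, that node start-up consumes one response, and that each services API call consumes another.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MockQueueLifecycleSketch {

    /** Hypothetical stand-in for the mock EIS server: replays queued responses in FIFO order. */
    static final class QueuedMockServer {
        private final Deque<String> responses = new ArrayDeque<>();

        void enqueue(String response) {
            responses.addLast(response);
        }

        /** Returns the next queued response, or throws if nothing was queued. */
        String nextResponse() {
            String response = responses.pollFirst();
            if (response == null) {
                throw new IllegalStateException("no response queued in the mock server");
            }
            return response;
        }
    }

    /** Simulates node start-up, which consumes one authorization response. */
    static String startNode(QueuedMockServer server) {
        return server.nextResponse();
    }

    /** Simulates one _services call, which consumes another authorization response. */
    static String callServicesApi(QueuedMockServer server) {
        return server.nextResponse();
    }

    public static void main(String[] args) {
        QueuedMockServer server = new QueuedMockServer();

        // Class-level setup: a response must be queued before the node starts.
        server.enqueue("authorized");
        System.out.println("startup: " + startNode(server));

        // Per-test setup: each test that calls the services API needs its own response.
        for (int test = 1; test <= 2; test++) {
            server.enqueue("authorized");
            System.out.println("test " + test + ": " + callServicesApi(server));
        }
    }
}
```

In this model a per-test hook alone is not enough, because the start-up call has already happened by the time the first per-test hook runs; that is the reason given above for keeping both a class-level and a per-test enqueue.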

davidkyle · 2025-09-18T08:49:02Z

.../main/java/org/elasticsearch/xpack/inference/action/TransportGetInferenceServicesAction.java

+                }
+            );
+
+            getServiceConfigurationsForServices(availableServices, mergeEisConfigListener);


getServiceConfigurationsForServices() is a synchronous method and could be written with a return type instead of a listener. I think that would make this code easier to read, as you wouldn't need to define the merge listener.

    }).<List<InferenceServiceConfiguration>>andThen((configurationListener, authorizationModel) -> {
        var serviceConfigs = getServiceConfigurationsForServices(availableServices);
        if (authorizationModel.isAuthorized() == false) {
            delegate.onResponse(serviceConfigs);
            return;
        }
        if (requestedTaskType != null && authorizationModel.getAuthorizedTaskTypes().contains(requestedTaskType) == false) {
            delegate.onResponse(serviceConfigs);
            return;
        }
        var config = ElasticInferenceService.createConfiguration(authorizationModel.getAuthorizedTaskTypes());
        serviceConfigs.add(config);
        serviceConfigs.sort(Comparator.comparing(InferenceServiceConfiguration::getService));
        delegate.onResponse(serviceConfigs);
    }

🤦‍♂️ Thank you, for some reason I thought it needed to use a listener.
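For readers following the thread, here is a self-contained sketch of the merge decision once the configuration lookup is synchronous. The types `ServiceConfig` and `Authorization`, the `merge` method, and the service names `"elastic"` and `"openai"` are simplified, hypothetical stand-ins for illustration, not the real Elasticsearch classes; the branching mirrors the suggestion above (skip EIS when unauthorized or when the requested task type is not authorized, otherwise add it and sort by service name).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class ServicesMergeSketch {
    enum TaskType { SPARSE_EMBEDDING, TEXT_EMBEDDING, RERANK, CHAT_COMPLETION }

    /** Hypothetical, simplified stand-in for a service configuration. */
    record ServiceConfig(String service, Set<TaskType> taskTypes) {}

    /** Hypothetical, simplified stand-in for the EIS authorization response. */
    record Authorization(Set<TaskType> authorizedTaskTypes) {
        boolean isAuthorized() {
            return authorizedTaskTypes.isEmpty() == false;
        }
    }

    /** Returns the other services' configurations, plus an EIS entry when the authorization allows it. */
    static List<ServiceConfig> merge(List<ServiceConfig> others, Authorization auth, TaskType requestedTaskType) {
        List<ServiceConfig> result = new ArrayList<>(others);

        if (auth.isAuthorized() == false) {
            return result;
        }
        if (requestedTaskType != null && auth.authorizedTaskTypes().contains(requestedTaskType) == false) {
            return result;
        }

        // "elastic" is an assumed service name used only for this illustration.
        result.add(new ServiceConfig("elastic", auth.authorizedTaskTypes()));
        result.sort(Comparator.comparing(ServiceConfig::service));
        return result;
    }

    public static void main(String[] args) {
        List<ServiceConfig> others = List.of(new ServiceConfig("openai", EnumSet.of(TaskType.TEXT_EMBEDDING)));
        Authorization auth = new Authorization(EnumSet.of(TaskType.CHAT_COMPLETION));

        System.out.println(merge(others, auth, null));
        System.out.println(merge(others, auth, TaskType.RERANK));
    }
}
```

Because the lookup returns a value, the decision fits in one method with early returns and no separate merge listener is needed, which is the readability gain the review comment points at.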

davidkyle · 2025-09-18T08:56:50Z

.../main/java/org/elasticsearch/xpack/inference/action/TransportGetInferenceServicesAction.java

            .stream()
            .filter(
-                service -> service.getValue().hideFromConfigurationApi() == false
+                // exclude EIS here because the hideFromConfigurationApi() is not supported


I was slightly confused about "hideFromConfigurationApi() is not supported" in this comment

Suggested change

-                // exclude EIS here because the hideFromConfigurationApi() is not supported
+                // Exclude EIS as the EIS specific configurations are handled separately

davidkyle · 2025-09-18T09:12:56Z

...rc/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java

+        configurationMap.put(
+            MODEL_ID,
+            new SettingsConfiguration.Builder(
+                EnumSet.of(TaskType.SPARSE_EMBEDDING, TaskType.CHAT_COMPLETION, TaskType.RERANK, TaskType.TEXT_EMBEDDING)


Should this be the intersection of enabledTaskTypes and the full set?

    EnumSet.of(TaskType.SPARSE_EMBEDDING, TaskType.CHAT_COMPLETION, TaskType.RERANK, TaskType.TEXT_EMBEDDING).retainAll(enabledTaskTypes);

The list of task types here tells the UI for which task types this field should be configurable. That should stay the same regardless of whether the user is authorized for a specific task type. There's a top-level field for task types that indicates which ones are authorized, and that's set here:

    return new InferenceServiceConfiguration.Builder().setService(NAME)
        .setName(SERVICE_NAME)
        .setTaskTypes(enabledTaskTypes) // <--------
        .setConfigurations(configurationMap)
        .build();
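A small sketch of the distinction, using a hypothetical local `TaskType` enum and helper names rather than the real classes: the field-level set stays fixed, while the top-level set follows the authorization. It also shows one detail worth knowing if an intersection were ever wanted: `Set.retainAll` mutates the receiver and returns a `boolean`, so the result has to be built on a copy rather than used inline as a set expression.

```java
import java.util.EnumSet;
import java.util.Set;

final class TaskTypeSetsSketch {
    enum TaskType { SPARSE_EMBEDDING, TEXT_EMBEDDING, RERANK, CHAT_COMPLETION }

    /** Task types for which the field is configurable; independent of authorization. */
    static EnumSet<TaskType> configurableTaskTypes() {
        return EnumSet.of(TaskType.SPARSE_EMBEDDING, TaskType.CHAT_COMPLETION, TaskType.RERANK, TaskType.TEXT_EMBEDDING);
    }

    /** Builds the intersection on a fresh set, because retainAll mutates and returns a boolean. */
    static EnumSet<TaskType> intersection(Set<TaskType> left, Set<TaskType> right) {
        // noneOf + addAll is used instead of copyOf so that an empty non-EnumSet input is also accepted.
        EnumSet<TaskType> result = EnumSet.noneOf(TaskType.class);
        result.addAll(left);
        result.retainAll(right);
        return result;
    }

    public static void main(String[] args) {
        // Assumed authorization result for this illustration only.
        Set<TaskType> enabledTaskTypes = EnumSet.of(TaskType.CHAT_COMPLETION);

        System.out.println("field-level:  " + configurableTaskTypes());
        System.out.println("top-level:    " + enabledTaskTypes);
        System.out.println("intersection: " + intersection(configurableTaskTypes(), enabledTaskTypes));
    }
}
```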

get services querying eis gateway for info

89ef432

jonathan-buttner added the >non-issue, :ml (Machine learning), Team:ML (Meta label for the ML team), and v9.2.0 labels Sep 9, 2025

[CI] Auto commit changes from spotless

ef6e1c9

jonathan-buttner commented Sep 9, 2025

jonathan-buttner added 4 commits September 9, 2025 16:25

Adding fixes

5bf5428

Merge branch 'ml-eis-services-api-direct' of github.com:jonathan-butt…

cecbc82

…ner/elasticsearch into ml-eis-services-api-direct

Fixing tests

1d1adb8

Merge branch 'main' of github.com:elastic/elasticsearch into ml-eis-s…

dfba0f9

…ervices-api-direct

jonathan-buttner marked this pull request as ready for review September 10, 2025 15:32

DonalEvans reviewed Sep 12, 2025

jonathan-buttner and others added 2 commits September 15, 2025 16:35

Address feedback for javadoc

92e982c

Merge branch 'main' into ml-eis-services-api-direct

b86a5e1

DonalEvans approved these changes Sep 16, 2025

davidkyle approved these changes Sep 18, 2025

jonathan-buttner added 2 commits September 18, 2025 14:26

Addressing feedback

30cf241

Merge branch 'main' of github.com:elastic/elasticsearch into ml-eis-s…

17b54a3

…ervices-api-direct

jonathan-buttner enabled auto-merge (squash) September 18, 2025 20:14

Merge branch 'main' into ml-eis-services-api-direct

6583ca4

jonathan-buttner merged commit aae1ffc into elastic:main Sep 18, 2025
34 checks passed

jonathan-buttner deleted the ml-eis-services-api-direct branch September 19, 2025 14:39

ioanatia mentioned this pull request Oct 15, 2025

Default semantic_text fields to ELSER on EIS when available #134708

Merged
