Merged
Changes from 20 commits
Commits
64 commits
ef50c43
Add Google Model Garden Anthropic integration
Jan-Kazlouski-elastic Sep 3, 2025
1de1067
Clean up AnthropicChatCompletionStreamingProcessor
Jan-Kazlouski-elastic Sep 3, 2025
ae4576c
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 5, 2025
ea9f933
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 8, 2025
9f61342
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 9, 2025
c3ce521
Enhance GoogleVertexAiChatCompletionServiceSettings to support option…
Jan-Kazlouski-elastic Sep 9, 2025
fc136a5
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 9, 2025
4e6ccee
Add extractOptionalUri method and corresponding tests for URI extraction
Jan-Kazlouski-elastic Sep 9, 2025
07eed54
Add GoogleModelGardenProvider support to chat completion models and t…
Jan-Kazlouski-elastic Sep 9, 2025
1adf958
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 10, 2025
a4ad1c5
Enhance AnthropicChatCompletionStreamingProcessor and related classes…
Jan-Kazlouski-elastic Sep 10, 2025
b143661
Refactor AnthropicChatCompletionResponseHandler to use a custom error…
Jan-Kazlouski-elastic Sep 10, 2025
0712434
Add unit tests for AnthropicChatCompletionStreamingProcessor to valid…
Jan-Kazlouski-elastic Sep 11, 2025
fa44bd4
Add unit tests for GoogleModelGardenAnthropicChatCompletionRequestEnt…
Jan-Kazlouski-elastic Sep 11, 2025
8a1e710
Add support for Anthropic provider in Google Vertex AI chat completio…
Jan-Kazlouski-elastic Sep 11, 2025
ff63315
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 11, 2025
93cb5ad
Add changelog
Jan-Kazlouski-elastic Sep 11, 2025
54ff2ff
Refactor switch case in GoogleVertexAiActionCreator to handle null case
Jan-Kazlouski-elastic Sep 11, 2025
67d5d8a
Validate service settings for Google Vertex AI model configuration
Jan-Kazlouski-elastic Sep 11, 2025
a4540d3
Enhance Anthropic model tests to validate URI handling and provider r…
Jan-Kazlouski-elastic Sep 11, 2025
4d1a6fa
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 12, 2025
8f60605
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 15, 2025
f4be58b
[CI] Auto commit changes from spotless
Sep 15, 2025
d926c71
Refactor switch case in GoogleVertexAiService to handle null case
Jan-Kazlouski-elastic Sep 16, 2025
954c423
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 16, 2025
fbb0cf7
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 18, 2025
386eedf
Simplify version check in GoogleVertexAiChatCompletionServiceSettings
Jan-Kazlouski-elastic Sep 18, 2025
32b44b2
Make GOOGLE provider default for GoogleModelGarden integration
Jan-Kazlouski-elastic Sep 18, 2025
63ea4b8
Update anthropic_version to vertex-2024-10-22 in request entity and t…
Jan-Kazlouski-elastic Sep 18, 2025
76b27a5
Refactor Google Vertex AI request handling to improve provider manage…
Jan-Kazlouski-elastic Sep 18, 2025
eb4c328
Enhance validation for Google Model Garden settings to ensure require…
Jan-Kazlouski-elastic Sep 18, 2025
c61b069
Remove uri streamingUri and provider from rate limit grouping hash ca…
Jan-Kazlouski-elastic Sep 18, 2025
c9631fb
Refactor null and empty checks for projectId, location, and modelId i…
Jan-Kazlouski-elastic Sep 18, 2025
4072327
Refactor Google Model Garden integration to include task settings in …
Jan-Kazlouski-elastic Sep 18, 2025
5e5c985
Revert "Update anthropic_version to vertex-2024-10-22 in request enti…
Jan-Kazlouski-elastic Sep 19, 2025
6987c00
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 20, 2025
001670c
Refactor Google Vertex AI settings to utilize GoogleVertexAiUtils for…
Jan-Kazlouski-elastic Sep 20, 2025
1d27867
[CI] Update transport version definitions
Sep 20, 2025
7c6940b
Update anthropic_version in tests and enhance validation logic for Go…
Jan-Kazlouski-elastic Sep 20, 2025
c14b8c6
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 22, 2025
b715a83
Update versions
Jan-Kazlouski-elastic Sep 22, 2025
01c7348
Enhance task settings validation in GoogleVertexAiChatCompletionModel
Jan-Kazlouski-elastic Sep 22, 2025
af5c93a
Address comments regarding anthropic version and configuration
Jan-Kazlouski-elastic Sep 22, 2025
9fe4bdc
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 23, 2025
398992d
[CI] Update transport version definitions
Sep 23, 2025
d0d2e57
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 24, 2025
bd43f4f
Add nullable annotation for maxTokens parameter in GoogleVertexAiChat…
Jan-Kazlouski-elastic Sep 24, 2025
df27cc0
[CI] Update transport version definitions
Sep 24, 2025
b1823ee
Clarify URI handling logic in GoogleVertexAiChatCompletionModel comments
Jan-Kazlouski-elastic Sep 24, 2025
9fd5e03
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 25, 2025
370c6dc
Make maxTokens nullable
Jan-Kazlouski-elastic Sep 25, 2025
87c1464
[CI] Update transport version definitions
Sep 25, 2025
3e498ce
Fixed unit tests
Jan-Kazlouski-elastic Sep 25, 2025
066b396
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 26, 2025
617c090
[CI] Update transport version definitions
Sep 26, 2025
ffc8e4f
Fix validation logic for Google Model Garden and Vertex AI settings
Jan-Kazlouski-elastic Sep 26, 2025
ac47f22
[CI] Update transport version definitions
Sep 26, 2025
e47e311
Add validation tests for Google Vertex AI and Model Garden settings
Jan-Kazlouski-elastic Sep 26, 2025
2852695
Refactor validation logic for Google Vertex AI and Model Garden settings
Jan-Kazlouski-elastic Sep 26, 2025
23b86ec
Add comment
Jan-Kazlouski-elastic Sep 26, 2025
d1b63bf
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 29, 2025
89a744e
Update Google Vertex AI Task Settings parsing logic and AnthropicChat…
Jan-Kazlouski-elastic Sep 29, 2025
2847a1f
Merge remote-tracking branch 'origin/main' into google-model-garden-i…
Jan-Kazlouski-elastic Sep 29, 2025
55da91c
Merge branch 'main' into google-model-garden-integration
jonathan-buttner Sep 29, 2025
5 changes: 5 additions & 0 deletions docs/changelog/134080.yaml
@@ -0,0 +1,5 @@
pr: 134080
summary: Added Google Model Garden Anthropic Completion and Chat Completion support to the Inference Plugin
area: Machine Learning
type: enhancement
issues: []
2 changes: 2 additions & 0 deletions server/src/main/java/org/elasticsearch/TransportVersions.java
@@ -200,6 +200,7 @@ static TransportVersion def(int id) {
public static final TransportVersion ESQL_DOCUMENTS_FOUND_AND_VALUES_LOADED_8_19 = def(8_841_0_61);
public static final TransportVersion ESQL_PROFILE_INCLUDE_PLAN_8_19 = def(8_841_0_62);
public static final TransportVersion INITIAL_ELASTICSEARCH_8_19_4 = def(8_841_0_68);
public static final TransportVersion ML_INFERENCE_GOOGLE_MODEL_GARDEN_ADDED_8_19 = def(8_841_0_69);
Jan-Kazlouski-elastic (Contributor Author) Sep 11, 2025

Let me know if this needs to be removed. I haven't seen backports in a while, but Google Vertex AI has been there for quite some time, so we'd probably require one.

Contributor

Let's remove this, we won't be backporting the changes.

Contributor Author

Removed.

public static final TransportVersion INITIAL_ELASTICSEARCH_9_0 = def(9_000_0_00);
public static final TransportVersion REMOVE_SNAPSHOT_FAILURES_90 = def(9_000_0_01);
public static final TransportVersion TRANSPORT_STATS_HANDLING_TIME_REQUIRED_90 = def(9_000_0_02);
@@ -327,6 +328,7 @@ static TransportVersion def(int id) {
public static final TransportVersion ML_INFERENCE_IBM_WATSONX_COMPLETION_ADDED = def(9_115_0_00);
public static final TransportVersion INFERENCE_API_EIS_DIAGNOSTICS = def(9_156_0_00);
public static final TransportVersion ML_INFERENCE_ENDPOINT_CACHE = def(9_157_0_00);
public static final TransportVersion ML_INFERENCE_GOOGLE_MODEL_GARDEN_ADDED = def(9_158_0_00);

/*
* STOP! READ THIS FIRST! No, really,
@@ -58,11 +58,11 @@ public record UnifiedCompletionRequest(
private static final String ROLE_FIELD = "role";
private static final String CONTENT_FIELD = "content";
private static final String STOP_FIELD = "stop";
- private static final String TEMPERATURE_FIELD = "temperature";
- private static final String TOOL_CHOICE_FIELD = "tool_choice";
- private static final String TOOL_FIELD = "tools";
+ public static final String TEMPERATURE_FIELD = "temperature";
+ public static final String TOOL_CHOICE_FIELD = "tool_choice";
+ public static final String TOOL_FIELD = "tools";
private static final String TEXT_FIELD = "text";
- private static final String TYPE_FIELD = "type";
+ public static final String TYPE_FIELD = "type";
private static final String MODEL_FIELD = "model";
private static final String MAX_COMPLETION_TOKENS_FIELD = "max_completion_tokens";
private static final String MAX_TOKENS_FIELD = "max_tokens";
@@ -314,6 +314,18 @@ public static URI extractUri(Map<String, Object> map, String fieldName, Validati
return convertToUri(parsedUrl, fieldName, ModelConfigurations.SERVICE_SETTINGS, validationException);
}

/**
* Extracts an optional URI from the map. If the field is not present, null is returned. If the field is present but invalid,
* an error is added to the validation exception.
* @param map the map to extract the URI from
* @param fieldName the field name to extract
* @param validationException the validation exception to add errors to
* @return the extracted URI or null if not present
*/
public static URI extractOptionalUri(Map<String, Object> map, String fieldName, ValidationException validationException) {
String parsedUrl = extractOptionalString(map, fieldName, ModelConfigurations.SERVICE_SETTINGS, validationException);
return convertToUri(parsedUrl, fieldName, ModelConfigurations.SERVICE_SETTINGS, validationException);
}

public static URI convertToUri(@Nullable String url, String settingName, String settingScope, ValidationException validationException) {
try {
return createOptionalUri(url);
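
The optional-URI behaviour described in the Javadoc of this hunk can be illustrated in isolation. What follows is a minimal, self-contained sketch and not the PR's code: `OptionalUriSketch` and `Errors` are hypothetical names (the latter stands in for `ValidationException`), and parsing uses plain `java.net.URI` instead of the plugin's `createOptionalUri`. Returning null for an invalid value is an assumption of the sketch.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OptionalUriSketch {

    /** Hypothetical stand-in for ValidationException: it only collects messages. */
    static final class Errors {
        final List<String> messages = new ArrayList<>();
    }

    /**
     * Returns the parsed URI, or null when the field is absent.
     * An invalid value records an error and (by assumption) also yields null.
     */
    static URI extractOptionalUri(Map<String, Object> map, String fieldName, Errors errors) {
        Object raw = map.get(fieldName);
        if (raw == null) {
            return null; // an absent field is not an error
        }
        if (!(raw instanceof String)) {
            errors.messages.add("[" + fieldName + "] must be a string");
            return null;
        }
        try {
            return new URI((String) raw);
        } catch (URISyntaxException e) {
            errors.messages.add("[" + fieldName + "] is not a valid URI: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        Errors errors = new Errors();
        Map<String, Object> settings = new HashMap<>();
        settings.put("uri", "https://example.com/v1/messages");

        System.out.println(extractOptionalUri(settings, "uri", errors));
        System.out.println(extractOptionalUri(settings, "streaming_uri", errors));
        System.out.println(errors.messages);
    }
}
```

The point of the sketch is the three-way split: a missing field is silently null, a well-formed value is returned, and a malformed value is reported through the shared error collector so that all settings problems can be surfaced together.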
@@ -0,0 +1,60 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

package org.elasticsearch.xpack.inference.services.anthropic;

import org.elasticsearch.inference.InferenceServiceResults;
import org.elasticsearch.xpack.core.inference.results.StreamingUnifiedChatCompletionResults;
import org.elasticsearch.xpack.core.inference.results.UnifiedChatCompletionException;
import org.elasticsearch.xpack.inference.external.http.HttpResult;
import org.elasticsearch.xpack.inference.external.http.retry.ChatCompletionErrorResponseHandler;
import org.elasticsearch.xpack.inference.external.http.retry.ResponseParser;
import org.elasticsearch.xpack.inference.external.http.retry.UnifiedChatCompletionErrorParserContract;
import org.elasticsearch.xpack.inference.external.http.retry.UnifiedChatCompletionErrorResponseUtils;
import org.elasticsearch.xpack.inference.external.request.Request;
import org.elasticsearch.xpack.inference.external.response.streaming.ServerSentEventParser;
import org.elasticsearch.xpack.inference.external.response.streaming.ServerSentEventProcessor;
import org.elasticsearch.xpack.inference.services.anthropic.response.AnthropicChatCompletionResponseEntity;

import java.util.concurrent.Flow;

/**
* Handles streaming chat completion responses and error parsing for Anthropic inference endpoints.
* Adapts the AnthropicResponseHandler to support chat completion schema.
*/
public class AnthropicChatCompletionResponseHandler extends AnthropicResponseHandler {
private static final String ANTHROPIC_ERROR = "anthropic_error";
private static final UnifiedChatCompletionErrorParserContract ANTHROPIC_ERROR_PARSER = UnifiedChatCompletionErrorResponseUtils
.createErrorParserWithStringify(ANTHROPIC_ERROR);

private final ChatCompletionErrorResponseHandler chatCompletionErrorResponseHandler;

public AnthropicChatCompletionResponseHandler(String requestType) {
this(requestType, AnthropicChatCompletionResponseEntity::fromResponse);
}

private AnthropicChatCompletionResponseHandler(String requestType, ResponseParser parseFunction) {
super(requestType, parseFunction, true);
this.chatCompletionErrorResponseHandler = new ChatCompletionErrorResponseHandler(ANTHROPIC_ERROR_PARSER);
}

@Override
public InferenceServiceResults parseResult(Request request, Flow.Publisher<HttpResult> flow) {
var serverSentEventProcessor = new ServerSentEventProcessor(new ServerSentEventParser());
var anthropicProcessor = new AnthropicChatCompletionStreamingProcessor(
(m, e) -> chatCompletionErrorResponseHandler.buildMidStreamChatCompletionError(request.getInferenceEntityId(), m, e)
);
flow.subscribe(serverSentEventProcessor);
serverSentEventProcessor.subscribe(anthropicProcessor);
return new StreamingUnifiedChatCompletionResults(anthropicProcessor);
}

@Override
protected UnifiedChatCompletionException buildError(String message, Request request, HttpResult result) {
return chatCompletionErrorResponseHandler.buildChatCompletionError(message, request, result);
}
}
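
The `parseResult` override above wires two `java.util.concurrent.Flow` stages in sequence: the HTTP result publisher feeds a server-sent-event processor, which feeds the Anthropic-specific processor. The sketch below shows that chaining pattern in isolation. Every name in it (`ProcessorChainSketch`, `MappingProcessor`, `Collector`, `fromList`) is hypothetical and does not appear in the PR, the source publisher ignores backpressure for brevity, and the `data: ` prefix handling is a simplification rather than a real SSE parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;
import java.util.function.Function;

class ProcessorChainSketch {

    /** Synchronous processor: maps each upstream item and forwards it to a single downstream subscriber. */
    static final class MappingProcessor<T, R> implements Flow.Processor<T, R> {
        private final Function<T, R> mapper;
        private Flow.Subscriber<? super R> downstream;
        private Flow.Subscription upstream;

        MappingProcessor(Function<T, R> mapper) {
            this.mapper = mapper;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super R> subscriber) {
            downstream = subscriber;
            // Downstream demand is passed straight through to the upstream subscription.
            subscriber.onSubscribe(new Flow.Subscription() {
                @Override public void request(long n) { if (upstream != null) upstream.request(n); }
                @Override public void cancel() { if (upstream != null) upstream.cancel(); }
            });
        }

        @Override public void onSubscribe(Flow.Subscription subscription) { upstream = subscription; }
        @Override public void onNext(T item) { downstream.onNext(mapper.apply(item)); }
        @Override public void onError(Throwable t) { downstream.onError(t); }
        @Override public void onComplete() { downstream.onComplete(); }
    }

    /** Terminal subscriber that requests everything and stores what it receives. */
    static final class Collector<T> implements Flow.Subscriber<T> {
        final List<T> items = new ArrayList<>();
        @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
        @Override public void onNext(T item) { items.add(item); }
        @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
        @Override public void onComplete() { }
    }

    /** Simplified source: emits the whole list on the first request, ignoring the requested amount. */
    static <T> Flow.Publisher<T> fromList(List<T> source) {
        return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
            private boolean done;
            @Override public void request(long n) {
                if (done) return;
                done = true;
                for (T item : source) subscriber.onNext(item);
                subscriber.onComplete();
            }
            @Override public void cancel() { done = true; }
        });
    }

    /** Wires source -> generic stage -> provider-specific stage -> sink, in the same order as parseResult. */
    static List<String> run(List<String> rawEvents) {
        var eventStage = new MappingProcessor<String, String>(
            line -> line.startsWith("data: ") ? line.substring("data: ".length()) : line
        );
        var providerStage = new MappingProcessor<String, String>(String::toUpperCase);
        var sink = new Collector<String>();

        fromList(rawEvents).subscribe(eventStage);
        eventStage.subscribe(providerStage);
        providerStage.subscribe(sink); // nothing flows until the final subscriber requests items
        return sink.items;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("data: hello", "data: world")));
    }
}
```

As in the handler, the stages are connected upstream-first and no data moves until the last consumer signals demand, which is why the generic event stage can be reused while only the provider-specific stage changes.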