

@lhoet-google (Collaborator) commented Apr 30, 2025

Keeping this PR so I can track the progress and see the code. It's still in early development, so a lot of things can change.

Once it's finished, it will be squashed into one commit and submitted as a PR to the main elasticsearch repo.

return messageRoleLowered;
}

// TODO: Is it OK to throw an IOException here?


Might be better as an ElasticsearchStatusException with RestStatus.BAD_REQUEST since it is an unsupported configuration that the user has to take action on. Preferably, this is validated within GoogleVertexAiService but I'm okay with it being this late in the call chain as well
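To make the suggestion concrete, here is a minimal sketch of what that could look like, assuming the mapping lives in a helper like messageRoleToGoogleVertexAiSupportedRole; the class wrapper and the supported-role set below are illustrative, not the PR's actual code:

import org.elasticsearch.ElasticsearchStatusException;
import org.elasticsearch.rest.RestStatus;

import java.util.Locale;

class RoleMappingSketch {
    // Illustrative: reject unsupported roles with a 400 instead of an IOException.
    static String messageRoleToGoogleVertexAiSupportedRole(String messageRole) {
        var messageRoleLowered = messageRole.toLowerCase(Locale.ROOT);
        return switch (messageRoleLowered) {
            // Assumed supported roles; the real mapping in the PR may differ.
            case "user", "model" -> messageRoleLowered;
            case "assistant" -> "model";
            default -> throw new ElasticsearchStatusException(
                "Unsupported message role [{}] for Google Vertex AI chat completion",
                RestStatus.BAD_REQUEST,
                messageRole
            );
        };
    }
}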

builder.field(ROLE, messageRoleToGoogleVertexAiSupportedRole(message.role()));
builder.startArray(PARTS);
builder.startObject();
builder.field(TEXT, message.content().toString());


lhoet-google and others added 14 commits May 14, 2025 12:22
# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googlevertexai/GoogleVertexAiServiceTests.java
Implemented basic unit testing.
Will improve in the next commit.

As of now, we want to find a way to mock certain parts of the initialization of the Google VertexAI service that trigger the authorization decorator, without using tools like PowerMock or changing the code too much.
Implemented a test case for persisted config with secrets.
public InferenceServiceResults parseResult(Request request, Flow.Publisher<HttpResult> flow) {
assert request.isStreaming() : "GoogleVertexAiUnifiedChatCompletionResponseHandler only supports streaming requests";

var serverSentEventProcessor = new JsonArrayPartsEventProcessor(new JsonArrayPartsEventParser());


I think if we send the request with the alt=sse query param, the API will respond in SSE, and then we can reuse the existing ServerSentEventProcessor: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/googleaistudio/GoogleAiStudioResponseHandler.java#L90

That is at least what happens when I test the API with curl, and we'd then have less code to maintain. JsonArrayPartsEventParser is cleverly written though, well done.
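To illustrate the alt=sse idea, here is a rough sketch; the endpoint path and the use of Apache's URIBuilder are assumptions about how the request URI gets built, not the PR's actual request code:

import org.apache.http.client.utils.URIBuilder;

import java.net.URI;
import java.net.URISyntaxException;

class VertexAiStreamingUriSketch {
    // Sketch: with alt=sse, streamGenerateContent responds as Server-Sent Events,
    // so the existing ServerSentEventProcessor could be reused for parsing.
    static URI buildStreamingUri(String location, String projectId, String modelId) throws URISyntaxException {
        return new URIBuilder().setScheme("https")
            .setHost(location + "-aiplatform.googleapis.com")
            .setPath(
                "/v1/projects/" + projectId + "/locations/" + location + "/publishers/google/models/" + modelId + ":streamGenerateContent"
            )
            .addParameter("alt", "sse")
            .build();
    }
}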

ActionListener<InferenceServiceResults> listener
) {

var chatInputs = (UnifiedChatInput) inferenceInputs;


Suggested change
var chatInputs = (UnifiedChatInput) inferenceInputs;
var chatInputs = inferenceInputs.castTo(UnifiedChatInput.class);

If the types are somehow wrong, this will throw a decorated IllegalArgumentException rather than a bare ClassCastException.
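For context, the kind of helper being suggested looks roughly like this (a sketch of the pattern, not the actual InferenceInputs implementation):

class InferenceInputsSketch {
    // Sketch: a type-checked cast that fails with a descriptive IllegalArgumentException
    // naming both types, instead of letting a bare ClassCastException surface.
    <T> T castTo(Class<T> clazz) {
        if (clazz.isInstance(this) == false) {
            throw new IllegalArgumentException(
                "Unable to convert inference inputs of type [" + getClass().getSimpleName() + "] to [" + clazz.getSimpleName() + "]"
            );
        }
        return clazz.cast(this);
    }
}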

private static final String FUNCTION_TYPE = "function";

private final BiFunction<String, Exception, Exception> errorParser;
private final Deque<StreamingUnifiedChatCompletionResults.ChatCompletionChunk> buffer = new LinkedBlockingDeque<>();


We can delete the buffer code in this class - StreamingUnifiedChatCompletionResults now has a buffer internally (so we don't have to copy/paste the buffer code everywhere): elastic@b108e39

}
}

public void testUnifiedCompletionInfer_WithGoogleVertexAiModel() throws IOException {


This should actually go in GoogleVertexAiServiceTests


@Override
public String getWriteableName() {
return NAME;


This should be registered in InferenceNamedWriteablesProvider. We don't have any tests to verify this explicitly, so it's hard to know/verify, but it'll come up in multi-node clusters if one node calls another node to call Vertex AI.

We just added a test case to help verify this that you can extend if you want:
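For reference, registering it in InferenceNamedWriteablesProvider typically amounts to adding an entry like the one below; the settings class and the category class shown are placeholders for whichever class defines this getWriteableName:

import org.elasticsearch.common.io.stream.NamedWriteableRegistry;
import org.elasticsearch.inference.ServiceSettings;

import java.util.ArrayList;
import java.util.List;

class NamedWriteablesRegistrationSketch {
    // Sketch: register a reader under the same NAME returned by getWriteableName so
    // other nodes in a multi-node cluster can deserialize the object over the wire.
    static List<NamedWriteableRegistry.Entry> entries() {
        var namedWriteables = new ArrayList<NamedWriteableRegistry.Entry>();
        namedWriteables.add(
            new NamedWriteableRegistry.Entry(
                ServiceSettings.class,                             // assumed category class
                GoogleVertexAiChatCompletionServiceSettings.NAME,  // placeholder class name
                GoogleVertexAiChatCompletionServiceSettings::new   // reader from StreamInput
            )
        );
        return namedWriteables;
    }
}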

@lhoet-google (Collaborator, Author) commented May 19, 2025

@prwhelan thanks for all the feedback! We squashed all commits into a single one and made another PR here: elastic#128105. Will work on your comments on that PR.

@lhoet-google (Collaborator, Author)

Closing; this feature has been merged in elastic#128105.
