forked from elastic/elasticsearch
Vertexai chatcompletion #1
Closed
42 commits
d429a31 VertexAI chat completion response entity with tests (lhoet-google)
00bfdb0 Modified build gradle to include google vertexai sdk (lhoet-google)
2378270 Google vertex ai chat completion model with tests (lhoet-google)
1f00974 Google vertex ai chat completion request with tests (lhoet-google)
970ab3c TransportVersion (lhoet-google)
5428074 ChatCompletion TaskSettings & ServiceSettings (lhoet-google)
ee44f22 ChatCompletionRequestManager & tests (lhoet-google)
8160c2b VertexAI Service and related classes. WIP & missing tests (lhoet-google)
ff68fbe VertexAi ChatCompletion task settings fix. (lhoet-google)
29c7093 JsonArrayParts event processor & parser (lhoet-google)
bfd75b0 AI Service and service tests (lhoet-google)
2ebfac9 Unified chat completion response and request handlers. Also working w… (lhoet-google)
679ea80 StreamingProcessor now support tools. Added more tests (lhoet-google)
e611cc3 More tests for streaming processor (lhoet-google)
87e428a Request entity tests (lhoet-google)
193d06d Google vertexai unified chat completion entity now accepting tools an… (lhoet-google)
813a2e8 Serializing function call message (lhoet-google)
f1ab8cc Response handler with tests (lhoet-google)
23c7d92 VertexAI chat completion req entity bugfixes (lhoet-google)
c45d23f Bugfix in vertex ai unified chat completion req entity (lhoet-google)
a820d83 Bugfix in vertex ai unified streaming processor (lhoet-google)
d2f09cf Removed google aiplatform sdk (lhoet-google)
bda94de Renamed file to match class name for JsonArrayPartsEventParser (lhoet-google)
5dee072 Updated rate limit settings for vertex ai (lhoet-google)
2f75788 Deleted GoogleVertexAiChatCompletionTaskSettings (lhoet-google)
b50c911 VertexAI Unified chat completion request tests (lhoet-google)
d6ae90f Fixed some tests (lhoet-google)
cbb387f Fixed GoogleAIService get configuration tests (lhoet-google)
7e1c970 GoogleVertexAiCompletion action tests (lhoet-google)
5ab716f Formatting (lhoet-google)
28aa464 Code style fix (lhoet-google)
2279391 Removed unnused variables (lhoet-google)
85af5c0 Function call id fixed (lhoet-google)
16c01b0 Bugfix (lhoet-google)
1732244 Merge branch 'main' into vertexai-chatcompletion (lhoet-google)
6cc165b Testfix (lhoet-google)
c020122 Unit tests (beltrangs)
7821d58 Merge branch 'vertexai-chatcompletion' into google-chat-completion-tests (beltrangs)
8633659 Update ElasticInferenceServiceTests.java (beltrangs)
06020cc Update GoogleVertexAiServiceTests.java (beltrangs)
5a2cfe5 Merge pull request #2 from beltrangslilly/google-chat-completion-tests (leo-hoet)
0cf1f3f Merge branch 'vertexai-chatcompletion' of github.com:lhoet-google/ela… (lhoet-google)
86 changes: 86 additions & 0 deletions
.../elasticsearch/xpack/inference/external/response/streaming/JsonArrayPartsEventParser.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| /* | ||
| * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| * or more contributor license agreements. Licensed under the Elastic License | ||
| * 2.0; you may not use this file except in compliance with the Elastic License | ||
| * 2.0. | ||
| */ | ||
| | ||
| package org.elasticsearch.xpack.inference.external.response.streaming; | ||
| | ||
| import java.io.ByteArrayOutputStream; | ||
| import java.io.IOException; | ||
| import java.io.UncheckedIOException; | ||
| import java.util.ArrayDeque; | ||
| import java.util.Arrays; | ||
| import java.util.Deque; | ||
| | ||
| /** | ||
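| * Parses a stream of bytes that form a JSON array, where each element of the array | ||
| * is a JSON object. This parser extracts each complete JSON object from the array | ||
| * and emits it as a byte array. | ||
| * Braces are counted without regard to JSON string boundaries, so a brace inside a | ||
| * string value is treated as structural. | ||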
| * | ||
| * Example of an expected stream: | ||
| * Chunk 1: [{"key":"val1"} | ||
| * Chunk 2: ,{"key2":"val2"} | ||
| * Chunk 3: ,{"key3":"val3"}, {"some":"object"}] | ||
| * | ||
| * This parser would emit four byte arrays, with data: | ||
| * 1. {"key":"val1"} | ||
| * 2. {"key2":"val2"} | ||
| * 3. {"key3":"val3"} | ||
| * 4. {"some":"object"} | ||
| */ | ||
| public class JsonArrayPartsEventParser { | ||
| | ||
| // Buffer to hold bytes from the previous call if they formed an incomplete JSON object. | ||
| private final ByteArrayOutputStream incompletePart = new ByteArrayOutputStream(); | ||
| | ||
| public Deque<byte[]> parse(byte[] newBytes) { | ||
| if (newBytes == null || newBytes.length == 0) { | ||
| return new ArrayDeque<>(0); | ||
| } | ||
| | ||
| ByteArrayOutputStream currentStream = new ByteArrayOutputStream(); | ||
| try { | ||
| currentStream.write(incompletePart.toByteArray()); | ||
| currentStream.write(newBytes); | ||
| } catch (IOException e) { | ||
| throw new UncheckedIOException("Error handling byte array streams", e); | ||
| } | ||
| incompletePart.reset(); | ||
| | ||
| byte[] dataToProcess = currentStream.toByteArray(); | ||
| return parseInternal(dataToProcess); | ||
| } | ||
| | ||
| private Deque<byte[]> parseInternal(byte[] data) { | ||
| int localBraceLevel = 0; | ||
| int objectStartIndex = -1; | ||
| Deque<byte[]> completedObjects = new ArrayDeque<>(); | ||
| | ||
| for (int i = 0; i < data.length; i++) { | ||
| char c = (char) data[i]; | ||
| | ||
| if (c == '{') { | ||
| if (localBraceLevel == 0) { | ||
| objectStartIndex = i; | ||
| } | ||
| localBraceLevel++; | ||
| } else if (c == '}') { | ||
| if (localBraceLevel > 0) { | ||
| localBraceLevel--; | ||
| if (localBraceLevel == 0) { | ||
| byte[] jsonObject = Arrays.copyOfRange(data, objectStartIndex, i + 1); | ||
| completedObjects.offer(jsonObject); | ||
| objectStartIndex = -1; | ||
| } | ||
| } | ||
| } | ||
| } | ||
| | ||
| // An object was still open when the data ended: keep its bytes for the next call. | ||
| if (localBraceLevel > 0) { | ||
| incompletePart.write(data, objectStartIndex, data.length - objectStartIndex); | ||
| } | ||
| return completedObjects; | ||
| } | ||
| } |
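
The parser above counts `{` and `}` bytes without tracking JSON strings, so a brace inside a string value (for example `{"text":"a } b"}`) would shift the depth. The following is a minimal sketch of a string-aware variant, not code from this pull request; the class name `StringAwareJsonArraySplitter` is hypothetical, and it assumes UTF-8 input, where the bytes for `{`, `}`, `"` and `\` never occur inside a multi-byte character.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical string-aware variant of the brace-counting splitter; not part of this PR.
class StringAwareJsonArraySplitter {

    // Bytes of an object that was still open when the previous chunk ended.
    private final ByteArrayOutputStream incompletePart = new ByteArrayOutputStream();

    Deque<byte[]> parse(byte[] newBytes) {
        Deque<byte[]> completed = new ArrayDeque<>();
        if (newBytes == null || newBytes.length == 0) {
            return completed;
        }
        // Append to the carried-over bytes so scanning restarts at the open object's first brace.
        incompletePart.write(newBytes, 0, newBytes.length);
        byte[] data = incompletePart.toByteArray();
        incompletePart.reset();

        int depth = 0;
        int start = -1;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < data.length; i++) {
            byte b = data[i];
            if (inString) {
                // Inside a string only an unescaped quote matters; braces are plain text.
                if (escaped) {
                    escaped = false;
                } else if (b == '\\') {
                    escaped = true;
                } else if (b == '"') {
                    inString = false;
                }
                continue;
            }
            if (b == '"' && depth > 0) {
                inString = true;
            } else if (b == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (b == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    completed.offer(Arrays.copyOfRange(data, start, i + 1));
                    start = -1;
                }
            }
        }
        if (depth > 0) {
            // Keep the unfinished object for the next call.
            incompletePart.write(data, start, data.length - start);
        }
        return completed;
    }
}
```

Because the unfinished object is re-scanned from its opening brace on the next call, the string and escape flags do not need to be stored between calls.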
37 changes: 37 additions & 0 deletions
...asticsearch/xpack/inference/external/response/streaming/JsonArrayPartsEventProcessor.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| /* | ||
| * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| * or more contributor license agreements. Licensed under the Elastic License | ||
| * 2.0; you may not use this file except in compliance with the Elastic License | ||
| * 2.0. | ||
| */ | ||
| | ||
| package org.elasticsearch.xpack.inference.external.response.streaming; | ||
| | ||
| import org.elasticsearch.xpack.inference.common.DelegatingProcessor; | ||
| import org.elasticsearch.xpack.inference.external.http.HttpResult; | ||
| | ||
| import java.util.Deque; | ||
| | ||
| public class JsonArrayPartsEventProcessor extends DelegatingProcessor<HttpResult, Deque<byte[]>> { | ||
| private final JsonArrayPartsEventParser jsonArrayPartsEventParser; | ||
| | ||
| public JsonArrayPartsEventProcessor(JsonArrayPartsEventParser jsonArrayPartsEventParser) { | ||
| this.jsonArrayPartsEventParser = jsonArrayPartsEventParser; | ||
| } | ||
| | ||
| @Override | ||
| public void next(HttpResult item) { | ||
| if (item.isBodyEmpty()) { | ||
| upstream().request(1); | ||
| return; | ||
| } | ||
| | ||
| var response = jsonArrayPartsEventParser.parse(item.body()); | ||
| if (response.isEmpty()) { | ||
| upstream().request(1); | ||
| return; | ||
| } | ||
| | ||
| downstream().onNext(response); | ||
| } | ||
| } |
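
The processor above either forwards parsed objects downstream or, when a chunk yields nothing complete, asks upstream for one more item. A minimal sketch of that rule follows, built on the JDK's `java.util.concurrent.Flow` interfaces as a stand-in, since `DelegatingProcessor` and `HttpResult` are not shown on this page. `EmitOrRequestMoreProcessor` and its constructor argument are hypothetical names.

```java
import java.util.Deque;
import java.util.concurrent.Flow;
import java.util.function.Function;

// Hypothetical stand-in built on java.util.concurrent.Flow; it is not the project's DelegatingProcessor.
class EmitOrRequestMoreProcessor implements Flow.Processor<byte[], Deque<byte[]>> {

    private final Function<byte[], Deque<byte[]>> parser;
    private Flow.Subscription upstream;
    private Flow.Subscriber<? super Deque<byte[]>> downstream;

    EmitOrRequestMoreProcessor(Function<byte[], Deque<byte[]>> parser) {
        this.parser = parser;
    }

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.upstream = subscription;
    }

    @Override
    public void subscribe(Flow.Subscriber<? super Deque<byte[]>> subscriber) {
        this.downstream = subscriber;
        // Downstream demand is passed straight through to the upstream subscription.
        subscriber.onSubscribe(new Flow.Subscription() {
            @Override
            public void request(long n) {
                if (upstream != null) {
                    upstream.request(n);
                }
            }

            @Override
            public void cancel() {
                if (upstream != null) {
                    upstream.cancel();
                }
            }
        });
    }

    @Override
    public void onNext(byte[] body) {
        // An empty body cannot complete an object, so ask for the next chunk.
        if (body.length == 0) {
            upstream.request(1);
            return;
        }
        Deque<byte[]> parts = parser.apply(body);
        // Nothing complete yet: the downstream request is still unsatisfied, so pull one more chunk.
        if (parts.isEmpty()) {
            upstream.request(1);
            return;
        }
        downstream.onNext(parts);
    }

    @Override
    public void onError(Throwable throwable) {
        downstream.onError(throwable);
    }

    @Override
    public void onComplete() {
        downstream.onComplete();
    }
}
```

Requesting exactly one more item keeps the downstream demand unchanged: the subscriber asked for one batch and has not received it yet, so the processor keeps pulling until a batch is available.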
73 changes: 73 additions & 0 deletions
...earch/xpack/inference/services/googlevertexai/GoogleVertexAiCompletionRequestManager.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| /* | ||
| * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| * or more contributor license agreements. Licensed under the Elastic License | ||
| * 2.0; you may not use this file except in compliance with the Elastic License | ||
| * 2.0. | ||
| */ | ||
| | ||
| package org.elasticsearch.xpack.inference.services.googlevertexai; | ||
| | ||
| import org.apache.logging.log4j.LogManager; | ||
| import org.apache.logging.log4j.Logger; | ||
| import org.elasticsearch.action.ActionListener; | ||
| import org.elasticsearch.inference.InferenceServiceResults; | ||
| import org.elasticsearch.threadpool.ThreadPool; | ||
| import org.elasticsearch.xpack.inference.external.http.retry.RequestSender; | ||
| import org.elasticsearch.xpack.inference.external.http.retry.ResponseHandler; | ||
| import org.elasticsearch.xpack.inference.external.http.sender.ExecutableInferenceRequest; | ||
| import org.elasticsearch.xpack.inference.external.http.sender.InferenceInputs; | ||
| import org.elasticsearch.xpack.inference.external.http.sender.UnifiedChatInput; | ||
| import org.elasticsearch.xpack.inference.services.googlevertexai.completion.GoogleVertexAiChatCompletionModel; | ||
| import org.elasticsearch.xpack.inference.services.googlevertexai.request.GoogleVertexAiUnifiedChatCompletionRequest; | ||
| import org.elasticsearch.xpack.inference.services.googlevertexai.response.GoogleVertexAiChatCompletionResponseEntity; | ||
| | ||
| import java.util.Objects; | ||
| import java.util.function.Supplier; | ||
| | ||
| public class GoogleVertexAiCompletionRequestManager extends GoogleVertexAiRequestManager { | ||
| | ||
| private static final Logger logger = LogManager.getLogger(GoogleVertexAiCompletionRequestManager.class); | ||
| | ||
| private static final ResponseHandler HANDLER = createGoogleVertexAiResponseHandler(); | ||
| | ||
| private static ResponseHandler createGoogleVertexAiResponseHandler() { | ||
| return new GoogleVertexAiUnifiedChatCompletionResponseHandler( | ||
| "Google Vertex AI chat completion", | ||
| GoogleVertexAiChatCompletionResponseEntity::fromResponse | ||
| ); | ||
| } | ||
| | ||
| private final GoogleVertexAiChatCompletionModel model; | ||
| | ||
| public GoogleVertexAiCompletionRequestManager(GoogleVertexAiChatCompletionModel model, ThreadPool threadPool) { | ||
| super(threadPool, model, RateLimitGrouping.of(model)); | ||
| this.model = model; | ||
| } | ||
| | ||
| record RateLimitGrouping(int projectIdHash) { | ||
| public static RateLimitGrouping of(GoogleVertexAiChatCompletionModel model) { | ||
| Objects.requireNonNull(model); | ||
| return new RateLimitGrouping(model.rateLimitServiceSettings().projectId().hashCode()); | ||
| } | ||
| } | ||
| | ||
| public static GoogleVertexAiCompletionRequestManager of(GoogleVertexAiChatCompletionModel model, ThreadPool threadPool) { | ||
| Objects.requireNonNull(model); | ||
| Objects.requireNonNull(threadPool); | ||
| | ||
| return new GoogleVertexAiCompletionRequestManager(model, threadPool); | ||
| } | ||
| | ||
| @Override | ||
| public void execute( | ||
| InferenceInputs inferenceInputs, | ||
| RequestSender requestSender, | ||
| Supplier<Boolean> hasRequestCompletedFunction, | ||
| ActionListener<InferenceServiceResults> listener | ||
| ) { | ||
| | ||
| var chatInputs = (UnifiedChatInput) inferenceInputs; | ||
| var request = new GoogleVertexAiUnifiedChatCompletionRequest(chatInputs, model); | ||
| execute(new ExecutableInferenceRequest(requestSender, logger, request, HANDLER, hasRequestCompletedFunction, listener)); | ||
| } | ||
| } | ||
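
`RateLimitGrouping` above keys rate limiting on the hash of the project id, so every model that points at the same project shares one group. A minimal sketch of how such a record behaves as a map key follows; `RateLimitGroupingSketch`, `Grouping`, and `countByGroup` are hypothetical names, and a plain `String` stands in for the model's service settings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a rate-limit grouping keyed by a project id hash.
class RateLimitGroupingSketch {

    // Records derive equals() and hashCode() from their components,
    // so two groupings with the same hash are the same map key.
    record Grouping(int projectIdHash) {
        static Grouping of(String projectId) {
            Objects.requireNonNull(projectId);
            return new Grouping(projectId.hashCode());
        }
    }

    // Counts how many requests fall into each group.
    static Map<Grouping, Integer> countByGroup(String... projectIds) {
        Map<Grouping, Integer> counts = new HashMap<>();
        for (String projectId : projectIds) {
            counts.merge(Grouping.of(projectId), 1, Integer::sum);
        }
        return counts;
    }
}
```

One consequence of storing only the hash is that two distinct project ids whose `String.hashCode()` values collide (for example `"Aa"` and `"BB"`) would land in the same group.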
If the types are somehow wrong, this will throw a decorated `IllegalArgumentException` rather than the `ClassCastException`.
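
A minimal sketch of the kind of checked cast this comment describes, as opposed to a plain `(Type) value` cast. `Inputs`, `ChatInput`, `EmbeddingsInput`, and `castTo` are hypothetical stand-ins defined here for illustration, not the project's actual classes.

```java
// Hypothetical types illustrating a checked cast that reports a descriptive error.
class CheckedCastSketch {

    interface Inputs {
        // Returns this object as the requested type, or throws an
        // IllegalArgumentException that names both types.
        default <T extends Inputs> T castTo(Class<T> type) {
            if (type.isInstance(this) == false) {
                throw new IllegalArgumentException(
                    "Unable to convert inputs of type [" + getClass().getSimpleName() + "] to [" + type.getSimpleName() + "]"
                );
            }
            return type.cast(this);
        }
    }

    record ChatInput(String text) implements Inputs {}

    record EmbeddingsInput(String text) implements Inputs {}
}
```

With a helper like this, a wrong input type surfaces as an `IllegalArgumentException` whose message names the actual and expected types, which is usually easier to diagnose than a bare `ClassCastException`.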