Commit 7abbaf0

maxhniebergall and elasticsearchmachine authored
[8.x] [Inference API] Add unified api for chat completions (#117589) (#118772)
* [Inference API] Add unified api for chat completions (#117589)

* Adding some shell classes
* modeling the request objects
* Writeable changes to schema
* Working parsing tests
* Creating a new action
* Add outbound request writing (WIP)
* Improvements to request serialization
* Adding separate transport classes
* separate out unified request and combine inputs
* Reworking unified inputs
* Adding unsupported operation calls
* Fixing parsing logic
* get the build working
* Update docs/changelog/117589.yaml
* Fixing injection issue
* Allowing model to be overridden but not working yet
* Fixing issues
* Switch field name for tool
* Add support for toolCalls and refusal in streaming completion
* Working tool call response
* Separate unified and legacy code paths
* Updated the parser, but there are some class cast exceptions to fix
* Refactoring tests and request entities
* Parse response from OpenAI
* Removing unused request classes
* precommit
* Adding tests for UnifiedCompletionAction Request
* Refactoring stop to be a list of strings
* Testing for OpenAI response parsing
* Refactoring transport action tests to test unified validation code
* Fixing various tests
* Fixing license header
* Reformat streaming results
* Finalize response format
* remove debug logs
* remove changes for debugging
* Task type and base inference action tests
* Adding openai service tests
* Adding model tests
* tests for StreamingUnifiedChatCompletionResultsTests toXContentChunked
* Fixing change log and removing commented out code
* Switch usage to accept null
* Adding test for TestStreamingCompletionServiceExtension
* Avoid serializing empty lists + request entity tests
* Register named writeables from UnifiedCompletionRequest
* Removing commented code
* Clean up and add more of an explanation
* remove duplicate test
* remove old todos
* Refactoring some duplication
* Adding javadoc
* Addressing feedback

---------

Co-authored-by: Jonathan Buttner <[email protected]>
Co-authored-by: Jonathan Buttner <[email protected]>

(cherry picked from commit 467fdb8)

# Conflicts:
#	x-pack/plugin/inference/qa/inference-service-tests/src/javaRestTest/java/org/elasticsearch/xpack/inference/InferenceCrudIT.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/action/TransportInferenceAction.java

* fix merge conflicts
* formatting
* Remove tests - retain feature flag
* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <[email protected]>
1 parent 0a599f9 commit 7abbaf0

106 files changed: +5500 -875 lines changed

docs/changelog/117589.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 117589
+summary: "Add Inference Unified API for chat completions for OpenAI"
+area: Machine Learning
+type: enhancement
+issues: []

server/src/main/java/org/elasticsearch/common/xcontent/ChunkedToXContentHelper.java

Lines changed: 9 additions & 0 deletions
@@ -12,6 +12,7 @@
 import org.elasticsearch.common.collect.Iterators;
 import org.elasticsearch.xcontent.ToXContent;
 
+import java.util.Collections;
 import java.util.Iterator;
 
 public enum ChunkedToXContentHelper {
@@ -53,6 +54,14 @@ public static Iterator<ToXContent> field(String name, String value) {
         return Iterators.single(((builder, params) -> builder.field(name, value)));
     }
 
+    public static Iterator<ToXContent> optionalField(String name, String value) {
+        if (value == null) {
+            return Collections.emptyIterator();
+        } else {
+            return field(name, value);
+        }
+    }
+
     /**
      * Creates an Iterator of a single ToXContent object that serializes the given object as a single chunk. Just wraps {@link
      * Iterators#single}, but still useful because it avoids any type ambiguity.
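
The `optionalField` helper added above returns an empty iterator for a null value, so a null field contributes no chunk when the pieces of a response are concatenated. The following is a minimal sketch of that behaviour, not the Elasticsearch implementation: `OptionalFieldSketch`, its `render` method, and the use of plain `String` chunks in place of `ToXContent` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in: chunks are plain Strings here, not ToXContent objects.
final class OptionalFieldSketch {

    // Mirrors field(name, value): always yields exactly one chunk.
    static Iterator<String> field(String name, String value) {
        return Collections.singletonList(name + "=" + value).iterator();
    }

    // Mirrors optionalField(name, value): yields no chunk when value is null.
    static Iterator<String> optionalField(String name, String value) {
        if (value == null) {
            return Collections.<String>emptyIterator();
        } else {
            return field(name, value);
        }
    }

    // Hypothetical helper: drains the given iterators in order into one list,
    // standing in for the concatenation of chunked output.
    @SafeVarargs
    static List<String> render(Iterator<String>... parts) {
        List<String> out = new ArrayList<>();
        for (Iterator<String> part : parts) {
            part.forEachRemaining(out::add);
        }
        return out;
    }
}
```

In this sketch, rendering a non-null field, a null optional field, and a non-null optional field yields two chunks, with nothing emitted for the null value.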

server/src/main/java/org/elasticsearch/inference/InferenceService.java

Lines changed: 17 additions & 0 deletions
@@ -112,6 +112,23 @@ void infer(
     );
 
     /**
+     * Perform completion inference on the model using the unified schema.
+     *
+     * @param model The model
+     * @param request Parameters for the request
+     * @param timeout The timeout for the request
+     * @param listener Inference result listener
+     */
+    void unifiedCompletionInfer(
+        Model model,
+        UnifiedCompletionRequest request,
+        TimeValue timeout,
+        ActionListener<InferenceServiceResults> listener
+    );
+
+    /**
+     * Chunk long text.
+     *
      * @param model The model
      * @param query Inference query, mainly for re-ranking
      * @param input Inference input
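
The new `unifiedCompletionInfer` method returns `void` and delivers its outcome through the listener argument. The sketch below illustrates that callback contract under heavy assumptions: `ListenerSketch`, `ServiceSketch`, and `UnsupportedServiceSketch` are hypothetical stand-ins, `String` replaces `Model`, `UnifiedCompletionRequest`, and `InferenceServiceResults`, and a `long` replaces `TimeValue`. Reporting "unsupported" through `onFailure` is one possible approach, not necessarily what this commit does.

```java
// Hypothetical stand-in for ActionListener<T>.
interface ListenerSketch<T> {
    void onResponse(T result);

    void onFailure(Exception e);
}

// Hypothetical stand-in for the relevant part of InferenceService.
interface ServiceSketch {
    // Same parameter order as unifiedCompletionInfer: model, request, timeout, listener.
    void unifiedCompletionInfer(String model, String request, long timeoutMillis, ListenerSketch<String> listener);
}

// A service without chat completion support could signal that via the listener
// instead of returning a value (assumption for illustration).
final class UnsupportedServiceSketch implements ServiceSketch {
    @Override
    public void unifiedCompletionInfer(String model, String request, long timeoutMillis, ListenerSketch<String> listener) {
        listener.onFailure(new UnsupportedOperationException("unified completion not supported for model [" + model + "]"));
    }
}
```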

server/src/main/java/org/elasticsearch/inference/TaskType.java

Lines changed: 4 additions & 0 deletions
@@ -38,6 +38,10 @@ public static TaskType fromString(String name) {
     }
 
     public static TaskType fromStringOrStatusException(String name) {
+        if (name == null) {
+            throw new ElasticsearchStatusException("Task type must not be null", RestStatus.BAD_REQUEST);
+        }
+
         try {
             TaskType taskType = TaskType.fromString(name);
             return Objects.requireNonNull(taskType);
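
The null guard added above makes a missing task type surface as a bad-request error before any parsing is attempted. The following self-contained sketch shows the pattern with stand-ins: `BadRequestException` is a hypothetical replacement for `ElasticsearchStatusException` with `RestStatus.BAD_REQUEST`, the enum constants are an arbitrary illustrative subset, and the bodies of `fromString` and the `catch` branch are assumptions, since the diff does not show them.

```java
import java.util.Locale;

// Hypothetical stand-in for ElasticsearchStatusException carrying RestStatus.BAD_REQUEST.
final class BadRequestException extends RuntimeException {
    final int status = 400;

    BadRequestException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for TaskType; the constants are illustrative only.
enum TaskTypeSketch {
    COMPLETION,
    RERANK;

    // Assumed parsing logic: a null name would raise a NullPointerException here.
    static TaskTypeSketch fromString(String name) {
        return valueOf(name.trim().toUpperCase(Locale.ROOT));
    }

    static TaskTypeSketch fromStringOrStatusException(String name) {
        // The guard from the diff: reject null before parsing.
        if (name == null) {
            throw new BadRequestException("Task type must not be null");
        }
        try {
            return fromString(name);
        } catch (IllegalArgumentException e) {
            // Assumed handling of unknown names; not shown in the diff.
            throw new BadRequestException("Unknown task type [" + name + "]");
        }
    }
}
```

Without the guard, a null name in this sketch would escape as a `NullPointerException` from `fromString`, because the `catch` clause only handles `IllegalArgumentException`.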
