-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Implement OpenShift AI integration for chat completion, embeddings, and reranking #136624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement OpenShift AI integration for chat completion, embeddings, and reranking #136624
Conversation
… names and add changelog
…t AI chat completion
…ogic and update dimensionsSetByUser handling
…ax tokens and add unit tests for request creation and validation
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.3.csv
…ServiceTests for improved readability and accuracy
|
Hello @jonathan-buttner @dan-rubinstein @DonalEvans |
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.3.csv
…larity in OpenShift AI integration
|
Hi @DonalEvans |
DonalEvans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few small changes, thanks for addressing all of the other comments!
| } | ||
|
|
||
| public void testParseRequestConfig_CreatesAnEmbeddingsModelWhenChunkingSettingsProvided() throws IOException { | ||
| var chunkingSettingsMap = createRandomChunkingSettings(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This object is an instance of ChunkingSettings rather than a Map so the name is a little misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. Fixed.
...est/java/org/elasticsearch/xpack/inference/services/openshiftai/OpenShiftAiServiceTests.java
Show resolved
Hide resolved
| ); | ||
| var action = actionCreator.create( | ||
| model, | ||
| new HashMap<>(Map.of(OpenShiftAiRerankTaskSettings.RETURN_DOCUMENTS, false, OpenShiftAiRerankTaskSettings.TOP_N, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the values of false and 1 be extracted to local variables here so they can be reused on line 698?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to the constants with understandable names.
|
|
||
| PlainActionFuture<InferenceServiceResults> listener = new PlainActionFuture<>(); | ||
| action.execute( | ||
| new QueryAndDocsInputs(QUERY_VALUE, DOCUMENTS_VALUE, true, 2, false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The values true and 2 can be extracted to local variables and reused on line 805.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to the constants with understandable names.
|
|
||
| PlainActionFuture<InferenceServiceResults> listener = new PlainActionFuture<>(); | ||
| action.execute( | ||
| new QueryAndDocsInputs(QUERY_VALUE, documents, false, 1, false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
false and 1 can be extracted to local variables and reused on line 849.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to the constants with understandable names.
Also added test case for absent model, removed redundant documents local variables in favor of constant values, removed forbidden String.formatted API calls.
|
@DonalEvans your comments are addressed. PR is ready to be reviewed once more. Also could you please add yourself as assignee and add appropriate enhancement type label |
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.3.csv
DonalEvans
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, was a bit premature in my approval, the InferenceGetServicesIT class needs to be updated to include OpenShift AI in the appropriate lists of expected services.
|
Thanks @DonalEvans. Added |
There's a known issue causing failures in serverless tests right now, the test in question should have been muted, but it might take a little while for the checks to start passing. |
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.3.csv
…-json * upstream/main: (158 commits) Cleanup files from repo root folder (elastic#138030) Implement OpenShift AI integration for chat completion, embeddings, and reranking (elastic#136624) Optimize AsyncSearchErrorTraceIT to avoid failures (elastic#137716) Removes support for null TransportService in RemoteClusterService (elastic#137939) Mute org.elasticsearch.index.mapper.DateFieldMapperTests testSortShortcuts elastic#138018 rest-api-spec: fix type of enums (elastic#137521) Update Gradle wrapper to 9.2.0 (elastic#136155) Add RCS Strong Verification Documentation (elastic#137822) Use docvalue skippers on dimension fields (elastic#137029) Introduce INDEX_SHARD_COUNT_FORMAT (elastic#137210) Mute org.elasticsearch.xpack.inference.integration.AuthorizationTaskExecutorIT testCreatesChatCompletion_AndThenCreatesTextEmbedding elastic#138012 Fix ES|QL search context creation to use correct results type (elastic#137994) Improve Snapshot Logging (elastic#137470) Support extra output field in TOP function (elastic#135434) Remove NumericDoubleValues class (elastic#137884) [ML] Fix ML calendar event update scalability issues (elastic#136886) Task may be unregistered outside of the trace context in exceptional cases. (elastic#137865) Refine workaround for S3 repo analysis known issue (elastic#138000) Additional DEBUG logging on authc failures (elastic#137941) Cleanup index resolution (elastic#137867) ...
Creation of new OpenShift AI inference provider integration allowing
tasks to be executed as part of inference API with openshiftai provider.
Changes were tested locally against next models:
Test results:
EMBEDDINGS
Create Embeddings Endpoint
Create Embeddings Endpoint (404)
Perform Embeddings
COMPLETION
Create Completion Endpoint
Create Completion Endpoint (Redirection error)
Perform Non-Streaming Completion
Perform Streaming Completion
CHAT COMPLETION
Create Chat Completion Endpoint
Perform Basic Chat Completion
Perform Tool Call Chat Completion
RERANK
Create Rerank Endpoint
Perform Rerank
gradle check?