[ML] Support Gemini thinking budget in inference API #133599
Adding support for configuring the thinkingBudget for Gemini 2.5 models when creating chat completion inference endpoints. The thinking_budget field is nested inside the thinking_config object in service_settings.
- Added ThinkingConfig class to contain the thinking_budget field. This results in a less flat structure for the PUT _inference/chat_completion/ call but will make adding support for include_thoughts easier in future
- Added extractOptionalInteger() method to ServiceUtils
- Unit tests for ThinkingConfig class
- Updated existing tests to account for the new object and field
These changes enable elastic/kibana#227590 to be completed
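To illustrate the nested structure described above, here is a minimal Java sketch of how an optional thinking_budget might be read from a thinking_config map. It is not the code from this PR: the ThinkingConfigSketch class, its fromMap method, and the simplified extractOptionalInteger signature are assumptions made for illustration, and the real ServiceUtils helper may report validation errors differently.

```java
import java.util.Map;

// Hypothetical, simplified stand-in for the ThinkingConfig class described above.
final class ThinkingConfigSketch {
    static final String THINKING_CONFIG_FIELD = "thinking_config";
    static final String THINKING_BUDGET_FIELD = "thinking_budget";

    // null means no thinking budget was configured
    private final Integer thinkingBudget;

    private ThinkingConfigSketch(Integer thinkingBudget) {
        this.thinkingBudget = thinkingBudget;
    }

    Integer thinkingBudget() {
        return thinkingBudget;
    }

    // Assumed helper shape: returns null when the key is absent,
    // and rejects values that are not integers.
    static Integer extractOptionalInteger(Map<String, Object> map, String key) {
        Object value = map.get(key);
        if (value == null) {
            return null;
        }
        if (value instanceof Integer) {
            return (Integer) value;
        }
        throw new IllegalArgumentException("[" + key + "] must be an integer");
    }

    // Reads the nested thinking_config object, if present, from the settings map.
    @SuppressWarnings("unchecked")
    static ThinkingConfigSketch fromMap(Map<String, Object> settings) {
        Object nested = settings.get(THINKING_CONFIG_FIELD);
        if (nested == null) {
            return new ThinkingConfigSketch(null);
        }
        if (!(nested instanceof Map)) {
            throw new IllegalArgumentException("[" + THINKING_CONFIG_FIELD + "] must be an object");
        }
        Map<String, Object> config = (Map<String, Object>) nested;
        return new ThinkingConfigSketch(extractOptionalInteger(config, THINKING_BUDGET_FIELD));
    }
}
```

Keeping the budget inside its own object means a later field such as include_thoughts could be added to the same class without changing the outer settings shape.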
Pinging @elastic/ml-core (Team:ML)
Great work! I left a few comments.
...ava/org/elasticsearch/xpack/inference/services/googlevertexai/completion/ThinkingConfig.java
...nference/services/googlevertexai/completion/GoogleVertexAiChatCompletionServiceSettings.java
...icsearch/xpack/inference/services/googleaistudio/request/completion/ThinkingConfigTests.java
...e/services/googlevertexai/request/GoogleVertexAiUnifiedChatCompletionRequestEntityTests.java
.../plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/ServiceUtils.java
- Move transport version checks from ThinkingConfig to GoogleVertexAiChatCompletionServiceSettings
- Remove default value argument from ThinkingConfig.of()
- Add test coverage for ServiceUtils.extractOptionalPositiveInteger() and extractOptionalInteger()
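As a rough illustration of the difference between the two helpers named in the last item, the Java sketch below contrasts an optional integer extractor with a positive-only variant. The OptionalIntegerSketch class and these simplified signatures are hypothetical; the actual ServiceUtils methods take additional arguments and are not reproduced here.

```java
import java.util.Map;

// Hypothetical helpers; names mirror the ones in the commit message,
// but the signatures and error handling are simplified assumptions.
final class OptionalIntegerSketch {
    private OptionalIntegerSketch() {}

    // Returns null when the key is absent; accepts any integer, including 0 and negatives.
    static Integer extractOptionalInteger(Map<String, Object> map, String key) {
        Object value = map.get(key);
        if (value == null) {
            return null;
        }
        if (value instanceof Integer) {
            return (Integer) value;
        }
        throw new IllegalArgumentException("[" + key + "] must be an integer");
    }

    // Same as above, but additionally rejects values that are zero or negative.
    static Integer extractOptionalPositiveInteger(Map<String, Object> map, String key) {
        Integer value = extractOptionalInteger(map, key);
        if (value != null && value <= 0) {
            throw new IllegalArgumentException("[" + key + "] must be a positive integer");
        }
        return value;
    }
}
```

Tests for helpers like these would typically cover four cases: a valid value, a missing key, a non-integer value, and (for the positive variant) a zero or negative value.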
Just wanted to capture our discussion earlier this week. After thinking about this more I think it probably makes sense to move the thinking budget settings to the |
Looking great, just a few comments.
...k/inference/services/googlevertexai/completion/GoogleVertexAiChatCompletionTaskSettings.java
Great work!
Adding support for configuring the thinkingBudget for Gemini 2.5 models when creating chat completion inference endpoints. The thinking_budget field is nested inside the thinking_config object in service_settings.
These changes enable elastic/kibana#227590 to be completed
Specification PR: elastic/elasticsearch-specification#5257
Example usage: