Feat: add gpt 5 models and verbosity param #4086


Merged · 1 commit · Aug 15, 2025
@@ -196,6 +196,15 @@ public class OpenAiChatOptions implements ToolCallingChatOptions {
*/
private @JsonProperty("reasoning_effort") String reasoningEffort;

/**
 * verbosity: string or null
 * Optional - Defaults to medium
 * Constrains the verbosity of the model's response. Lower values will result in more concise responses, while higher values will result in more verbose responses.
 * Currently supported values are low, medium, and high.
 */
private @JsonProperty("verbosity") String verbosity;
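As an illustration (not part of the diff), the new field serializes as a top-level `verbosity` key in the Chat Completions request body. The sketch below hand-builds a minimal body to show the resulting JSON shape; the model name and prompt are arbitrary examples, and the field names simply mirror the `@JsonProperty` annotations above:

```java
// Illustrative sketch only: hand-builds a minimal Chat Completions request
// body to show where the new "verbosity" field lands on the wire.
public class VerbosityBodySketch {

    // Assembles a minimal JSON body; field names mirror the
    // @JsonProperty annotations in OpenAiChatOptions.
    static String body(String model, String prompt, String verbosity) {
        return "{"
                + "\"model\":\"" + model + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + prompt + "\"}],"
                + "\"verbosity\":\"" + verbosity + "\""
                + "}";
    }

    public static void main(String[] args) {
        // Prints: {"model":"gpt-5-nano","messages":[{"role":"user","content":"Hello"}],"verbosity":"low"}
        System.out.println(body("gpt-5-nano", "Hello", "low"));
    }
}
```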

/**
* This tool searches the web for relevant results to use in a response.
*/
@@ -268,6 +277,7 @@ public static OpenAiChatOptions fromOptions(OpenAiChatOptions fromOptions) {
.metadata(fromOptions.getMetadata())
.reasoningEffort(fromOptions.getReasoningEffort())
.webSearchOptions(fromOptions.getWebSearchOptions())
.verbosity(fromOptions.getVerbosity())
.build();
}

@@ -564,6 +574,14 @@ public void setWebSearchOptions(WebSearchOptions webSearchOptions) {
this.webSearchOptions = webSearchOptions;
}

public String getVerbosity() {
return this.verbosity;
}

public void setVerbosity(String verbosity) {
this.verbosity = verbosity;
}

@Override
public OpenAiChatOptions copy() {
return OpenAiChatOptions.fromOptions(this);
@@ -609,7 +627,8 @@ public boolean equals(Object o) {
&& Objects.equals(this.outputAudio, other.outputAudio) && Objects.equals(this.store, other.store)
&& Objects.equals(this.metadata, other.metadata)
&& Objects.equals(this.reasoningEffort, other.reasoningEffort)
&& Objects.equals(this.webSearchOptions, other.webSearchOptions);
&& Objects.equals(this.webSearchOptions, other.webSearchOptions)
&& Objects.equals(this.verbosity, other.verbosity);
}

@Override
@@ -802,6 +821,11 @@ public Builder webSearchOptions(WebSearchOptions webSearchOptions) {
return this;
}

public Builder verbosity(String verbosity) {
this.options.verbosity = verbosity;
return this;
}

public OpenAiChatOptions build() {
return this.options;
}
@@ -482,18 +482,37 @@ public enum ChatModel implements ChatModelDescription {
GPT_5("gpt-5"),

/**
* <b>GPT-5 (2025-08-07)</b> is a specific snapshot of the GPT-5 model from August
* 7, 2025, providing enhanced capabilities for complex reasoning and
* problem-solving tasks.
* GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for
* well-defined tasks and precise prompts.
* <p>
* Note: GPT-5 models require temperature=1.0 (default value). Custom temperature
* values are not supported and will cause errors.
Contributor: Is this not the case anymore? In that case, we need to update the docs changed in this commit: https://github.com/spring-projects/spring-ai/pull/4068/files

Contributor Author: Thank you for pointing this one out, @sobychacko. It is the case for all GPT-5 models, but not the gpt-5-chat model: https://community.openai.com/t/temperature-in-gpt-5-models/1337133/25
I have updated the documentation and added tests as well.

Contributor: Does that mean the GPT-5 models default to a temperature of 1.0? If so, I think we should make that clear in the docs or Javadocs. From what I hear from you, the GPT-5 models completely ignore temperature and internally default to a value of 1.0. Is that correct?

Contributor Author: Yes, that is correct. Azure also mentions more parameters that are not supported, but I am not sure that information belongs in the Spring AI documentation, since it is a limitation of the LLM itself rather than of the framework.

The following are currently unsupported with reasoning models:
temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs, logit_bias, max_tokens

https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/reasoning?tabs=gpt-5%2Cpython-secure%2Cpy#not-supported

* Model ID: gpt-5-mini
* <p>
* See:
* <a href="https://platform.openai.com/docs/models/gpt-5-mini">gpt-5-mini</a>
*/
GPT_5_MINI("gpt-5-mini"),

/**
* GPT-5 Nano is the fastest, cheapest version of GPT-5. It's great for
* summarization and classification tasks.
* <p>
* Model ID: gpt-5-2025-08-07
* Model ID: gpt-5-nano
* <p>
* See: <a href="https://platform.openai.com/docs/models/gpt-5">gpt-5</a>
* See:
* <a href="https://platform.openai.com/docs/models/gpt-5-nano">gpt-5-nano</a>
*/
GPT_5_NANO("gpt-5-nano"),

/**
* GPT-5 Chat points to the GPT-5 snapshot currently used in ChatGPT. GPT-5
* accepts both text and image inputs, and produces text outputs.
* <p>
* Model ID: gpt-5-chat-latest
* <p>
* See: <a href=
* "https://platform.openai.com/docs/models/gpt-5-chat-latest">gpt-5-chat-latest</a>
*/
GPT_5_2025_08_07("gpt-5-2025-08-07"),
GPT_5_CHAT_LATEST("gpt-5-chat-latest"),

/**
* <b>GPT-4o</b> (“o” for “omni”) is the versatile, high-intelligence flagship
@@ -1064,6 +1083,7 @@ public enum OutputModality {
* Currently supported values are low, medium, and high. Reducing reasoning effort can
* result in faster responses and fewer tokens used on reasoning in a response.
* @param webSearchOptions Options for web search.
* @param verbosity Controls the verbosity of the model's response.
*/
@JsonInclude(Include.NON_NULL)
public record ChatCompletionRequest(// @formatter:off
@@ -1094,7 +1114,8 @@ public record ChatCompletionRequest(// @formatter:off
@JsonProperty("parallel_tool_calls") Boolean parallelToolCalls,
@JsonProperty("user") String user,
@JsonProperty("reasoning_effort") String reasoningEffort,
@JsonProperty("web_search_options") WebSearchOptions webSearchOptions) {
@JsonProperty("web_search_options") WebSearchOptions webSearchOptions,
@JsonProperty("verbosity") String verbosity) {

/**
* Shortcut constructor for a chat completion request with the given messages, model and temperature.
@@ -1106,7 +1127,7 @@ public record ChatCompletionRequest(// @formatter:off
public ChatCompletionRequest(List<ChatCompletionMessage> messages, String model, Double temperature) {
this(messages, model, null, null, null, null, null, null, null, null, null, null, null, null, null,
null, null, null, false, null, temperature, null,
null, null, null, null, null, null);
null, null, null, null, null, null, null);
}

/**
@@ -1120,7 +1141,7 @@ public ChatCompletionRequest(List<ChatCompletionMessage> messages, String model,
this(messages, model, null, null, null, null, null, null,
null, null, null, List.of(OutputModality.AUDIO, OutputModality.TEXT), audio, null, null,
null, null, null, stream, null, null, null,
null, null, null, null, null, null);
null, null, null, null, null, null, null);
}

/**
@@ -1135,7 +1156,7 @@ public ChatCompletionRequest(List<ChatCompletionMessage> messages, String model,
public ChatCompletionRequest(List<ChatCompletionMessage> messages, String model, Double temperature, boolean stream) {
this(messages, model, null, null, null, null, null, null, null, null, null,
null, null, null, null, null, null, null, stream, null, temperature, null,
null, null, null, null, null, null);
null, null, null, null, null, null, null);
}

/**
@@ -1151,7 +1172,7 @@ public ChatCompletionRequest(List<ChatCompletionMessage> messages, String model,
List<FunctionTool> tools, Object toolChoice) {
this(messages, model, null, null, null, null, null, null, null, null, null,
null, null, null, null, null, null, null, false, null, 0.8, null,
tools, toolChoice, null, null, null, null);
tools, toolChoice, null, null, null, null, null);
}

/**
@@ -1164,7 +1185,7 @@ public ChatCompletionRequest(List<ChatCompletionMessage> messages, String model,
public ChatCompletionRequest(List<ChatCompletionMessage> messages, Boolean stream) {
this(messages, null, null, null, null, null, null, null, null, null, null,
null, null, null, null, null, null, null, stream, null, null, null,
null, null, null, null, null, null);
null, null, null, null, null, null, null);
}

/**
@@ -1177,7 +1198,7 @@ public ChatCompletionRequest streamOptions(StreamOptions streamOptions) {
return new ChatCompletionRequest(this.messages, this.model, this.store, this.metadata, this.frequencyPenalty, this.logitBias, this.logprobs,
this.topLogprobs, this.maxTokens, this.maxCompletionTokens, this.n, this.outputModalities, this.audioParameters, this.presencePenalty,
this.responseFormat, this.seed, this.serviceTier, this.stop, this.stream, streamOptions, this.temperature, this.topP,
this.tools, this.toolChoice, this.parallelToolCalls, this.user, this.reasoningEffort, this.webSearchOptions);
this.tools, this.toolChoice, this.parallelToolCalls, this.user, this.reasoningEffort, this.webSearchOptions, this.verbosity);
}

/**
@@ -77,7 +77,7 @@ void validateReasoningTokens() {
"If a train travels 100 miles in 2 hours, what is its average speed?", ChatCompletionMessage.Role.USER);
ChatCompletionRequest request = new ChatCompletionRequest(List.of(userMessage), "o1", null, null, null, null,
null, null, null, null, null, null, null, null, null, null, null, null, false, null, null, null, null,
null, null, null, "low", null);
null, null, null, "low", null, null);
ResponseEntity<ChatCompletion> response = this.openAiApi.chatCompletionEntity(request);

assertThat(response).isNotNull();
@@ -159,7 +159,7 @@ void streamOutputAudio() {
}

@ParameterizedTest(name = "{0} : {displayName}")
@EnumSource(names = { "GPT_5", "GPT_5_2025_08_07" })
@EnumSource(names = { "GPT_5", "GPT_5_CHAT_LATEST", "GPT_5_MINI", "GPT_5_NANO" })
void chatCompletionEntityWithNewModels(OpenAiApi.ChatModel modelName) {
ChatCompletionMessage chatCompletionMessage = new ChatCompletionMessage("Hello world", Role.USER);
ResponseEntity<ChatCompletion> response = this.openAiApi.chatCompletionEntity(
@@ -172,4 +172,50 @@ void chatCompletionEntityWithNewModels(OpenAiApi.ChatModel modelName) {
assertThat(response.getBody().model()).containsIgnoringCase(modelName.getValue());
}

@ParameterizedTest(name = "{0} : {displayName}")
@EnumSource(names = { "GPT_5_NANO" })
void chatCompletionEntityWithNewModelsAndLowVerbosity(OpenAiApi.ChatModel modelName) {
ChatCompletionMessage chatCompletionMessage = new ChatCompletionMessage(
"What is the answer to the ultimate question of life, the universe, and everything?", Role.USER);

ChatCompletionRequest request = new ChatCompletionRequest(List.of(chatCompletionMessage), // messages
modelName.getValue(), null, null, null, null, null, null, null, null, null, null, null, null, null,
null, null, null, false, null, 1.0, null, null, null, null, null, null, null, "low");

ResponseEntity<ChatCompletion> response = this.openAiApi.chatCompletionEntity(request);

assertThat(response).isNotNull();
assertThat(response.getBody()).isNotNull();
assertThat(response.getBody().choices()).isNotEmpty();
assertThat(response.getBody().choices().get(0).message().content()).isNotEmpty();
assertThat(response.getBody().model()).containsIgnoringCase(modelName.getValue());
}

@ParameterizedTest(name = "{0} : {displayName}")
@EnumSource(names = { "GPT_5", "GPT_5_MINI", "GPT_5_NANO" })
void chatCompletionEntityWithGpt5ModelsAndTemperatureShouldFail(OpenAiApi.ChatModel modelName) {
ChatCompletionMessage chatCompletionMessage = new ChatCompletionMessage("Hello world", Role.USER);
ChatCompletionRequest request = new ChatCompletionRequest(List.of(chatCompletionMessage), modelName.getValue(),
0.8);

assertThatThrownBy(() -> this.openAiApi.chatCompletionEntity(request)).isInstanceOf(RuntimeException.class)
.hasMessageContaining("Unsupported value");
}

@ParameterizedTest(name = "{0} : {displayName}")
@EnumSource(names = { "GPT_5_CHAT_LATEST" })
void chatCompletionEntityWithGpt5ChatAndTemperatureShouldSucceed(OpenAiApi.ChatModel modelName) {
ChatCompletionMessage chatCompletionMessage = new ChatCompletionMessage("Hello world", Role.USER);
ChatCompletionRequest request = new ChatCompletionRequest(List.of(chatCompletionMessage), modelName.getValue(),
0.8);

ResponseEntity<ChatCompletion> response = this.openAiApi.chatCompletionEntity(request);

assertThat(response).isNotNull();
assertThat(response.getBody()).isNotNull();
assertThat(response.getBody().choices()).isNotEmpty();
assertThat(response.getBody().choices().get(0).message().content()).isNotEmpty();
assertThat(response.getBody().model()).containsIgnoringCase(modelName.getValue());
}

}
@@ -181,7 +181,10 @@ The `JSON_SCHEMA` type enables link:https://platform.openai.com/docs/guides/stru

[NOTE]
====
When using GPT-5 models (`gpt-5`, `gpt-5-2025-08-07`), the temperature parameter must be set to `1.0` (the default value). These models do not support custom temperature values and will return an error if any other temperature value is specified.
When using GPT-5 reasoning models such as `gpt-5`, `gpt-5-mini`, and `gpt-5-nano`, custom `temperature` values are not supported.
These models are optimized for reasoning and do not use temperature.
Specifying a value other than the default (`1.0`) will result in an error.
In contrast, conversational models such as `gpt-5-chat-latest` do support the `temperature` parameter.
====
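The rule in this note can be captured in a small client-side guard. The sketch below is a hypothetical helper (not part of Spring AI), assuming, per the note and the tests in this PR, that the reasoning models accept only the default temperature of `1.0` while `gpt-5-chat-latest` accepts custom values:

```java
import java.util.Set;

// Hypothetical pre-flight check (not part of Spring AI): rejects custom
// temperature values for the GPT-5 reasoning models before a request is sent.
public class TemperatureGuard {

    private static final Set<String> REASONING_MODELS =
            Set.of("gpt-5", "gpt-5-mini", "gpt-5-nano");

    // The default temperature (1.0, or an unset value) is always accepted;
    // any other value is only valid for models outside the reasoning family.
    public static boolean isTemperatureAllowed(String model, Double temperature) {
        if (temperature == null || temperature == 1.0) {
            return true;
        }
        return !REASONING_MODELS.contains(model);
    }
}
```

Such a check would fail fast instead of waiting for the "Unsupported value" error the API returns, as exercised by `chatCompletionEntityWithGpt5ModelsAndTemperatureShouldFail` above.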

NOTE: You can override the common `spring.ai.openai.base-url` and `spring.ai.openai.api-key` for the `ChatModel` and `EmbeddingModel` implementations.
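Assuming the new option binds through the same `spring.ai.openai.chat.options.*` property convention as the existing chat options (an assumption — the property mapping is not shown in this PR), it could be configured as:

```properties
# Hypothetical property binding for the new option (verify against your Spring AI version)
spring.ai.openai.chat.options.model=gpt-5-nano
spring.ai.openai.chat.options.verbosity=low
```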