diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/mistralai-chat-functions.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/mistralai-chat-functions.adoc index 4540580b778..cdbbc1f8ff5 100644 --- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/mistralai-chat-functions.adoc +++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/mistralai-chat-functions.adoc @@ -4,9 +4,7 @@ You can register custom Java functions with the `MistralAiChatModel` and have th This allows you to connect the LLM capabilities with external tools and APIs. The `open-mixtral-8x22b`, `mistral-small-latest`, and `mistral-large-latest` models are trained to detect when a function should be called and to respond with JSON that adheres to the function signature. -The MistralAI API does not call the function directly; instead, the model generates JSON that you can use to call the function in your code and return the result back to the model to complete the conversation. - -NOTE: As of March 13, 2024, Mistral AI has integrated support for parallel function calling into their `mistral-large-latest` model, a feature that was absent at the time of the first Spring AI Mistral AI. +The Mistral AI API does not call the function directly; instead, the model generates JSON that you can use to call the function in your code and return the result back to the model to complete the conversation. Spring AI provides flexible and user-friendly ways to register and call custom functions. In general, the custom functions need to provide a function `name`, `description`, and the function call `signature` (as JSON schema) to let the model know what arguments the function expects. 
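As a rough, framework-free sketch of the contract described above: the model only names a function and supplies arguments, and the client looks the function up and runs it. This does not use Spring AI classes. `FunctionCallSketch`, `RegisteredFunction`, `ToolCall`, and the hard-coded temperatures are hypothetical names and values chosen for illustration, and the JSON parsing step is assumed to have already happened.

```java
import java.util.Map;
import java.util.function.Function;

public class FunctionCallSketch {

    // Hypothetical request/response types, loosely modeled on the weather example.
    public record Request(String location) {}
    public record Response(double temperatureCelsius) {}

    // What gets registered: the name and description the model sees, plus the client-side code.
    public record RegisteredFunction(String name, String description, Function<Request, Response> code) {}

    // Stand-in for the already-parsed JSON the model emits when it wants a function called.
    public record ToolCall(String functionName, Request arguments) {}

    // Registry keyed by function name; the temperatures are made-up mock values.
    static final Map<String, RegisteredFunction> REGISTRY = Map.of(
            "CurrentWeather",
            new RegisteredFunction("CurrentWeather", "Get the weather in location",
                    request -> new Response(request.location().contains("Paris") ? 15.0 : 10.0)));

    // The client, not the model, executes the function and hands the result back.
    public static Response dispatch(ToolCall call) {
        RegisteredFunction function = REGISTRY.get(call.functionName());
        if (function == null) {
            throw new IllegalArgumentException("Unknown function: " + call.functionName());
        }
        return function.code().apply(call.arguments());
    }

    public static void main(String[] args) {
        Response response = dispatch(new ToolCall("CurrentWeather", new Request("Paris")));
        System.out.println(response.temperatureCelsius()); // prints 15.0
    }
}
```

In the documented setup, Spring AI performs the lookup and the round trip back to the model, so application code normally supplies only the function, its name, and its description.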
@@ -22,12 +20,12 @@ The basis of the underlying infrastructure is the link:https://github.com/spring == How it works -Suppose we want the AI model to respond with information that it does not have, for example the current temperature at a given location. +Suppose we want the AI model to respond with information that it does not have, for example, the current temperature at a given location. We can provide the AI model with metadata about our own functions that it can use to retrieve that information as it processes your prompt. -For example, if during the processing of a prompt, the AI Model determines that it needs additional information about the temperature in a given location, it will start a server side generated request/response interaction. The AI Model invokes a client side function. -The AI Model provides method invocation details as JSON and it is the responsibility of the client to execute that function and return the response. +For example, if during the processing of a prompt, the AI Model determines that it needs additional information about the temperature in a given location, it will start a server-side generated request/response interaction. The AI Model invokes a client side function. +The AI Model provides method invocation details as JSON, and it is the responsibility of the client to execute that function and return the response. Spring AI greatly simplifies the code you need to write to support function invocation. It brokers the function invocation conversation for you. @@ -39,10 +37,10 @@ You can also reference multiple function bean names in your prompt. Let's create a chatbot that answer questions by calling our own function. To support the response of the chatbot, we will register our own function that takes a location and returns the current weather in that location. 
-When the response to the prompt to the model needs to answer a question such as `"What’s the weather like in Boston?"` the AI model will invoke the client providing the location value as an argument to be passed to the function. This RPC-like data is passed as JSON. +When the model needs to answer a question such as `"What’s the weather like in Boston?"` the AI model will invoke the client providing the location value as an argument to be passed to the function. This RPC-like data is passed as JSON. -Our function calls some SaaS based weather service API and returns the weather response back to the model to complete the conversation. -In this example we will use a simple implementation named `MockWeatherService` that hard codes the temperature for various locations. +Our function calls some SaaS-based weather service API and returns the weather response back to the model to complete the conversation. +In this example, we will use a simple implementation named `MockWeatherService` that hard-codes the temperature for various locations. The following `MockWeatherService.java` represents the weather service API: @@ -64,16 +62,15 @@ public class MockWeatherService implements Function { With the link:../mistralai-chat.html#_auto_configuration[MistralAiChatModel Auto-Configuration] you have multiple ways to register custom functions as beans in the Spring context. -We start with describing the most POJO friendly options. +We start by describing the most POJO-friendly options. ==== Plain Java Functions -In this approach you define `@Beans` in your application context as you would any other Spring managed object. +In this approach, you define a `@Bean` in your application context as you would any other Spring managed object. -Internally, Spring AI `ChatModel` will create an instance of a `FunctionCallbackWrapper` wrapper that adds the logic for it being invoked via the AI model. 
+Internally, Spring AI `ChatModel` will create an instance of a `FunctionCallbackWrapper` that adds the logic for it being invoked via the AI model. The name of the `@Bean` is passed as a `ChatOption`. - [source,java] ---- @Configuration @@ -81,32 +78,31 @@ static class Config { @Bean @Description("Get the weather in location") // function description - public Function weatherFunction1() { + public Function currentWeather() { return new MockWeatherService(); } - ... + } ---- -The `@Description` annotation is optional and provides a function description (2) that helps the model understand when to call the function. +The `@Description` annotation is optional and provides a function description that helps the model understand when to call the function. It is an important property to set to help the AI model determine what client side function to invoke. -Another option to provide the description of the function is the `@JsonClassDescription` annotation on the `MockWeatherService.Request` to provide the function description: +Another option for providing the description of the function is to use the `@JsonClassDescription` annotation on the `MockWeatherService.Request`: [source,java] ---- - @Configuration static class Config { @Bean - public Function currentWeather3() { // (1) bean name as function name. + public Function currentWeather() { // bean name as function name return new MockWeatherService(); } - ... 
+ } -@JsonClassDescription("Get the weather in location") // (2) function description +@JsonClassDescription("Get the weather in location") // function description public record Request(String location, Unit unit) {} ---- @@ -114,13 +110,12 @@ It is a best practice to annotate the request object with information such that The link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/mistralai/tool/PaymentStatusBeanIT.java[PaymentStatusBeanIT.java] demonstrates this approach. -TIP: The link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/mistralai/tool/PaymentStatusBeanOpenAiIT[PaymentStatusBeanOpenAiIT] implements the same function using the OpenAI API. -MistralAI is almost identical to OpenAI in this regard. - +TIP: The Mistral AI link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/mistralai/tool/PaymentStatusBeanOpenAiIT[PaymentStatusBeanOpenAiIT] implements the same function using the OpenAI API. +Mistral AI is almost identical to OpenAI in this regard. ==== FunctionCallback Wrapper -Another way to register a function is to create `FunctionCallbackWrapper` wrapper like this: +Another way to register a function is to create a `FunctionCallbackWrapper` like this: [source,java] ---- @@ -135,14 +130,14 @@ static class Config { .withDescription("Get the weather in location") // (2) function description .build(); } - ... + } ---- It wraps the 3rd party `MockWeatherService` function and registers it as a `CurrentWeather` function with the `MistralAiChatModel`. -It also provides a description (2) and an optional response converter (3) to convert the response into a text as expected by the model. 
+It also provides a description (2) and an optional response converter to convert the response into a text as expected by the model. -NOTE: By default, the response converter does a JSON serialization of the Response object. +NOTE: By default, the response converter performs a JSON serialization of the Response object. NOTE: The `FunctionCallbackWrapper` internally resolves the function call signature based on the `MockWeatherService.Request` class. @@ -156,19 +151,19 @@ MistralAiChatModel chatModel = ... UserMessage userMessage = new UserMessage("What's the weather like in Paris?"); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), - MistralAiChatOptions.builder().withFunction("CurrentWeather").build())); // (1) Enable the function +ChatResponse response = chatModel.call(new Prompt(userMessage, + MistralAiChatOptions.builder().withFunction("CurrentWeather").build())); // Enable the function logger.info("Response: {}", response); ---- -// NOTE: You can can have multiple functions registered in your `ChatModel` but only those enabled in the prompt request will be considered for the function calling. +// NOTE: You can have multiple functions registered in your `ChatModel` but only those enabled in the prompt request will be considered for the function calling. -Above user question will trigger 3 calls to `CurrentWeather` function (one for each city) and produce the final response. 
+The above user question will trigger a call to the `CurrentWeather` function and produce the final response. === Register/Call Functions with Prompt Options -In addition to the auto-configuration you can register callback functions, dynamically, with your Prompt requests: +In addition to the auto-configuration, you can register callback functions dynamically with your `Prompt` requests: [source,java] ---- @@ -183,16 +178,15 @@ var promptOptions = MistralAiChatOptions.builder() new MockWeatherService()))) // function code .build(); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), promptOptions)); +ChatResponse response = chatModel.call(new Prompt(userMessage, promptOptions)); ---- NOTE: The in-prompt registered functions are enabled by default for the duration of this request. -This approach allows to dynamically chose different functions to be called based on the user input. +This approach allows you to dynamically choose different functions to be called based on the user input. The https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/mistralai/tool/PaymentStatusPromptIT.java[PaymentStatusPromptIT.java] integration test provides a complete example of how to register a function with the `MistralAiChatModel` and use it in a prompt request. 
- == Appendices === https://spring.io/blog/2024/03/06/function-calling-in-java-and-spring-ai-using-the-latest-mistral-ai-api[(Blog) Function Calling in Java and Spring AI using the latest Mistral AI API] diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/ollama-chat-functions.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/ollama-chat-functions.adoc index afa756ce78e..afecc0efb75 100644 --- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/ollama-chat-functions.adoc +++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/ollama-chat-functions.adoc @@ -6,11 +6,11 @@ TIP: You need https://ollama.com/search?c=tools[Models] pre-trained for Tools su Usually, such models are tagged with a `Tools` tag. For example `mistral`, `firefunction-v2` or `llama3.1:70b`. -NOTE: Currently, the Ollama API (0.2.8) does not support function calling in streaming mode. +NOTE: Currently, the Ollama API (0.3.8) does not support function calling in streaming mode. You can register custom Java functions with the `OllamaChatModel` and have the Ollama deployed model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. This allows you to connect the LLM capabilities with external tools and APIs. -The Ollama models tagged with the `Tools` label are trained to detect when a function should be called and to respond with JSON that adheres to the function signature. +The Ollama models tagged with the `Tools` label (see https://ollama.com/search?c=tools[full list]) are trained to detect when a function should be called and to respond with JSON that adheres to the function signature. The Ollama API does not call the function directly; instead, the model generates JSON that you can use to call the function in your code and return the result back to the model to complete the conversation. 
Spring AI provides flexible and user-friendly ways to register and call custom functions. @@ -25,16 +25,14 @@ Spring AI makes this as easy as defining a `@Bean` definition that returns a `ja Under the hood, Spring wraps your POJO (the function) with the appropriate adapter code that enables interaction with the AI Model, saving you from writing tedious boilerplate code. The basis of the underlying infrastructure is the link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/model/function/FunctionCallback.java[FunctionCallback.java] interface and the companion link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/model/function/FunctionCallbackWrapper.java[FunctionCallbackWrapper.java] utility class to simplify the implementation and registration of Java callback functions. - == How it works -Suppose we want the AI model to respond with information that it does not have, for example the current temperature at a given location. +Suppose we want the AI model to respond with information that it does not have, for example, the current temperature at a given location. We can provide the AI model with metadata about our own functions that it can use to retrieve that information as it processes your prompt. -For example, if during the processing of a prompt, the AI Model determines that it needs additional information about the temperature in a given location, it will start a server side generated request/response interaction. -The AI Model invokes a client side function. -The AI Model provides method invocation details as JSON and it is the responsibility of the client to execute that function and return the response. +For example, if during the processing of a prompt, the AI Model determines that it needs additional information about the temperature in a given location, it will start a server-side generated request/response interaction. 
The AI Model invokes a client side function. +The AI Model provides method invocation details as JSON, and it is the responsibility of the client to execute that function and return the response. The model-client interaction is illustrated in the <> diagram. @@ -48,11 +46,11 @@ You can also reference multiple function bean names in your prompt. Let's create a chatbot that answer questions by calling our own function. To support the response of the chatbot, we will register our own function that takes a location and returns the current weather in that location. -When the response to the prompt to the model needs to answer a question such as `"What’s the weather like in Boston?"` the AI model will invoke the client providing the location value as an argument to be passed to the function. -This RPC-like data is passed as JSON. +When the model needs to answer a question such as `"What’s the weather like in Boston?"` the AI model will invoke the client providing +the location value as an argument to be passed to the function. This RPC-like data is passed as JSON. -Our function calls some SaaS based weather service API and returns the weather response back to the model to complete the conversation. -In this example we will use a simple implementation named `MockWeatherService` that hard codes the temperature for various locations. +Our function calls some SaaS based weather service API and returns the weather response back to the model to complete the conversation. +In this example, we will use a simple implementation named `MockWeatherService` that hard-codes the temperature for various locations. The following `MockWeatherService.java` represents the weather service API: @@ -74,16 +72,15 @@ public class MockWeatherService implements Function { With the link:../ollama-chat.html#_auto_configuration[OllamaChatModel Auto-Configuration] you have multiple ways to register custom functions as beans in the Spring context. 
-We start with describing the most POJO friendly options. +We start by describing the most POJO-friendly options. ==== Plain Java Functions -In this approach you define `@Beans` in your application context as you would any other Spring managed object. +In this approach, you define a `@Bean` in your application context as you would any other Spring managed object. -Internally, Spring AI `ChatModel` will create an instance of a `FunctionCallbackWrapper` wrapper that adds the logic for it being invoked via the AI model. +Internally, Spring AI `ChatModel` will create an instance of a `FunctionCallbackWrapper` that adds the logic for it being invoked via the AI model. The name of the `@Bean` is passed as a `ChatOption`. - [source,java] ---- @Configuration @@ -91,31 +88,30 @@ static class Config { @Bean @Description("Get the weather in location") // function description - public Function weatherFunction1() { + public Function currentWeather() { return new MockWeatherService(); } - ... + } ---- -The `@Description` annotation is optional and provides a function description (2) that helps the model understand when to call the function. It is an important property to set to help the AI model determine what client side function to invoke. +The `@Description` annotation is optional and provides a function description that helps the model understand when to call the function. It is an important property to set to help the AI model determine what client side function to invoke. -Another option to provide the description of the function is to use the `@JsonClassDescription` annotation on the `MockWeatherService.Request` to provide the function description: +Another option for providing the description of the function is to use the `@JsonClassDescription` annotation on the `MockWeatherService.Request`: [source,java] ---- - @Configuration static class Config { @Bean - public Function currentWeather3() { // (1) bean name as function name. 
+ public Function currentWeather() { // bean name as function name return new MockWeatherService(); } - ... + } -@JsonClassDescription("Get the weather in location") // (2) function description +@JsonClassDescription("Get the weather in location") // function description public record Request(String location, Unit unit) {} ---- @@ -123,7 +119,7 @@ It is a best practice to annotate the request object with information such that ==== FunctionCallback Wrapper -Another way to register a function is to create a `FunctionCallbackWrapper` wrapper like this: +Another way to register a function is to create a `FunctionCallbackWrapper` like this: [source,java] ---- @@ -138,14 +134,14 @@ static class Config { .withDescription("Get the weather in location") // (2) function description .build(); } - ... + } ---- It wraps the 3rd party `MockWeatherService` function and registers it as a `CurrentWeather` function with the `OllamaChatModel`. -It also provides a description (2) and an optional response converter (3) to convert the response into a text as expected by the model. +It also provides a description (2) and an optional response converter to convert the response into a text as expected by the model. -NOTE: By default, the response converter does a JSON serialization of the Response object. +NOTE: By default, the response converter performs a JSON serialization of the Response object. NOTE: The `FunctionCallbackWrapper` internally resolves the function call signature based on the `MockWeatherService.Request` class. @@ -159,15 +155,15 @@ OllamaChatModel chatModel = ... 
UserMessage userMessage = new UserMessage("What's the weather like in San Francisco, Tokyo, and Paris?"); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), - OllamaOptions.builder().withFunction("CurrentWeather").build())); // (1) Enable the function +ChatResponse response = chatModel.call(new Prompt(userMessage, + OllamaOptions.builder().withFunction("CurrentWeather").build())); // Enable the function logger.info("Response: {}", response); ---- -// NOTE: You can can have multiple functions registered in your `ChatModel` but only those enabled in the prompt request will be considered for the function calling. +// NOTE: You can have multiple functions registered in your `ChatModel` but only those enabled in the prompt request will be considered for the function calling. -Above user question will trigger 3 calls to `CurrentWeather` function (one for each city) and the final response will be something like this: +The above user question will trigger 3 calls to the `CurrentWeather` function (one for each city) and the final response will be something like this: ---- Here is the current weather for the requested cities: @@ -178,10 +174,9 @@ Here is the current weather for the requested cities: The link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/ollama/tool/FunctionCallbackWrapperIT.java[FunctionCallbackWrapperIT.java] test demo this approach. 
- === Register/Call Functions with Prompt Options -In addition to the auto-configuration you can register callback functions, dynamically, with your Prompt requests: +In addition to the auto-configuration, you can register callback functions dynamically with your `Prompt` requests: [source,java] ---- @@ -196,12 +191,12 @@ var promptOptions = OllamaOptions.builder() new MockWeatherService()))) // function code .build(); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), promptOptions)); +ChatResponse response = chatModel.call(new Prompt(userMessage, promptOptions)); ---- NOTE: The in-prompt registered functions are enabled by default for the duration of this request. -This approach allows to dynamically chose different functions to be called based on the user input. +This approach allows you to dynamically choose different functions to be called based on the user input. The link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/ollama/tool/FunctionCallbackInPromptIT.java[FunctionCallbackInPromptIT.java] integration test provides a complete example of how to register a function with the `OllamaChatModel` and use it in a prompt request. 
@@ -209,7 +204,7 @@ The link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring === Spring AI Function Calling Flow [[spring-ai-function-calling-flow]] -The following diagram illustrates the flow of the OllamaChatModel Function Calling: +The following diagram illustrates the flow of the `OllamaChatModel` Function Calling: image:ollama-chatmodel-function-call.jpg[width=800, title="OllamaChatModel Function Calling Flow"] diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/openai-chat-functions.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/openai-chat-functions.adoc index 0be2d5eb138..e620a67e1e3 100644 --- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/openai-chat-functions.adoc +++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/functions/openai-chat-functions.adoc @@ -18,15 +18,14 @@ The basis of the underlying infrastructure is the link:https://github.com/spring // Additionally, the Auto-Configuration provides a way to auto-register any Function beans definition as function calling candidates in the `ChatModel`. - == How it works -Suppose we want the AI model to respond with information that it does not have, for example the current temperature at a given location. +Suppose we want the AI model to respond with information that it does not have, for example, the current temperature at a given location. We can provide the AI model with metadata about our own functions that it can use to retrieve that information as it processes your prompt. -For example, if during the processing of a prompt, the AI Model determines that it needs additional information about the temperature in a given location, it will start a server side generated request/response interaction. The AI Model invokes a client side function. 
-The AI Model provides method invocation details as JSON and it is the responsibility of the client to execute that function and return the response. +For example, if during the processing of a prompt, the AI Model determines that it needs additional information about the temperature in a given location, it will start a server-side generated request/response interaction. The AI Model invokes a client side function. +The AI Model provides method invocation details as JSON, and it is the responsibility of the client to execute that function and return the response. The model-client interaction is illustrated in the <> diagram. @@ -40,9 +39,9 @@ You can also reference multiple function bean names in your prompt. Let's create a chatbot that answer questions by calling our own function. To support the response of the chatbot, we will register our own function that takes a location and returns the current weather in that location. -When the response to the prompt to the model needs to answer a question such as `"What’s the weather like in Boston?"` the AI model will invoke the client providing the location value as an argument to be passed to the function. This RPC-like data is passed as JSON. +When the model needs to answer a question such as `"What’s the weather like in Boston?"` the AI model will invoke the client providing the location value as an argument to be passed to the function. This RPC-like data is passed as JSON. -Our function calls some SaaS based weather service API and returns the weather response back to the model to complete the conversation. In this example we will use a simple implementation named `MockWeatherService` that hard codes the temperature for various locations. +Our function calls some SaaS-based weather service API and returns the weather response back to the model to complete the conversation. In this example, we will use a simple implementation named `MockWeatherService` that hard-codes the temperature for various locations. 
The following `MockWeatherService.java` represents the weather service API: @@ -64,17 +63,15 @@ public class MockWeatherService implements Function { With the link:../openai-chat.html#_auto_configuration[OpenAiChatModel Auto-Configuration] you have multiple ways to register custom functions as beans in the Spring context. -We start with describing the most POJO friendly options. - +We start by describing the most POJO-friendly options. ==== Plain Java Functions -In this approach you define `@Beans` in your application context as you would any other Spring managed object. +In this approach, you define a `@Bean` in your application context as you would any other Spring managed object. -Internally, Spring AI `ChatModel` will create an instance of a `FunctionCallbackWrapper` wrapper that adds the logic for it being invoked via the AI model. +Internally, Spring AI `ChatModel` will create an instance of a `FunctionCallbackWrapper` that adds the logic for it being invoked via the AI model. The name of the `@Bean` is passed as a `ChatOption`. - [source,java] ---- @Configuration @@ -82,31 +79,30 @@ static class Config { @Bean @Description("Get the weather in location") // function description - public Function weatherFunction1() { + public Function currentWeather() { return new MockWeatherService(); } - ... + } ---- -The `@Description` annotation is optional and provides a function description (2) that helps the model understand when to call the function. It is an important property to set to help the AI model determine what client side function to invoke. +The `@Description` annotation is optional and provides a function description that helps the model understand when to call the function. It is an important property to set to help the AI model determine what client side function to invoke. 
-Another option to provide the description of the function is to use the `@JsonClassDescription` annotation on the `MockWeatherService.Request` to provide the function description: +Another option for providing the description of the function is to use the `@JsonClassDescription` annotation on the `MockWeatherService.Request`: [source,java] ---- - @Configuration static class Config { @Bean - public Function currentWeather3() { // (1) bean name as function name. + public Function currentWeather() { // bean name as function name return new MockWeatherService(); } - ... + } -@JsonClassDescription("Get the weather in location") // (2) function description +@JsonClassDescription("Get the weather in location") // function description public record Request(String location, Unit unit) {} ---- @@ -114,10 +110,9 @@ It is a best practice to annotate the request object with information such that The link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/openai/tool/FunctionCallbackWithPlainFunctionBeanIT.java[FunctionCallbackWithPlainFunctionBeanIT.java] demonstrates this approach. - ==== FunctionCallback Wrapper -Another way to register a function is to create a `FunctionCallbackWrapper` wrapper like this: +Another way to register a function is to create a `FunctionCallbackWrapper` like this: [source,java] ---- @@ -132,14 +127,14 @@ static class Config { .withDescription("Get the weather in location") // (2) function description .build(); } - ... + } ---- It wraps the 3rd party `MockWeatherService` function and registers it as a `CurrentWeather` function with the `OpenAiChatModel`. -It also provides a description (2) and an optional response converter (3) to convert the response into a text as expected by the model. +It also provides a description (2) and an optional response converter to convert the response into a text as expected by the model. 
-NOTE: By default, the response converter does a JSON serialization of the Response object. +NOTE: By default, the response converter performs a JSON serialization of the Response object. NOTE: The `FunctionCallbackWrapper` internally resolves the function call signature based on the `MockWeatherService.Request` class. @@ -153,15 +148,15 @@ OpenAiChatModel chatModel = ... UserMessage userMessage = new UserMessage("What's the weather like in San Francisco, Tokyo, and Paris?"); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), - OpenAiChatOptions.builder().withFunction("CurrentWeather").build())); // (1) Enable the function +ChatResponse response = chatModel.call(new Prompt(userMessage, + OpenAiChatOptions.builder().withFunction("CurrentWeather").build())); // Enable the function logger.info("Response: {}", response); ---- -// NOTE: You can can have multiple functions registered in your `ChatModel` but only those enabled in the prompt request will be considered for the function calling. +// NOTE: You can have multiple functions registered in your `ChatModel` but only those enabled in the prompt request will be considered for the function calling. -Above user question will trigger 3 calls to `CurrentWeather` function (one for each city) and the final response will be something like this: +The above user question will trigger 3 calls to the `CurrentWeather` function (one for each city) and the final response will be something like this: ---- Here is the current weather for the requested cities: @@ -172,10 +167,9 @@ Here is the current weather for the requested cities: The link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/openai/tool/FunctionCallbackWrapperIT.java[FunctionCallbackWrapperIT.java] test demo this approach. 
- === Register/Call Functions with Prompt Options -In addition to the auto-configuration you can register callback functions, dynamically, with your Prompt requests: +In addition to the auto-configuration, you can dynamically register callback functions with your `Prompt` requests: [source,java] ---- @@ -190,12 +184,12 @@ var promptOptions = OpenAiChatOptions.builder() new MockWeatherService()))) // function code .build(); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), promptOptions)); +ChatResponse response = chatModel.call(new Prompt(userMessage, promptOptions)); ---- NOTE: The in-prompt registered functions are enabled by default for the duration of this request. -This approach allows to dynamically chose different functions to be called based on the user input. +This approach allows you to dynamically choose different functions to be called based on the user input. The https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/openai/tool/FunctionCallbackInPromptIT.java[FunctionCallbackInPromptIT.java] integration test provides a complete example of how to register a function with the `OpenAiChatModel` and use it in a prompt request. 
// @@ -230,7 +224,7 @@ The https://github.com/spring-projects/spring-ai/blob/main/spring-ai-spring-boot === Spring AI Function Calling Flow [[spring-ai-function-calling-flow]] -The following diagram illustrates the flow of the OpenAiChatModel Function Calling: +The following diagram illustrates the flow of the `OpenAiChatModel` Function Calling: image:openai-chatclient-function-call.jpg[width=800, title="OpenAiChatModel Function Calling Flow"] diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/mistralai-chat.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/mistralai-chat.adoc index 4d246676ceb..c38248314af 100644 --- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/mistralai-chat.adoc +++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/mistralai-chat.adoc @@ -2,13 +2,13 @@ Spring AI supports the various AI language models from Mistral AI. You can interact with Mistral AI language models and create a multilingual conversational assistant based on Mistral models. -TIP: Mistral AI offers an OpenAI API compatible endpoint as well. -Check the xref:_openai_api_compatibility[OpenAI API compatibility] section to learn how to use the xref:api/chat/openai-chat.adoc[Spring AI OpenAI] to talk to a Mistral endponit. +TIP: Mistral AI offers an OpenAI API-compatible endpoint as well. +Check the xref:_openai_api_compatibility[OpenAI API compatibility] section to learn how to use the xref:api/chat/openai-chat.adoc[Spring AI OpenAI] integration to talk to a Mistral endpoint. == Prerequisites -You will need to create an API with MistralAI to access Mistral AI language models. -Create an account at https://auth.mistral.ai/ui/registration[MistralAI registration page] and generate the token on the https://console.mistral.ai/api-keys/[API Keys page]. +You will need to create an API key with Mistral AI to access Mistral AI language models. 
+Create an account at https://auth.mistral.ai/ui/registration[Mistral AI registration page] and generate the token on the https://console.mistral.ai/api-keys/[API Keys page]. The Spring AI project defines a configuration property named `spring.ai.mistralai.api-key` that you should set to the value of the `API Key` obtained from console.mistral.ai. Exporting an environment variable is one way to set that configuration property: @@ -24,11 +24,9 @@ Refer to the xref:getting-started.adoc#repositories[Repositories] section to add To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build system. - - == Auto-configuration -Spring AI provides Spring Boot auto-configuration for the MistralAI Chat Client. +Spring AI provides Spring Boot auto-configuration for the Mistral AI Chat Client. To enable it add the following dependency to your project's Maven `pom.xml` file: [source, xml] @@ -83,34 +81,34 @@ The prefix `spring.ai.mistralai` is used as the property prefix that lets you co ==== Configuration Properties -The prefix `spring.ai.mistralai.chat` is the property prefix that lets you configure the chat model implementation for MistralAI. +The prefix `spring.ai.mistralai.chat` is the property prefix that lets you configure the chat model implementation for Mistral AI. [cols="3,5,1"] |==== | Property | Description | Default -| spring.ai.mistralai.chat.enabled | Enable MistralAI chat model. 
| true -| spring.ai.mistralai.chat.base-url | Optional overrides the spring.ai.mistralai.base-url to provide chat specific url | - -| spring.ai.mistralai.chat.api-key | Optional overrides the spring.ai.mistralai.api-key to provide chat specific api-key | - -| spring.ai.mistralai.chat.options.model | This is the MistralAI Chat model to use | `open-mistral-7b`, `open-mixtral-8x7b`, `mistral-small-latest`, `mistral-medium-latest`, `mistral-large-latest` -| spring.ai.mistralai.chat.options.temperature | The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify temperature and top_p for the same completions request as the interaction of these two settings is difficult to predict. | 0.8 +| spring.ai.mistralai.chat.enabled | Enable Mistral AI chat model. | true +| spring.ai.mistralai.chat.base-url | Optional override for the `spring.ai.mistralai.base-url` property to provide chat-specific URL. | - +| spring.ai.mistralai.chat.api-key | Optional override for the `spring.ai.mistralai.api-key` to provide chat-specific API Key. | - +| spring.ai.mistralai.chat.options.model | This is the Mistral AI Chat model to use | `open-mistral-7b`, `open-mixtral-8x7b`, `open-mixtral-8x22b`, `mistral-small-latest`, `mistral-large-latest` +| spring.ai.mistralai.chat.options.temperature | The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify `temperature` and `top_p` for the same completions request as the interaction of these two settings is difficult to predict. | 0.8 | spring.ai.mistralai.chat.options.maxTokens | The maximum number of tokens to generate in the chat completion. 
The total length of input tokens and generated tokens is limited by the model's context length. | - | spring.ai.mistralai.chat.options.safePrompt | Indicates whether to inject a security prompt before all conversations. | false | spring.ai.mistralai.chat.options.randomSeed | This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. | - | spring.ai.mistralai.chat.options.stop | Stop generation if this token is detected. Or if one of these tokens is detected when providing an array. | - -| spring.ai.mistralai.chat.options.topP | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | - +| spring.ai.mistralai.chat.options.topP | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both. | - | spring.ai.mistralai.chat.options.responseFormat | An object specifying the format that the model must output. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON.| - | spring.ai.mistralai.chat.options.tools | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. | - -| spring.ai.mistralai.chat.options.toolChoice | Controls which (if any) function is called by the model. none means the model will not call a function and instead generates a message. 
auto means the model can pick between generating a message or calling a function. Specifying a particular function via {"type: "function", "function": {"name": "my_function"}} forces the model to call that function. none is the default when no functions are present. auto is the default if functions are present. | - +| spring.ai.mistralai.chat.options.toolChoice | Controls which (if any) function is called by the model. `none` means the model will not call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a function. Specifying a particular function via `{"type: "function", "function": {"name": "my_function"}}` forces the model to call that function. `none` is the default when no functions are present. `auto` is the default if functions are present. | - | spring.ai.mistralai.chat.options.functions | List of functions, identified by their names, to enable for function calling in a single prompt requests. Functions with those names must exist in the functionCallbacks registry. | - -| spring.ai.mistralai.chat.options.functionCallbacks | MistralAI Tool Function Callbacks to register with the ChatModel. | - +| spring.ai.mistralai.chat.options.functionCallbacks | Mistral AI Tool Function Callbacks to register with the ChatModel. | - |==== NOTE: You can override the common `spring.ai.mistralai.base-url` and `spring.ai.mistralai.api-key` for the `ChatModel` and `EmbeddingModel` implementations. -The `spring.ai.mistralai.chat.base-url` and `spring.ai.mistralai.chat.api-key` properties if set take precedence over the common properties. -This is useful if you want to use different MistralAI accounts for different models and different model endpoints. +The `spring.ai.mistralai.chat.base-url` and `spring.ai.mistralai.chat.api-key` properties, if set, take precedence over the common properties. +This is useful if you want to use different Mistral AI accounts for different models and different model endpoints. 
-TIP: All properties prefixed with `spring.ai.mistralai.chat.options` can be overridden at runtime by adding a request specific <> to the `Prompt` call. +TIP: All properties prefixed with `spring.ai.mistralai.chat.options` can be overridden at runtime by adding request-specific <> to the `Prompt` call. == Runtime Options [[chat-options]] @@ -118,8 +116,8 @@ The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai On start-up, the default options can be configured with the `MistralAiChatModel(api, options)` constructor or the `spring.ai.mistralai.chat.options.*` properties. -At run-time you can override the default options by adding new, request specific, options to the `Prompt` call. -For example to override the default model and temperature for a specific request: +At run-time, you can override the default options by adding new, request-specific options to the `Prompt` call. +For example, to override the default model and temperature for a specific request: [source,java] ---- @@ -133,40 +131,38 @@ ChatResponse response = chatModel.call( )); ---- -TIP: In addition to the model specific link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/main/java/org/springframework/ai/mistralai/MistralAiChatOptions.java[MistralAiChatOptions] you can use a portable https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with the https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptionsBuilder.java[ChatOptionsBuilder#builder()]. 
+TIP: In addition to the model specific link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/main/java/org/springframework/ai/mistralai/MistralAiChatOptions.java[MistralAiChatOptions] you can use a portable https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptionsBuilder.java[ChatOptionsBuilder#builder()]. == Function Calling -You can register custom Java functions with the MistralAiChatModel and have the Mistral AI model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. +You can register custom Java functions with the `MistralAiChatModel` and have the Mistral AI model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. This is a powerful technique to connect the LLM capabilities with external tools and APIs. Read more about xref:api/chat/functions/mistralai-chat-functions.adoc[Mistral AI Function Calling]. - == OpenAI API Compatibility -Mistral is OpenAI API compatible and you can use the xref:api/chat/openai-chat.adoc[Spring AI OpenAI] client to talk to Mistrial. 
-For this you need to set the OpenAI base-url: `spring.ai.openai.chat.base-url=https://api.mistral.ai`, select a Mistral models: `spring.ai.openai.chat.options.model=mistral-small-latest` and set the Api key: `spring.ai.openai.chat.api-key= generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) { return Map.of("generation", chatModel.call(message)); } @@ -195,7 +191,7 @@ public class ChatController { == Manual Configuration -The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/main/java/org/springframework/ai/mistralai/MistralAiChatModel.java[MistralAiChatModel] implements the `ChatModel` and `StreamingChatModel` and uses the <> to connect to the MistralAI service. +The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/main/java/org/springframework/ai/mistralai/MistralAiChatModel.java[MistralAiChatModel] implements the `ChatModel` and `StreamingChatModel` and uses the <> to connect to the Mistral AI service. Add the `spring-ai-mistral-ai` dependency to your project's Maven `pom.xml` file: @@ -239,18 +235,17 @@ Flux response = chatModel.stream( ---- The `MistralAiChatOptions` provides the configuration information for the chat requests. -The `MistralAiChatOptions.Builder` is fluent options builder. +The `MistralAiChatOptions.Builder` is a fluent options-builder. === Low-level MistralAiApi Client [[low-level-api]] The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/main/java/org/springframework/ai/mistralai/api/MistralAiApi.java[MistralAiApi] provides is lightweight Java client for link:https://docs.mistral.ai/api/[Mistral AI API]. 
-Here is a simple snippet how to use the api programmatically: +Here is a simple snippet showing how to use the API programmatically: [source,java] ---- -MistralAiApi mistralAiApi = - new MistralAiApi(System.getenv("MISTRAL_AI_API_KEY")); +MistralAiApi mistralAiApi = new MistralAiApi(System.getenv("MISTRAL_AI_API_KEY")); ChatCompletionMessage chatCompletionMessage = new ChatCompletionMessage("Hello world", Role.USER); @@ -267,7 +262,8 @@ Flux streamResponse = mistralAiApi.chatCompletionStream( Follow the https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/main/java/org/springframework/ai/mistralai/api/MistralAiApi.java[MistralAiApi.java]'s JavaDoc for further information. ==== MistralAiApi Samples -* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/test/java/org/springframework/ai/mistralai/api/MistralAiApiIT.java[MistralAiApiIT.java] test provides some general examples how to use the lightweight library. -* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/test/java/org/springframework/ai/mistralai/api/tool/PaymentStatusFunctionCallingIT.java[PaymentStatusFunctionCallingIT.java] test shows how to use the low-level API to call tool functions. -Based on the link:https://docs.mistral.ai/guides/function-calling/[MistralAI Function Calling] tutorial. +* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/test/java/org/springframework/ai/mistralai/api/MistralAiApiIT.java[MistralAiApiIT.java] tests provide some general examples of how to use the lightweight library. + +* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-mistral-ai/src/test/java/org/springframework/ai/mistralai/api/tool/PaymentStatusFunctionCallingIT.java[PaymentStatusFunctionCallingIT.java] tests show how to use the low-level API to call tool functions. 
+Based on the link:https://docs.mistral.ai/guides/function-calling/[Mistral AI Function Calling] tutorial. diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/ollama-chat.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/ollama-chat.adoc index d307c7a2694..e52c8469f27 100644 --- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/ollama-chat.adoc +++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/ollama-chat.adoc @@ -1,18 +1,17 @@ = Ollama Chat With https://ollama.ai/[Ollama] you can run various Large Language Models (LLMs) locally and generate text from them. -Spring AI supports the Ollama text generation with `OllamaChatModel`. - +Spring AI supports the Ollama text generation capabilities with the `OllamaChatModel` API. TIP: Ollama offers an OpenAI API compatible endpoint as well. -Check the xref:_openai_api_compatibility[OpenAI API compatibility] section to learn how to use the xref:api/chat/openai-chat.adoc[Spring AI OpenAI] to talk to an Ollama server. +Check the xref:_openai_api_compatibility[OpenAI API compatibility] section to learn how to use the xref:api/chat/openai-chat.adoc[Spring AI OpenAI] project to talk to an Ollama server. == Prerequisites You first need to run Ollama on your local machine. Refer to the official Ollama project link:https://github.com/ollama/ollama[README] to get started running models on your local machine. -NOTE: installing `ollama run llama3` will download a 4.7GB model artifact. +NOTE: Running `ollama pull mistral` will download a 4.1GB model artifact. === Add Repositories and BOM @@ -21,10 +20,9 @@ Refer to the xref:getting-started.adoc#repositories[Repositories] section to add To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. 
Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build system. - == Auto-configuration -Spring AI provides Spring Boot auto-configuration for the Ollama Chat Client. +Spring AI provides Spring Boot auto-configuration for the Ollama chat integration. To enable it add the following dependency to your project's Maven `pom.xml` file: [source,xml] @@ -48,7 +46,7 @@ TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Man === Chat Properties -The prefix `spring.ai.ollama` is the property prefix to configure the connection to Ollama +The prefix `spring.ai.ollama` is the property prefix to configure the connection to Ollama. [cols="3,6,1"] |==== @@ -68,11 +66,11 @@ Here are the advanced request parameter for the Ollama chat model: | spring.ai.ollama.chat.enabled | Enable Ollama chat model. | true | spring.ai.ollama.chat.options.model | The name of the https://github.com/ollama/ollama?tab=readme-ov-file#model-library[supported model] to use. | mistral -| spring.ai.ollama.chat.options.format | The format to return a response in. Currently the only accepted value is `json` | - +| spring.ai.ollama.chat.options.format | The format to return a response in. Currently, the only accepted value is `json` | - | spring.ai.ollama.chat.options.keep_alive | Controls how long the model will stay loaded into memory following the request | 5m |==== -The remaining `options` properties are based on the link:https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values[Ollama Valid Parameters and Values] and link:https://github.com/ollama/ollama/blob/main/api/types.go[Ollama Types]. The default values are based on: link:https://github.com/ollama/ollama/blob/b538dc3858014f94b099730a592751a5454cab0a/api/types.go#L364[Ollama type defaults]. 
+The remaining `options` properties are based on the link:https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values[Ollama Valid Parameters and Values] and link:https://github.com/ollama/ollama/blob/main/api/types.go[Ollama Types]. The default values are based on the link:https://github.com/ollama/ollama/blob/b538dc3858014f94b099730a592751a5454cab0a/api/types.go#L364[Ollama Types Defaults]. [cols="3,6,1"] |==== @@ -109,16 +107,16 @@ The remaining `options` properties are based on the link:https://github.com/olla | spring.ai.ollama.chat.options.functions | List of functions, identified by their names, to enable for function calling in a single prompt requests. Functions with those names must exist in the functionCallbacks registry. | - |==== -TIP: All properties prefixed with `spring.ai.ollama.chat.options` can be overridden at runtime by adding a request specific <> to the `Prompt` call. +TIP: All properties prefixed with `spring.ai.ollama.chat.options` can be overridden at runtime by adding request-specific <> to the `Prompt` call. == Runtime Options [[chat-options]] -The https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaOptions.java[OllamaOptions.java] provides model configurations, such as the model to use, the temperature, etc. +The https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaOptions.java[OllamaOptions.java] class provides model configurations, such as the model to use, the temperature, etc. On start-up, the default options can be configured with the `OllamaChatModel(api, options)` constructor or the `spring.ai.ollama.chat.options.*` properties. -At run-time you can override the default options by adding new, request specific, options to the `Prompt` call. 
-For example to override the default model and temperature for a specific request: +At run-time, you can override the default options by adding new, request-specific options to the `Prompt` call. +For example, to override the default model and temperature for a specific request: [source,java] ---- @@ -128,55 +126,52 @@ ChatResponse response = chatModel.call( OllamaOptions.builder() .withModel(OllamaModel.LLAMA3_1) .withTemperature(0.4) - .build(); + .build() )); ---- -TIP: In addition to the model specific link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaOptions.java[OllamaOptions] you can use a portable https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with the https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptionsBuilder.java[ChatOptionsBuilder#builder()]. - +TIP: In addition to the model specific link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/api/OllamaOptions.java[OllamaOptions] you can use a portable https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptionsBuilder.java[ChatOptionsBuilder#builder()]. == Function Calling -You can register custom Java functions with the OllamaChatModel and have the Ollama model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. 
+You can register custom Java functions with the `OllamaChatModel` and have the Ollama model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. This is a powerful technique to connect the LLM capabilities with external tools and APIs. Read more about xref:api/chat/functions/ollama-chat-functions.adoc[Ollama Function Calling]. -TIP: You need Ollama 0.2.8 or newer. +TIP: You need Ollama 0.2.8 or newer to use the function calling capabilities. -NOTE: Currently, the Ollama API (0.2.8) does not support function calling in streaming mode. +NOTE: Currently, the Ollama API (0.3.8) does not support function calling in streaming mode. == Multimodal Multimodality refers to a model's ability to simultaneously understand and process information from various sources, including text, images, audio, and other data formats. -Presently, the https://ollama.com/library/llava[LLaVa] and https://ollama.com/library/bakllava[bakllava] Ollama models offer multimodal support. +Some of the models available in Ollama with multimodality support are https://ollama.com/library/llava[LLaVa] and https://ollama.com/library/bakllava[bakllava] (see the link:https://ollama.com/search?c=vision[full list]). For further details, refer to the link:https://llava-vl.github.io/[LLaVA: Large Language and Vision Assistant]. The Ollama link:https://github.com/ollama/ollama/blob/main/docs/api.md#parameters-1[Message API] provides an "images" parameter to incorporate a list of base64-encoded images with the message. Spring AI’s link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/messages/Message.java[Message] interface facilitates multimodal AI models by introducing the link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/messages/Media.java[Media] type. 
-This type encompasses data and details regarding media attachments in messages, utilizing Spring’s `org.springframework.util.MimeType` and a `java.lang.Object` for the raw media data. +This type encompasses data and details regarding media attachments in messages, utilizing Spring’s `org.springframework.util.MimeType` and an `org.springframework.core.io.Resource` for the raw media data. Below is a straightforward code example excerpted from link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-ollama/src/test/java/org/springframework/ai/ollama/OllamaChatModelMultimodalIT.java[OllamaChatModelMultimodalIT.java], illustrating the fusion of user text with an image. [source,java] ---- -byte[] imageData = new ClassPathResource("/multimodal.test.png").getContentAsByteArray(); +var imageResource = new ClassPathResource("/multimodal.test.png"); var userMessage = new UserMessage("Explain what do you see on this picture?", - List.of(new Media(MimeTypeUtils.IMAGE_PNG, imageData))); - -ChatResponse response = chatModel.call( - new Prompt(List.of(userMessage), OllamaOptions.create().withModel("llava"))); + new Media(MimeTypeUtils.IMAGE_PNG, imageResource)); -logger.info(response.getResult().getOutput().getContent()); +ChatResponse response = chatModel.call(new Prompt(userMessage, + OllamaOptions.builder().withModel(OllamaModel.LLAVA).build())); ---- -It takes as an input the `multimodal.test.png` image: +The example shows a model taking as an input the `multimodal.test.png` image: image::multimodal.test.png[Multimodal Test Image, 200, 200, align="left"] -along with the text message "Explain what do you see on this picture?", and generates a response like this: +along with the text message "Explain what do you see on this picture?", and generating a response like this: ---- The image shows a small metal basket filled with ripe bananas and red apples. 
The basket is placed on a surface, @@ -188,14 +183,13 @@ where fruits are being displayed, possibly for convenience or aesthetic purposes == OpenAI API Compatibility -Ollama is OpenAI API compatible and you can use the xref:api/chat/openai-chat.adoc[Spring AI OpenAI] client to talk to Ollama and use tools. -For this you need to set the OpenAI base-url: `spring.ai.openai.chat.base-url=http://localhost:11434` and select one of the provided Ollama models: `spring.ai.openai.chat.options.model=mistral`. +Ollama is OpenAI API-compatible and you can use the xref:api/chat/openai-chat.adoc[Spring AI OpenAI] client to talk to Ollama and use tools. +For this, you need to configure the OpenAI base URL to your Ollama instance: `spring.ai.openai.chat.base-url=http://localhost:11434` and select one of the provided Ollama models: `spring.ai.openai.chat.options.model=mistral`. image::spring-ai-ollama-over-openai.jpg[Ollama OpenAI API compatibility, 800, 600, align="center"] Check the link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/chat/OllamaWithOpenAiChatModelIT.java[OllamaWithOpenAiChatModelIT.java] tests for examples of using Ollama over Spring AI OpenAI. - == Sample Controller https://start.spring.io/[Create] a new Spring Boot project and add the `spring-ai-ollama-spring-boot-starter` to your pom (or gradle) dependencies. @@ -209,10 +203,10 @@ spring.ai.ollama.chat.options.model=mistral spring.ai.ollama.chat.options.temperature=0.7 ---- -TIP: replace the `base-url` with your Ollama server URL. +TIP: Replace the `base-url` with your Ollama server URL. -This will create a `OllamaChatModel` implementation that you can inject into your class. -Here is an example of a simple `@Controller` class that uses the chat model for text generations. +This will create an `OllamaChatModel` implementation that you can inject into your classes. 
+Here is an example of a simple `@RestController` class that uses the chat model for text generations. [source,java] ---- @@ -227,7 +221,7 @@ public class ChatController { } @GetMapping("/ai/generate") - public Map generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) { + public Map generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) { return Map.of("generation", chatModel.call(message)); } @@ -245,7 +239,7 @@ public class ChatController { If you don't want to use the Spring Boot auto-configuration, you can manually configure the `OllamaChatModel` in your application. The https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-ollama/src/main/java/org/springframework/ai/ollama/OllamaChatModel.java[OllamaChatModel] implements the `ChatModel` and `StreamingChatModel` and uses the <> to connect to the Ollama service. -To use it add the `spring-ai-ollama` dependency to your project's Maven `pom.xml` file: +To use it, add the `spring-ai-ollama` dependency to your project's Maven `pom.xml` file: [source,xml] ---- @@ -269,7 +263,7 @@ TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Man TIP: The `spring-ai-ollama` dependency provides access also to the `OllamaEmbeddingModel`. For more information about the `OllamaEmbeddingModel` refer to the link:../embeddings/ollama-embeddings.html[Ollama Embedding Model] section. -Next, create an `OllamaChatModel` instance and use it to text generations requests: +Next, create an `OllamaChatModel` instance and use it to send requests for text generation: [source,java] ---- @@ -298,14 +292,13 @@ The following class diagram illustrates the `OllamaApi` chat interfaces and buil image::ollama-chat-completion-api.jpg[OllamaApi Chat Completion API Diagram, 800, 600] -Here is a simple snippet showing how to use the API programmatically: +NOTE: The `OllamaApi` is a low-level API and is not recommended for direct use. 
Use the `OllamaChatModel` instead. -NOTE: The `OllamaApi` is low level api and is not recommended for direct use. Use the `OllamaChatModel` instead. +Here is a simple snippet showing how to use the API programmatically: [source,java] ---- -OllamaApi ollamaApi = - new OllamaApi("YOUR_HOST:YOUR_PORT"); +OllamaApi ollamaApi = new OllamaApi("YOUR_HOST:YOUR_PORT"); // Sync request var request = ChatRequest.builder("orca-mini") diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/openai-chat.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/openai-chat.adoc index cf13a6a358b..dcf2c8fcc9e 100644 --- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/openai-chat.adoc +++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/openai-chat.adoc @@ -1,6 +1,6 @@ = OpenAI Chat -Spring AI supports ChatGPT, the AI language model by OpenAI. ChatGPT has been instrumental in sparking interest in AI-driven text generation, thanks to its creation of industry-leading text generation models and embeddings. +Spring AI supports the various AI language models from OpenAI, the company behind ChatGPT, which has been instrumental in sparking interest in AI-driven text generation thanks to its creation of industry-leading text generation models and embeddings. == Prerequisites @@ -21,8 +21,6 @@ Refer to the xref:getting-started.adoc#repositories[Repositories] section to add To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build system. - - == Auto-configuration Spring AI provides Spring Boot auto-configuration for the OpenAI Chat Client. 
@@ -76,11 +74,11 @@ The prefix `spring.ai.openai` is used as the property prefix that lets you conne | spring.ai.openai.base-url | The URL to connect to | https://api.openai.com | spring.ai.openai.api-key | The API Key | - -| spring.ai.openai.organization-id | Optionally you can specify which organization used for an API request. | - -| spring.ai.openai.project-id | Optionally, you can specify which project is used for an API request. | - +| spring.ai.openai.organization-id | Optionally, you can specify which organization to use for an API request. | - +| spring.ai.openai.project-id | Optionally, you can specify which project to use for an API request. | - |==== -TIP: For users that belong to multiple organizations (or are accessing their projects through their legacy user API key), optionally, you can specify which organization and project is used for an API request. +TIP: For users that belong to multiple organizations (or are accessing their projects through their legacy user API key), you can optionally specify which organization and project is used for an API request. Usage from these API requests will count as usage for the specified organization and project. ==== Configuration Properties @@ -92,17 +90,17 @@ The prefix `spring.ai.openai.chat` is the property prefix that lets you configur | Property | Description | Default | spring.ai.openai.chat.enabled | Enable OpenAI chat model. | true -| spring.ai.openai.chat.base-url | Optional overrides the spring.ai.openai.base-url to provide chat specific url | - -| spring.ai.openai.chat.completions-path | The path to append to the base-url | `/v1/chat/completions` -| spring.ai.openai.chat.api-key | Optional overrides the spring.ai.openai.api-key to provide chat specific api-key | - -| spring.ai.openai.chat.organization-id | Optionally you can specify which organization used for an API request. | - -| spring.ai.openai.chat.project-id | Optionally, you can specify which project is used for an API request. 
| - -| spring.ai.openai.chat.options.model | Name of the the OpenAI Chat model to use. You can select between models such as: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-3.5-turbo` ... See the https://platform.openai.com/docs/models[models] page for more information. | `gpt-4o` -| spring.ai.openai.chat.options.temperature | The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify temperature and top_p for the same completions request as the interaction of these two settings is difficult to predict. | 0.8 +| spring.ai.openai.chat.base-url | Optional override for the `spring.ai.openai.base-url` property to provide a chat-specific URL. | - +| spring.ai.openai.chat.completions-path | The path to append to the base URL. | `/v1/chat/completions` +| spring.ai.openai.chat.api-key | Optional override for the `spring.ai.openai.api-key` to provide a chat-specific API Key. | - +| spring.ai.openai.chat.organization-id | Optionally, you can specify which organization to use for an API request. | - +| spring.ai.openai.chat.project-id | Optionally, you can specify which project to use for an API request. | - +| spring.ai.openai.chat.options.model | Name of the OpenAI chat model to use. You can select between models such as: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-3.5-turbo`, and more. See the https://platform.openai.com/docs/models[models] page for more information. | `gpt-4o` +| spring.ai.openai.chat.options.temperature | The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random while lower values will make results more focused and deterministic. It is not recommended to modify `temperature` and `top_p` for the same completions request as the interaction of these two settings is difficult to predict. 
| 0.8 | spring.ai.openai.chat.options.frequencyPenalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | 0.0f | spring.ai.openai.chat.options.logitBias | Modify the likelihood of specified tokens appearing in the completion. | - | spring.ai.openai.chat.options.maxTokens | The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. | - -| spring.ai.openai.chat.options.n | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. | 1 +| spring.ai.openai.chat.options.n | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep `n` as 1 to minimize costs. | 1 | spring.ai.openai.chat.options.presencePenalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | - | spring.ai.openai.chat.options.responseFormat.type | Compatible with `GPT-4o`, `GPT-4o mini`, `GPT-4 Turbo` and all `GPT-3.5 Turbo` models newer than `gpt-3.5-turbo-1106`. The `JSON_OBJECT` type enables JSON mode, which guarantees the message the model generates is valid JSON. The `JSON_SCHEMA` type enables link:https://platform.openai.com/docs/guides/structured-outputs[Structured Outputs] which guarantees the model will match your supplied JSON schema. The JSON_SCHEMA type requires setting the `responseFormat.schema` property as well. 
| - @@ -111,30 +109,30 @@ The `JSON_SCHEMA` type enables link:https://platform.openai.com/docs/guides/stru | spring.ai.openai.chat.options.responseFormat.strict | Response format JSON schema adherence strictness. Applicable only for `responseFormat.type=JSON_SCHEMA` | - | spring.ai.openai.chat.options.seed | This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. | - | spring.ai.openai.chat.options.stop | Up to 4 sequences where the API will stop generating further tokens. | - -| spring.ai.openai.chat.options.topP | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | - +| spring.ai.openai.chat.options.topP | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both. | - | spring.ai.openai.chat.options.tools | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. | - -| spring.ai.openai.chat.options.toolChoice | Controls which (if any) function is called by the model. none means the model will not call a function and instead generates a message. auto means the model can pick between generating a message or calling a function. Specifying a particular function via {"type: "function", "function": {"name": "my_function"}} forces the model to call that function. none is the default when no functions are present. 
auto is the default if functions are present. | - +| spring.ai.openai.chat.options.toolChoice | Controls which (if any) function is called by the model. `none` means the model will not call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a function. Specifying a particular function via `{"type: "function", "function": {"name": "my_function"}}` forces the model to call that function. `none` is the default when no functions are present. `auto` is the default if functions are present. | - | spring.ai.openai.chat.options.user | A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. | - -| spring.ai.openai.chat.options.functions | List of functions, identified by their names, to enable for function calling in a single prompt requests. Functions with those names must exist in the functionCallbacks registry. | - +| spring.ai.openai.chat.options.functions | List of functions, identified by their names, to enable for function calling in a single prompt requests. Functions with those names must exist in the `functionCallbacks` registry. | - | spring.ai.openai.chat.options.stream-usage | (For streaming only) Set to add an additional chunk with token usage statistics for the entire request. The `choices` field for this chunk is an empty array and all other chunks will also include a usage field, but with a null value. | false | spring.ai.openai.chat.options.parallel-tool-calls | Whether to enable link:https://platform.openai.com/docs/guides/function-calling/parallel-function-calling[parallel function calling] during tool use. | true -| spring.ai.openai.chat.options.http-headers | Optional HTTP headers to be added to the chat completion request. To override the api-key you need to use a `Authorization` header key and you have to prefix the key value with the `Bearer ` prefix. 
| - +| spring.ai.openai.chat.options.http-headers | Optional HTTP headers to be added to the chat completion request. To override the `api-key` you need to use an `Authorization` header key, and you have to prefix the key value with the `Bearer ` prefix. | - |==== NOTE: You can override the common `spring.ai.openai.base-url` and `spring.ai.openai.api-key` for the `ChatModel` and `EmbeddingModel` implementations. -The `spring.ai.openai.chat.base-url` and `spring.ai.openai.chat.api-key` properties if set take precedence over the common properties. +The `spring.ai.openai.chat.base-url` and `spring.ai.openai.chat.api-key` properties, if set, take precedence over the common properties. This is useful if you want to use different OpenAI accounts for different models and different model endpoints. -TIP: All properties prefixed with `spring.ai.openai.chat.options` can be overridden at runtime by adding a request specific <<chat-options>> to the `Prompt` call. +TIP: All properties prefixed with `spring.ai.openai.chat.options` can be overridden at runtime by adding request-specific <<chat-options>> to the `Prompt` call. == Runtime Options [[chat-options]] -The https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiChatOptions.java[OpenAiChatOptions.java] provides model configurations, such as the model to use, the temperature, the frequency penalty, etc. +The https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiChatOptions.java[OpenAiChatOptions.java] class provides model configurations such as the model to use, the temperature, the frequency penalty, etc. On start-up, the default options can be configured with the `OpenAiChatModel(api, options)` constructor or the `spring.ai.openai.chat.options.*` properties. -At run-time you can override the default options by adding new, request specific, options to the `Prompt` call.
-For example to override the default model and temperature for a specific request: +At run-time, you can override the default options by adding new, request-specific options to the `Prompt` call. +For example, to override the default model and temperature for a specific request: [source,java] ---- @@ -148,58 +146,58 @@ ChatResponse response = chatModel.call( )); ---- -TIP: In addition to the model specific https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiChatOptions.java[OpenAiChatOptions] you can use a portable https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with the https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptionsBuilder.java[ChatOptionsBuilder#builder()]. +TIP: In addition to the model specific https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/OpenAiChatOptions.java[OpenAiChatOptions] you can use a portable https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptions.java[ChatOptions] instance, created with https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/prompt/ChatOptionsBuilder.java[ChatOptionsBuilder#builder()]. == Function Calling -You can register custom Java functions with the OpenAiChatModel and have the OpenAI model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. +You can register custom Java functions with the `OpenAiChatModel` and have the OpenAI model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. 
This is a powerful technique to connect the LLM capabilities with external tools and APIs. Read more about xref:api/chat/functions/openai-chat-functions.adoc[OpenAI Function Calling]. == Multimodal Multimodality refers to a model's ability to simultaneously understand and process information from various sources, including text, images, audio, and other data formats. -Presently, the OpenAI `gpt-4o` and `gpt-4o-mini` models offers multimodal support. +OpenAI models that offer multimodal support include `gpt-4`, `gpt-4o`, and `gpt-4o-mini`. Refer to the link:https://platform.openai.com/docs/guides/vision[Vision] guide for more information. The OpenAI link:https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages[User Message API] can incorporate a list of base64-encoded images or image urls with the message. Spring AI’s link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/messages/Message.java[Message] interface facilitates multimodal AI models by introducing the link:https://github.com/spring-projects/spring-ai/blob/main/spring-ai-core/src/main/java/org/springframework/ai/chat/messages/Media.java[Media] type. -This type encompasses data and details regarding media attachments in messages, utilizing Spring’s `org.springframework.util.MimeType` and a `java.lang.Object` for the raw media data. +This type encompasses data and details regarding media attachments in messages, utilizing Spring’s `org.springframework.util.MimeType` and a `org.springframework.core.io.Resource` for the raw media data. -Below is a code example excerpted from link:https://github.com/spring-projects/spring-ai/blob/c9a3e66f90187ce7eae7eb78c462ec622685de6c/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/chat/OpenAiChatModelIT.java#L293[OpenAiChatModelIT.java], illustrating the fusion of user text with an image using the the `GPT_4_O` model. 
+Below is a code example excerpted from link:https://github.com/spring-projects/spring-ai/blob/c9a3e66f90187ce7eae7eb78c462ec622685de6c/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/chat/OpenAiChatModelIT.java#L293[OpenAiChatModelIT.java], illustrating the fusion of user text with an image using the `gpt-4o` model. [source,java] ---- -byte[] imageData = new ClassPathResource("/multimodal.test.png").getContentAsByteArray(); +var imageResource = new ClassPathResource("/multimodal.test.png"); var userMessage = new UserMessage("Explain what do you see on this picture?", - List.of(new Media(MimeTypeUtils.IMAGE_PNG, imageData))); + new Media(MimeTypeUtils.IMAGE_PNG, imageResource)); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), +ChatResponse response = chatModel.call(new Prompt(userMessage, OpenAiChatOptions.builder().withModel(OpenAiApi.ChatModel.GPT_4_O.getValue()).build())); ---- TIP: GPT_4_VISION_PREVIEW will continue to be available only to existing users of this model starting June 17, 2024. If you are not an existing user, please use the GPT_4_O or GPT_4_TURBO models. More details https://platform.openai.com/docs/deprecations/2024-06-06-gpt-4-32k-and-vision-preview-models[here] -or the image URL equivalent using the `GPT_4_O` model : +or the image URL equivalent using the `gpt-4o` model: [source,java] ---- var userMessage = new UserMessage("Explain what do you see on this picture?", - List.of(new Media(MimeTypeUtils.IMAGE_PNG, - "https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/_images/multimodal.test.png"))); + new Media(MimeTypeUtils.IMAGE_PNG, + "https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/_images/multimodal.test.png")); -ChatResponse response = chatModel.call(new Prompt(List.of(userMessage), +ChatResponse response = chatModel.call(new Prompt(userMessage, OpenAiChatOptions.builder().withModel(OpenAiApi.ChatModel.GPT_4_O.getValue()).build())); ---- -TIP: you can pass multiple images as well. 
+TIP: You can pass multiple images as well. -It takes as an input the `multimodal.test.png` image: +The example shows a model taking as an input the `multimodal.test.png` image: image::multimodal.test.png[Multimodal Test Image, 200, 200, align="left"] -along with the text message "Explain what do you see on this picture?", and generates a response like this: +along with the text message "Explain what do you see on this picture?", and generating a response like this: ---- This is an image of a fruit bowl with a simple design. The bowl is made of metal with curved wire edges that @@ -314,26 +312,25 @@ spring.ai.openai.chat.options.response-format.type=JSON_SCHEMA spring.ai.openai.chat.options.response-format.name=MySchemaName spring.ai.openai.chat.options.response-format.schema={"type":"object","properties":{"steps":{"type":"array","items":{"type":"object","properties":{"explanation":{"type":"string"},"output":{"type":"string"}},"required":["explanation","output"],"additionalProperties":false}},"final_answer":{"type":"string"}},"required":["steps","final_answer"],"additionalProperties":false} spring.ai.openai.chat.options.response-format.strict=true - ---- == Sample Controller https://start.spring.io/[Create] a new Spring Boot project and add the `spring-ai-openai-spring-boot-starter` to your pom (or gradle) dependencies. -Add a `application.properties` file, under the `src/main/resources` directory, to enable and configure the OpenAi chat model: +Add an `application.properties` file under the `src/main/resources` directory to enable and configure the OpenAi chat model: [source,application.properties] ---- spring.ai.openai.api-key=YOUR_API_KEY -spring.ai.openai.chat.options.model=gpt-3.5-turbo +spring.ai.openai.chat.options.model=gpt-4o spring.ai.openai.chat.options.temperature=0.7 ---- -TIP: replace the `api-key` with your OpenAI credentials. +TIP: Replace the `api-key` with your OpenAI credentials. 
-This will create a `OpenAiChatModel` implementation that you can inject into your class. -Here is an example of a simple `@Controller` class that uses the chat model for text generations. +This will create an `OpenAiChatModel` implementation that you can inject into your classes. +Here is an example of a simple `@RestController` class that uses the chat model for text generations. [source,java] ---- @@ -348,7 +345,7 @@ public class ChatController { } @GetMapping("/ai/generate") - public Map generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) { + public Map generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) { return Map.of("generation", chatModel.call(message)); } @@ -385,7 +382,7 @@ dependencies { TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file. -Next, create a `OpenAiChatModel` and use it for text generations: +Next, create an `OpenAiChatModel` and use it for text generations: [source,java] ---- @@ -394,10 +391,9 @@ var openAiChatOptions = OpenAiChatOptions.builder() .withModel("gpt-3.5-turbo") .withTemperature(0.4) .withMaxTokens(200) - .build(); + .build(); var chatModel = new OpenAiChatModel(openAiApi, openAiChatOptions); - ChatResponse response = chatModel.call( new Prompt("Generate the names of 5 famous pirates.")); @@ -407,7 +403,7 @@ Flux response = chatModel.stream( ---- The `OpenAiChatOptions` provides the configuration information for the chat requests. -The `OpenAiChatOptions.Builder` is fluent options builder. +The `OpenAiChatOptions.Builder` is a fluent options-builder. 
== Low-level OpenAiApi Client [[low-level-api]] @@ -417,7 +413,7 @@ Following class diagram illustrates the `OpenAiApi` chat interfaces and building image::openai-chat-api.jpg[OpenAiApi Chat API Diagram, width=1000, align="center"] -Here is a simple snippet how to use the api programmatically: +Here is a simple snippet showing how to use the API programmatically: [source,java] ---- @@ -439,8 +435,8 @@ Flux streamResponse = openAiApi.chatCompletionStream( Follow the https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/main/java/org/springframework/ai/openai/api/OpenAiApi.java[OpenAiApi.java]'s JavaDoc for further information. === Low-level API Examples -* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/api/OpenAiApiIT.java[OpenAiApiIT.java] test provides some general examples how to use the lightweight library. -* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/api/tool/OpenAiApiToolFunctionCallIT.java[OpenAiApiToolFunctionCallIT.java] test shows how to use the low-level API to call tool functions. -Based on the link:https://platform.openai.com/docs/guides/function-calling/parallel-function-calling[OpenAI Function Calling] tutorial. +* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/api/OpenAiApiIT.java[OpenAiApiIT.java] tests provide some general examples of how to use the lightweight library. +* The link:https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-openai/src/test/java/org/springframework/ai/openai/api/tool/OpenAiApiToolFunctionCallIT.java[OpenAiApiToolFunctionCallIT.java] tests show how to use the low-level API to call tool functions. 
+Based on the link:https://platform.openai.com/docs/guides/function-calling/parallel-function-calling[OpenAI Function Calling] tutorial. diff --git a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatclient.adoc b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatclient.adoc index 44f1fa94a77..bc76a776a43 100644 --- a/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatclient.adoc +++ b/spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chatclient.adoc @@ -2,7 +2,7 @@ = Chat Client API The `ChatClient` offers a fluent API for communicating with an AI Model. -It supports both a synchronous and reactive programming model. +It supports both a synchronous and streaming programming model. The fluent API has methods for building up the constituent parts of a xref:api/prompt.adoc#_prompt[Prompt] that is passed to the AI model as input. The `Prompt` contains the instructional text to guide the AI model's output and behavior. From the API point of view, prompts consist of a collection of messages. @@ -21,7 +21,7 @@ You can obtain an autoconfigured `ChatClient.Builder` instance for any xref:api/ === Using an autoconfigured ChatClient.Builder In the most simple use case, Spring AI provides Spring Boot autoconfiguration, creating a prototype `ChatClient.Builder` bean for you to inject into your class. -Here is a simple example of retrieving a String response to a simple user request. +Here is a simple example of retrieving a `String` response to a simple user request. [source,java] ---- @@ -45,13 +45,13 @@ class MyController { ---- In this simple example, the user input sets the contents of the user message. -The call method sends a request to the AI model, and the content method returns the AI model's response as a String. +The `call()` method sends a request to the AI model, and the `content()` method returns the AI model's response as a `String`. 
=== Create a ChatClient programmatically You can disable the `ChatClient.Builder` autoconfiguration by setting the property `spring.ai.chat.client.enabled=false`. This is useful if multiple chat models are used together. -Then create a `ChatClient.Builder` instance for for every `ChatModel` programmatically: +Then, create a `ChatClient.Builder` instance programmatically for every `ChatModel` you need: [source,java] ---- @@ -66,11 +66,11 @@ ChatClient chatClient = ChatClient.create(myChatModel); == ChatClient Responses -The ChatClient API offers several ways to format the response from the AI Model. +The `ChatClient` API offers several ways to format the response from the AI Model. === Returning a ChatResponse -The response from the AI model is a rich structure defined by the type xref:api/chatmodel.adoc#ChatResponse[ChatResponse]. +The response from the AI model is a rich structure defined by the type `xref:api/chatmodel.adoc#ChatResponse[ChatResponse]`. It includes metadata about how the response was generated and can also contain multiple responses, known as xref:api/chatmodel.adoc#Generation[Generation]s, each with its own metadata. The metadata includes the number of tokens (each token is approximately 3/4 of a word) used to create the response. This information is important because hosted AI models charge based on the number of tokens used per request. @@ -88,17 +88,16 @@ ChatResponse chatResponse = chatClient.prompt() === Returning an Entity You often want to return an entity class that is mapped from the returned `String`. -The `entity` method provides this functionality. +The `entity()` method provides this functionality. 
For example, given the Java record: [source,java] ---- -record ActorFilms(String actor, List<String> movies) { -} +record ActorFilms(String actor, List<String> movies) {} ---- -You can easily map the AI model's output to this record using the `entity` method, as shown below: +You can easily map the AI model's output to this record using the `entity()` method, as shown below: [source,java] ---- @@ -115,13 +114,12 @@ There is also an overloaded `entity` method with the signature `entity(Parameter List<ActorFilms> actorFilms = chatClient.prompt() .user("Generate the filmography of 5 movies for Tom Hanks and Bill Murray.") .call() - .entity(new ParameterizedTypeReference<List<ActorFilms>>() { - }); + .entity(new ParameterizedTypeReference<List<ActorFilms>>() {}); ---- === Streaming Responses -The `stream` lets you get an asynchronous response as shown below +The `stream()` method lets you get an asynchronous response as shown below: [source,java] ---- @@ -134,59 +132,58 @@ Flux<String> output = chatClient.prompt() You can also stream the `ChatResponse` using the method `Flux<ChatResponse> chatResponse()`. -In the 1.0.0 M2 we will offer a convenience method that will let you return an Java entity with the reactive `stream()` method. +In the future, we will offer a convenience method that will let you return a Java entity with the reactive `stream()` method. In the meantime, you should use the xref:api/structured-output-converter.adoc#StructuredOutputConverter[Structured Output Converter] to convert the aggregated response explicity as shown below. This also demonstrates the use of parameters in the fluent API that will be discussed in more detail in a later section of the documentation. [source,java] ---- - var converter = new BeanOutputConverter<>(new ParameterizedTypeReference<List<ActorFilms>>() { - }); +var converter = new BeanOutputConverter<>(new ParameterizedTypeReference<List<ActorFilms>>() {}); - Flux<String> flux = this.chatClient.prompt() - .user(u -> u.text(""" - Generate the filmography for a random actor.
- {format} - """) - .param("format", converter.getFormat())) - .stream() - .content(); +Flux<String> flux = this.chatClient.prompt() + .user(u -> u.text(""" + Generate the filmography for a random actor. + {format} + """) + .param("format", converter.getFormat())) + .stream() + .content(); - String content = flux.collectList().block().stream().collect(Collectors.joining()); +String content = flux.collectList().block().stream().collect(Collectors.joining()); - List<ActorFilms> actorFilms = converter.convert(content); +List<ActorFilms> actorFilms = converter.convert(content); ---- == call() return values -After specifying the `call` method on `ChatClient` there are a few different options for the response type. +After specifying the `call()` method on `ChatClient`, there are a few different options for the response type. * `String content()`: returns the String content of the response * `ChatResponse chatResponse()`: returns the `ChatResponse` object that contains multiple generations and also metadata about the response, for example how many token were used to create the response. -* `entity` to return a Java type -** entity(ParameterizedTypeReference<T> type): used to return a Collection of entity types. -** entity(Class<T> type): used to return a specific entity type. -** entity(StructuredOutputConverter<T> structuredOutputConverter): used to specify an instance of a `StructuredOutputConverter` to convert a `String` to an entity type. - -You can also invoke the `stream` method instead of `call` and +* `entity()` to return a Java type +** `entity(ParameterizedTypeReference<T> type)`: used to return a `Collection` of entity types. +** `entity(Class<T> type)`: used to return a specific entity type. +** `entity(StructuredOutputConverter<T> structuredOutputConverter)`: used to specify an instance of a `StructuredOutputConverter` to convert a `String` to an entity type. +You can also invoke the `stream()` method instead of `call()`.
== stream() return values -After specifying the `stream` method on `ChatClient`, there are a few options for the response type: +After specifying the `stream()` method on `ChatClient`, there are a few options for the response type: -* `Flux<String> content()`: Returns a Flux of the string being generated by the AI model. -* `Flux<ChatResponse> chatResponse()`: Returns a Flux of the `ChatResponse` object, which contains additional metadata about the response. +* `Flux<String> content()`: Returns a `Flux` of the string being generated by the AI model. +* `Flux<ChatResponse> chatResponse()`: Returns a `Flux` of the `ChatResponse` object, which contains additional metadata about the response. == Using Defaults -Creating a ChatClient with default system text in an `@Configuration` class simplifies runtime code. -By setting defaults, you only need to specify user text when calling `ChatClient`, eliminating the need to set system text for each request in your runtime code path. +Creating a `ChatClient` with a default system text in an `@Configuration` class simplifies runtime code. +By setting defaults, you only need to specify the user text when calling `ChatClient`, eliminating the need to set a system text for each request in your runtime code path. === Default System Text In the following example, we will configure the system text to always reply in a pirate's voice. -To avoid repeating the system text in runtime code, we will create a `ChatClient` instance in an `@Configuration` class. +To avoid repeating the system text in runtime code, we will create a `ChatClient` instance in a `@Configuration` class.
+ [source,java] ---- @Configuration @@ -201,7 +198,7 @@ class Config { } ---- -and an `@RestController` to invoke it +and a `@RestController` to invoke it: [source,java] ---- @@ -221,7 +218,7 @@ class AIController { } ---- -invoking it via curl gives +When calling the application endpoint via curl, the result is: [source,bash] ---- @@ -251,24 +248,26 @@ class Config { ---- @RestController class AIController { - private final ChatClient chatClient + private final ChatClient chatClient; + AIController(ChatClient chatClient) { this.chatClient = chatClient; } + @GetMapping("/ai") Map completion(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message, String voice) { - return Map.of( - "completion", + return Map.of("completion", chatClient.prompt() .system(sp -> sp.param("voice", voice)) .user(message) .call() .content()); } + } ---- -The response is +When calling the application endpoint via httpie, the result is: [source.bash] ---- @@ -280,7 +279,7 @@ http localhost:8080/ai voice=='Robert DeNiro' === Other defaults -At the `ChatClient.Builder` level, you can specify the default prompt. +At the `ChatClient.Builder` level, you can specify the default prompt configuration. * `defaultOptions(ChatOptions chatOptions)`: Pass in either portable options defined in the `ChatOptions` class or model-specific options such as those in `OpenAiChatOptions`. For more information on model-specific `ChatOptions` implementations, refer to the JavaDocs. @@ -290,7 +289,6 @@ At the `ChatClient.Builder` level, you can specify the default prompt. * `defaultUser(String text)`, `defaultUser(Resource text)`, `defaultUser(Consumer userSpecConsumer)`: These methods let you define the user text. The `Consumer` allows you to use a lambda to specify the user text and any default parameters. - * `defaultAdvisors(RequestResponseAdvisor... advisor)`: Advisors allow modification of the data used to create the `Prompt`. 
The `QuestionAnswerAdvisor` implementation enables the pattern of `Retrieval Augmented Generation` by appending the prompt with context information related to the user text. * `defaultAdvisors(Consumer advisorSpecConsumer)`: This method allows you to define a `Consumer` to configure multiple advisors using the `AdvisorSpec`. Advisors can modify the data used to create the final `Prompt`. The `Consumer` lets you specify a lambda to add advisors, such as `QuestionAnswerAdvisor`, which supports `Retrieval Augmented Generation` by appending the prompt with relevant context information based on the user text. @@ -302,15 +300,14 @@ You can override these defaults at runtime using the corresponding methods witho * `function(String name, String description, java.util.function.Function function)` -* `functions(String... functionNames) +* `functions(String... functionNames)` -* `user(String text)` , `user(Resource text)`, `user(Consumer userSpecConsumer)` +* `user(String text)`, `user(Resource text)`, `user(Consumer userSpecConsumer)` * `advisors(RequestResponseAdvisor... advisor)` * `advisors(Consumer advisorSpecConsumer)` - == Advisors A common pattern when calling an AI model with user text is to append or augment the prompt with contextual data. @@ -330,7 +327,6 @@ The response from the vector database is appended to the user text to provide co Assuming you have already loaded data into a `VectorStore`, you can perform Retrieval Augmented Generation (RAG) by providing an instance of `QuestionAnswerAdvisor` to the `ChatClient`. - [source,java] ---- ChatResponse response = ChatClient.builder(chatModel) @@ -342,7 +338,7 @@ ChatResponse response = ChatClient.builder(chatModel) ---- In this example, the `SearchRequest.defaults()` will perform a similarity search over all documents in the Vector Database. -To restrict the types of documents that are searched, the `SearchRequest` takes a SQL like filter expression that is portable across all `VectorStores`. 
+To restrict the types of documents that are searched, the `SearchRequest` takes an SQL like filter expression that is portable across all `VectorStores`. ==== Dynamic Filter Expressions @@ -368,9 +364,10 @@ The `FILTER_EXPRESSION` parameter allows you to dynamically filter the search re The interface `ChatMemory` represents a storage for chat conversation history. It provides methods to add messages to a conversation, retrieve messages from a conversation, and clear the conversation history. -There are two implementations `InMemoryChatMemory` and `CassandraChatMemory` that provides storage for chat conversation history, in-memory and persisted with `time-to-live` correspondingly. +There are currently two implementations, `InMemoryChatMemory` and `CassandraChatMemory`, that provide storage for chat conversation history, in-memory and persisted with `time-to-live`, correspondingly. + +To create a `CassandraChatMemory` with `time-to-live`: -To create a `CassandraChatMemory` with `time-to-live` [source,java] ---- CassandraChatMemory.create(CassandraChatMemoryConfig.builder().withTimeToLive(Duration.ofDays(1)).build()); @@ -382,11 +379,10 @@ The following advisor implementations use the `ChatMemory` interface to advice t * `PromptChatMemoryAdvisor` : Memory is retrieved and added into the prompt's system text. * `VectorStoreChatMemoryAdvisor` : The constructor `VectorStoreChatMemoryAdvisor(VectorStore vectorStore, String defaultConversationId, int chatHistoryWindowSize)` lets you specify the VectorStore to retrieve the chat history from, the unique conversation ID, the size of the chat history to be retrieved in token size. -A sample `@Service` implementation that uses several advisors is shown below +A sample `@Service` implementation that uses several advisors is shown below. 
[source,java] ---- - import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY; import static org.springframework.ai.chat.client.advisor.AbstractChatMemoryAdvisor.CHAT_MEMORY_RETRIEVE_SIZE_KEY; @@ -397,7 +393,7 @@ public class CustomerSupportAssistant { public CustomerSupportAssistant(ChatClient.Builder builder, VectorStore vectorStore, ChatMemory chatMemory) { - this.chatClient = builder + this.chatClient = builder .defaultSystem(""" You are a customer chat support agent of an airline named "Funnair".", Respond in a friendly, helpful, and joyful manner. @@ -412,30 +408,33 @@ public class CustomerSupportAssistant { .defaultAdvisors( new PromptChatMemoryAdvisor(chatMemory), // new MessageChatMemoryAdvisor(chatMemory), // CHAT MEMORY - new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()), - new LoggingAdvisor()) // RAG + new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()), // RAG + new LoggingAdvisor()) .defaultFunctions("getBookingDetails", "changeBooking", "cancelBooking") // FUNCTION CALLING .build(); -} + } -public Flux chat(String chatId, String userMessageContent) { + public Flux chat(String chatId, String userMessageContent) { - return this.chatClient.prompt() - .user(userMessageContent) - .advisors(a -> a - .param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId) - .param(CHAT_MEMORY_RETRIEVE_SIZE_KEY, 100)) - .stream().content(); + return this.chatClient.prompt() + .user(userMessageContent) + .advisors(a -> a + .param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId) + .param(CHAT_MEMORY_RETRIEVE_SIZE_KEY, 100)) + .stream().content(); } + } ---- === Logging -The `SimpleLoggerAdvisor` is an advisor that logs the `request` and `response` data of the ChatClient. +The `SimpleLoggerAdvisor` is an advisor that logs the `request` and `response` data of the `ChatClient`. This can be useful for debugging and monitoring your AI interactions. 
-To enable logging, add the `SimpleLoggerAdvisor` to the advisor chain when creating your ChatClient. +TIP: Spring AI supports observability for LLM and vector store interactions. Refer to the xref:observabilty/index.adoc[Observability] guide for more information. + +To enable logging, add the `SimpleLoggerAdvisor` to the advisor chain when creating your ChatClient. It's recommended to add it toward the end of the chain: [source,java] @@ -455,8 +454,7 @@ logging.level.org.springframework.ai.chat.client.advisor=DEBUG Add this to your `application.properties` or `application.yaml` file. - -You can customize what data from AdvisedRequest and ChatResponse is logged by using the following constructor: +You can customize what data from `AdvisedRequest` and `ChatResponse` is logged by using the following constructor: [source,java] ---- @@ -478,4 +476,4 @@ javaCopySimpleLoggerAdvisor customLogger = new SimpleLoggerAdvisor( This allows you to tailor the logged information to your specific needs. -TIP: Be cautious about logging sensitive information in production environments. \ No newline at end of file +TIP: Be cautious about logging sensitive information in production environments.
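One way to act on the tip above is to redact sensitive values before they reach the log. The sketch below is JDK-only and makes several assumptions: the email pattern is a rough approximation, the names `LogRedactionSketch`, `redact`, and `REQUEST_TO_STRING` are hypothetical, and the formatter takes a plain `String` rather than the `AdvisedRequest` or `ChatResponse` types so that it runs without Spring AI on the classpath.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

public class LogRedactionSketch {

    // Hypothetical pattern: a simple approximation of an email address,
    // not a complete RFC 5322 matcher.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Replaces anything that looks like an email address with a fixed placeholder.
    static String redact(String text) {
        return EMAIL.matcher(text).replaceAll("[REDACTED]");
    }

    // A String-producing formatter that redacts its input before prefixing it.
    static final Function<String, String> REQUEST_TO_STRING =
            text -> "Custom request: " + redact(text);

    public static void main(String[] args) {
        System.out.println(REQUEST_TO_STRING.apply("My email is jane@example.com, please reply."));
    }
}
```

A similar redaction call could be placed inside whatever formatter you pass to the customizing constructor, assuming that formatter produces the string that ends up in the log.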