Add AWS Bedrock AI support #174
Closed
Commits (7):

* d637646 - Add AWS Bedrock AI support (tzolov)
* f26e41d - Remove json-starter in core and fix OpenAI and Vertex API changes (tzolov)
* 1e4c10b - Add BedrockAi APIs AOT hints (tzolov)
* 55287d0 - Add Ai21Jurassic2ChatBedrockApi test (tzolov)
* 13c1323 - Add TitanEmbeddingBedrockApi test (tzolov)
* 289bc8d - Add TitanChatBedrockApi test (tzolov)
* 9c6c81f - address review comments (tzolov)
# Bedrock AI Chat and Embedding Clients

[Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) is a managed service that provides foundation models from various AI providers, available through a unified API.

Spring AI implements `API` clients for the [Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html) along with implementations of the `AiClient`, `AiStreamingClient` and the `EmbeddingClient`.

The API clients provide structured, type-safe implementations for the Bedrock models, while the `AiClient`, `AiStreamingClient` and `EmbeddingClient` implementations provide Chat and Embedding clients compliant with the Spring-AI API. The latter can be used interchangeably with the other (e.g. OpenAI, Azure OpenAI, Ollama) model clients.

Spring-AI also provides Spring Auto-Configurations and Boot Starters for all clients, making it easy to bootstrap and configure the Bedrock models.
## Prerequisites

* AWS credentials.

If you don't have an AWS account and the AWS CLI configured yet, this video guide can help you set them up: [AWS CLI & SDK Setup in Less Than 4 Minutes!](https://youtu.be/gswVHTrRX8I?si=buaY7aeI0l3-bBVb).
You should be able to obtain your access and secret keys.

* Enable the Bedrock models to use.

Go to [Amazon Bedrock](https://us-east-1.console.aws.amazon.com/bedrock/home) and from the [Model Access](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess) menu on the left configure access to the models you are going to use.
## Quick start

Add the `spring-ai-bedrock-ai-spring-boot-starter` dependency to your project POM:

```xml
<dependency>
    <artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
    <groupId>org.springframework.ai</groupId>
    <version>0.8.0-SNAPSHOT</version>
</dependency>
```
### Connect to AWS Bedrock

Use the `BedrockAwsConnectionProperties` to configure the AWS credentials and region:

```shell
spring.ai.bedrock.aws.region=us-east-1

spring.ai.bedrock.aws.access-key=YOUR_ACCESS_KEY
spring.ai.bedrock.aws.secret-key=YOUR_SECRET_KEY
```

The `region` property is required.
The AWS credentials are resolved in the following order (a sketch of the fallback logic is shown after the list):

* Spring-AI Bedrock `spring.ai.bedrock.aws.access-key` and `spring.ai.bedrock.aws.secret-key` properties.
* Java System Properties - `aws.accessKeyId` and `aws.secretAccessKey`.
* Environment Variables - `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
* Web Identity Token credentials from system properties or environment variables.
* Credential profiles file at the default location (`~/.aws/credentials`) shared by all AWS SDKs and the AWS CLI.
* Credentials delivered through the Amazon EC2 container service, if the `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` environment variable is set and the security manager has permission to access the variable.
* Instance profile credentials delivered through the Amazon EC2 metadata service.
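For illustration only, the fallback between the explicit Spring-AI properties and the AWS SDK default chain roughly corresponds to the following sketch. It assumes the AWS SDK v2 `software.amazon.awssdk.auth.credentials` classes and is not the actual starter wiring:

```java
import org.springframework.util.StringUtils;
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;

public class CredentialsResolutionSketch {

    // Sketch: if both Spring-AI properties are set, use them directly; otherwise fall back to
    // the AWS SDK default chain (system properties, environment variables, web identity token,
    // profile file, container/instance metadata).
    static AwsCredentialsProvider resolve(String accessKey, String secretKey) {
        if (StringUtils.hasText(accessKey) && StringUtils.hasText(secretKey)) {
            return StaticCredentialsProvider.create(AwsBasicCredentials.create(accessKey, secretKey));
        }
        return DefaultCredentialsProvider.create();
    }
}
```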
### Enable selected Bedrock model

> **NOTE**: By default all models are disabled. You have to enable the chosen Bedrock models explicitly, using the `spring.ai.bedrock.<model>.<chat|embedding>.enabled=true` property.

Here are the supported `<model>` and `<chat|embedding>` combinations:

| Model | Chat | Chat Streaming | Embedding |
| ------------- | ------------- | ------------- | ------------- |
| llama2 | Yes | Yes | No |
| cohere | Yes | Yes | Yes |
| anthropic | Yes | Yes | No |
| jurassic2 | Yes | No | No |
| titan | Yes | Yes | Yes (no batch mode!) |

For example, to enable the Bedrock Llama2 chat client you need to set `spring.ai.bedrock.llama2.chat.enabled=true`.
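Once a model is enabled, the corresponding chat client is auto-configured and can be injected like any other Spring bean. The following is a minimal sketch (assuming the Llama2 chat client is enabled and that the injected `AiClient` exposes a `generate(String)` convenience method):

```java
@Component
public class JokeService {

    private final AiClient aiClient;

    public JokeService(AiClient aiClient) {
        this.aiClient = aiClient;
    }

    public String tellJoke() {
        // Delegates to whichever Bedrock chat client is enabled (Llama2 in this example).
        return this.aiClient.generate("Tell me a joke about llamas");
    }
}
```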
Next, use the `spring.ai.bedrock.<model>.<chat|embedding>.*` properties to configure each model as described in its documentation:

* [Spring AI Bedrock Llama2 Chat](./README_LLAMA2_CHAT.md) - `spring.ai.bedrock.llama2.chat.enabled=true`
* [Spring AI Bedrock Cohere Chat](./README_COHERE_CHAT.md) - `spring.ai.bedrock.cohere.chat.enabled=true`
* [Spring AI Bedrock Cohere Embedding](./README_COHERE_EMBEDDING.md) - `spring.ai.bedrock.cohere.embedding.enabled=true`
* [Spring AI Bedrock Anthropic Chat](./README_ANTHROPIC_CHAT.md) - `spring.ai.bedrock.anthropic.chat.enabled=true`
* (WIP) [Spring AI Bedrock Titan Chat](./README_TITAN_CHAT.md) - `spring.ai.bedrock.titan.chat.enabled=true`
* (WIP) [Spring AI Bedrock Titan Embedding](./README_TITAN_EMBEDING.md) - `spring.ai.bedrock.titan.embedding.enabled=true`
* (WIP) [Spring AI Bedrock Ai21 Jurassic2 Chat](./README_JURASSIC2_CHAT.md) - `spring.ai.bedrock.jurassic2.chat.enabled=true`
# 1. Bedrock Anthropic

Provides the Bedrock Anthropic Chat API and Spring-AI chat clients.

## 1.1 AnthropicChatBedrockApi

[AnthropicChatBedrockApi](./src/main/java/org/springframework/ai/bedrock/anthropic/api/AnthropicChatBedrockApi.java) is a lightweight Java client on top of the AWS Bedrock [Anthropic Claude models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-claude.html).

The following class diagram illustrates the AnthropicChatBedrockApi interface and building blocks:

![AnthropicChatBedrockApi Class Diagram](./src/test/resources/doc/Bedrock Anthropic Chat API.jpg)

The AnthropicChatBedrockApi supports the `anthropic.claude-instant-v1` and `anthropic.claude-v2` models.

The AnthropicChatBedrockApi also supports both synchronous (e.g. `chatCompletion()`) and streaming (e.g. `chatCompletionStream()`) responses.

Here is a simple snippet showing how to use the API programmatically:
```java
AnthropicChatBedrockApi anthropicChatApi = new AnthropicChatBedrockApi(
    AnthropicModel.CLAUDE_V2.id(),
    Region.EU_CENTRAL_1.id());

AnthropicChatRequest request = AnthropicChatRequest
    .builder(String.format(AnthropicChatBedrockApi.PROMPT_TEMPLATE, "Name 3 famous pirates"))
    .withTemperature(0.8f)
    .withMaxTokensToSample(300)
    .withTopK(10)
    // .withStopSequences(List.of("\n\nHuman:"))
    .build();

AnthropicChatResponse response = anthropicChatApi.chatCompletion(request);

System.out.println(response.completion());

// Streaming response
Flux<AnthropicChatResponse> responseStream = anthropicChatApi.chatCompletionStream(request);

List<AnthropicChatResponse> responses = responseStream.collectList().block();

System.out.println(responses);
```
See the [AnthropicChatBedrockApi.java](./src/main/java/org/springframework/ai/bedrock/anthropic/api/AnthropicChatBedrockApi.java) JavaDoc for further information.

## 1.2 BedrockAnthropicChatClient

[BedrockAnthropicChatClient](./src/main/java/org/springframework/ai/bedrock/anthropic/BedrockAnthropicChatClient.java) implements the Spring-AI `AiClient` and `AiStreamingClient` on top of the `AnthropicChatBedrockApi`.

You can use it like this:
```java | ||
@Bean | ||
public AnthropicChatBedrockApi anthropicApi() { | ||
return new AnthropicChatBedrockApi( | ||
AnthropicChatBedrockApi.AnthropicModel.CLAUDE_V2.id(), | ||
EnvironmentVariableCredentialsProvider.create(), | ||
Region.EU_CENTRAL_1.id(), | ||
new ObjectMapper()); | ||
} | ||
|
||
@Bean | ||
public BedrockAnthropicChatClient anthropicChatClient(AnthropicChatBedrockApi anthropicApi) { | ||
return new BedrockAnthropicChatClient(anthropicApi); | ||
} | ||
``` | ||
|
||
or you can leverage the `spring-ai-bedrock-ai-spring-boot-starter` Spring Boot starter:

```xml
<dependency>
    <artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
    <groupId>org.springframework.ai</groupId>
    <version>0.8.0-SNAPSHOT</version>
</dependency>
```

And set `spring.ai.bedrock.anthropic.chat.enabled=true`.
By default the client is disabled.
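Once enabled, the auto-configured `BedrockAnthropicChatClient` can be injected and used directly. A minimal sketch (assuming the `AiClient` contract provides a `generate(String)` convenience method):

```java
@Service
public class PirateService {

    private final BedrockAnthropicChatClient chatClient;

    public PirateService(BedrockAnthropicChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String famousPirates() {
        // BedrockAnthropicChatClient implements AiClient, so generate(String) is available.
        return this.chatClient.generate("Name 3 famous pirates");
    }
}
```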
Use the `BedrockAnthropicChatProperties` to configure the Bedrock Anthropic Chat client:

| Property | Description | Default |
| ------------- | ------------- | ------------- |
| spring.ai.bedrock.anthropic.chat.enabled | Enable the Bedrock Anthropic chat client. Disabled by default | false |
| spring.ai.bedrock.anthropic.chat.awsRegion | AWS region to use. | us-east-1 |
| spring.ai.bedrock.anthropic.chat.temperature | Controls the randomness of the output. Values can range over [0.0,1.0] | 0.8 |
| spring.ai.bedrock.anthropic.chat.topP | The maximum cumulative probability of tokens to consider when sampling. | AWS Bedrock default |
| spring.ai.bedrock.anthropic.chat.maxTokensToSample | The maximum number of tokens to generate in the response. | 300 |
| spring.ai.bedrock.anthropic.chat.model | The model id to use. See the `AnthropicChatBedrockApi.AnthropicModel` for the supported models. | anthropic.claude-v2 |
# 1. Bedrock Cohere Chat

Provides Bedrock Cohere Chat and Embedding clients.

## 1.1 CohereChatBedrockApi

[CohereChatBedrockApi](./src/main/java/org/springframework/ai/bedrock/cohere/api/CohereChatBedrockApi.java) is a lightweight Java client on top of the AWS Bedrock [Cohere Command models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-cohere-command.html).

The following class diagram illustrates the CohereChatBedrockApi interface and building blocks:

![CohereChatBedrockApi Class Diagram](./src/test/resources/doc/Bedrock Cohere Chat API.jpg)

The CohereChatBedrockApi supports the `cohere.command-light-text-v14` and `cohere.command-text-v14` models for both synchronous (e.g. `chatCompletion()`) and streaming (e.g. `chatCompletionStream()`) responses.

Here is a simple snippet showing how to use the API programmatically:
```java
CohereChatBedrockApi cohereChatApi = new CohereChatBedrockApi(
    CohereChatModel.COHERE_COMMAND_V14.id(),
    Region.US_EAST_1.id());

var request = CohereChatRequest
    .builder("What is the capital of Bulgaria and what is the size? What it the national anthem?")
    .withStream(false)
    .withTemperature(0.5f)
    .withTopP(0.8f)
    .withTopK(15)
    .withMaxTokens(100)
    .withStopSequences(List.of("END"))
    .withReturnLikelihoods(CohereChatRequest.ReturnLikelihoods.ALL)
    .withNumGenerations(3)
    .withLogitBias(null)
    .withTruncate(Truncate.NONE)
    .build();

CohereChatResponse response = cohereChatApi.chatCompletion(request);

// Streaming request (note withStream(true)), using a separate variable to avoid redeclaring 'request'
var streamRequest = CohereChatRequest
    .builder("What is the capital of Bulgaria and what is the size? What it the national anthem?")
    .withStream(true)
    .withTemperature(0.5f)
    .withTopP(0.8f)
    .withTopK(15)
    .withMaxTokens(100)
    .withStopSequences(List.of("END"))
    .withReturnLikelihoods(CohereChatRequest.ReturnLikelihoods.ALL)
    .withNumGenerations(3)
    .withLogitBias(null)
    .withTruncate(Truncate.NONE)
    .build();

Flux<CohereChatResponse.Generation> responseStream = cohereChatApi.chatCompletionStream(streamRequest);
List<CohereChatResponse.Generation> responses = responseStream.collectList().block();
```
## 1.2 BedrockCohereChatClient

[BedrockCohereChatClient](./src/main/java/org/springframework/ai/bedrock/cohere/BedrockCohereChatClient.java) implements the Spring-AI `AiClient` and `AiStreamingClient` on top of the `CohereChatBedrockApi`.

You can use it like this:
```java
@Bean
public CohereChatBedrockApi cohereApi() {
    return new CohereChatBedrockApi(
        CohereChatModel.COHERE_COMMAND_V14.id(),
        EnvironmentVariableCredentialsProvider.create(),
        Region.US_EAST_1.id(),
        new ObjectMapper());
}

@Bean
public BedrockCohereChatClient cohereChatClient(CohereChatBedrockApi cohereApi) {
    return new BedrockCohereChatClient(cohereApi);
}
```
or you can leverage the `spring-ai-bedrock-ai-spring-boot-starter` Boot starter. For this, add the following dependency:

```xml
<dependency>
    <artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
    <groupId>org.springframework.ai</groupId>
    <version>0.8.0-SNAPSHOT</version>
</dependency>
```

**NOTE:** You have to enable the Bedrock Cohere chat client with `spring.ai.bedrock.cohere.chat.enabled=true`.
By default the client is disabled.
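Once enabled, the auto-configured `BedrockCohereChatClient` can be injected and used directly. A minimal sketch (again assuming the `AiClient` contract provides a `generate(String)` convenience method):

```java
@Service
public class CapitalService {

    private final BedrockCohereChatClient chatClient;

    public CapitalService(BedrockCohereChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String capitalOfBulgaria() {
        // Synchronous call through the AiClient contract.
        return this.chatClient.generate("What is the capital of Bulgaria?");
    }
}
```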
Use the `BedrockCohereChatProperties` to configure the Bedrock Cohere Chat client:

| Property | Description | Default |
| ------------- | ------------- | ------------- |
| spring.ai.bedrock.cohere.chat.enabled | Enable the Bedrock Cohere chat client. Disabled by default | false |
| spring.ai.bedrock.cohere.chat.awsRegion | AWS region to use. | us-east-1 |
| spring.ai.bedrock.cohere.chat.model | The model id to use. See the `CohereChatModel` for the supported models. | cohere.command-text-v14 |
| spring.ai.bedrock.cohere.chat.temperature | Controls the randomness of the output. Values can range over [0.0,1.0] | 0.7 |
| spring.ai.bedrock.cohere.chat.topP | The maximum cumulative probability of tokens to consider when sampling. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.topK | The number of token choices the model considers when generating the next token. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.maxTokens | The maximum number of tokens to generate in the response. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.stopSequences | Configure up to four sequences that the model recognizes and stops generating after. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.returnLikelihoods | Whether the token likelihoods are returned with the response. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.numGenerations | The maximum number of generations that the model should return. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.logitBiasToken | The token to which a logit bias is applied, used to prevent the model from generating unwanted tokens or to incentivize it to include desired ones. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.logitBiasBias | The bias value applied to the logitBiasToken; negative values discourage the token, positive values encourage it. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.truncate | Specifies how the API handles inputs longer than the maximum token length. | AWS Bedrock default |
## 1.3 CohereEmbeddingBedrockApi

[CohereEmbeddingBedrockApi](./src/main/java/org/springframework/ai/bedrock/cohere/api/CohereEmbeddingBedrockApi.java) is a lightweight Java client on top of the AWS Bedrock [Cohere Embed models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html).

The following class diagram illustrates the CohereEmbeddingBedrockApi interface and building blocks:

![CohereEmbeddingBedrockApi Class Diagram](./src/test/resources/doc/Bedrock Cohere Embedding API.jpg)

The CohereEmbeddingBedrockApi supports the `cohere.embed-english-v3` and `cohere.embed-multilingual-v3` models for single and batch embedding computation.

Here is a simple snippet showing how to use the API programmatically:
```java
CohereEmbeddingBedrockApi api = new CohereEmbeddingBedrockApi(
    CohereEmbeddingModel.COHERE_EMBED_MULTILINGUAL_V1.id(),
    EnvironmentVariableCredentialsProvider.create(),
    Region.US_EAST_1.id(), new ObjectMapper());

CohereEmbeddingRequest request = new CohereEmbeddingRequest(
    List.of("I like to eat apples", "I like to eat oranges"),
    CohereEmbeddingRequest.InputType.search_document,
    CohereEmbeddingRequest.Truncate.NONE);

CohereEmbeddingResponse response = api.embedding(request);

assertThat(response.embeddings()).hasSize(2);
assertThat(response.embeddings().get(0)).hasSize(1024);
```
## 1.4 BedrockCohereEmbeddingClient

[BedrockCohereEmbeddingClient](./src/main/java/org/springframework/ai/bedrock/cohere/BedrockCohereEmbeddingClient.java) implements the Spring-AI `EmbeddingClient` on top of the `CohereEmbeddingBedrockApi`.

You can use it like this:
```java
@Bean
public CohereEmbeddingBedrockApi cohereEmbeddingApi() {
    return new CohereEmbeddingBedrockApi(CohereEmbeddingModel.COHERE_EMBED_MULTILINGUAL_V1.id(),
        EnvironmentVariableCredentialsProvider.create(), Region.US_EAST_1.id(), new ObjectMapper());
}

@Bean
public BedrockCohereEmbeddingClient cohereAiEmbedding(CohereEmbeddingBedrockApi cohereEmbeddingApi) {
    return new BedrockCohereEmbeddingClient(cohereEmbeddingApi);
}
```
or you can leverage the `spring-ai-bedrock-ai-spring-boot-starter` Boot starter. For this, add the following dependency:

```xml
<dependency>
    <artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
    <groupId>org.springframework.ai</groupId>
    <version>0.8.0-SNAPSHOT</version>
</dependency>
```

**NOTE:** You have to enable the Bedrock Cohere embedding client with `spring.ai.bedrock.cohere.embedding.enabled=true`.
By default the client is disabled.
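Once enabled, the auto-configured `BedrockCohereEmbeddingClient` can be injected and used directly. A minimal sketch (assuming the `EmbeddingClient` contract exposes an `embed(List<String>)` method that returns one vector per input text):

```java
@Service
public class EmbeddingService {

    private final BedrockCohereEmbeddingClient embeddingClient;

    public EmbeddingService(BedrockCohereEmbeddingClient embeddingClient) {
        this.embeddingClient = embeddingClient;
    }

    public List<List<Double>> embedDocuments() {
        // One embedding vector is returned per input text.
        return this.embeddingClient.embed(
            List.of("I like to eat apples", "I like to eat oranges"));
    }
}
```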
Use the `BedrockCohereEmbeddingProperties` to configure the Bedrock Cohere Embedding client:

| Property | Description | Default |
| ------------- | ------------- | ------------- |
| spring.ai.bedrock.cohere.embedding.enabled | Enable the Bedrock Cohere embedding client. Disabled by default | false |
| spring.ai.bedrock.cohere.embedding.awsRegion | AWS region to use. | us-east-1 |
| spring.ai.bedrock.cohere.embedding.model | The model id to use. See the `CohereEmbeddingModel` for the supported models. | cohere.embed-multilingual-v3 |
| spring.ai.bedrock.cohere.embedding.inputType | Prepends special tokens to differentiate each input type from the others. You should not mix different types together, except when mixing types for search and retrieval. In this case, embed your corpus with the search_document type and embed your queries with the search_query type. | search_document |
| spring.ai.bedrock.cohere.embedding.truncate | Specifies how the API handles inputs longer than the maximum token length. | NONE |
The documentation for these base clients still needs to go into our reference documentation, which is in adoc format. I will create a separate issue for that.
agree