3 changes: 3 additions & 0 deletions pom.xml
@@ -47,6 +47,8 @@
<module>vector-stores/spring-ai-redis</module>
<module>spring-ai-vertex-ai</module>
<module>spring-ai-spring-boot-starters/spring-ai-starter-vertex-ai</module>
<module>spring-ai-bedrock</module>
<module>spring-ai-spring-boot-starters/spring-ai-starter-bedrock-ai</module>

</modules>

@@ -98,6 +100,7 @@
<azure-open-ai-client.version>1.0.0-beta.3</azure-open-ai-client.version>
<jtokkit.version>0.6.1</jtokkit.version>
<victools.version>4.31.1</victools.version>
<bedrockruntime.version>2.22.0</bedrockruntime.version>

<!-- readers/writer/stores dependencies-->
<pdfbox.version>3.0.0</pdfbox.version>
@@ -114,7 +114,7 @@ public AiResponse generate(Prompt prompt) {
List<ChatMessage> azureMessages = new ArrayList<>();

for (Message message : messages) {
-        String messageType = message.getMessageTypeValue();
+        String messageType = message.getMessageType().getValue();
ChatRole chatRole = ChatRole.fromString(messageType);
azureMessages.add(new ChatMessage(chatRole, message.getContent()));
}
83 changes: 83 additions & 0 deletions spring-ai-bedrock/README.md
@@ -0,0 +1,83 @@
# Bedrock AI Chat and Embedding Clients
> **Reviewer (Member):** The documentation for these base clients still needs to go into our reference documentation, which is in adoc format. I will create a separate issue for that.
>
> **Contributor (Author):** agree

[Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) is a managed service that provides foundation models from various AI providers, available through a unified API.

Spring AI implements API clients for the [Bedrock models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html), along with implementations of the `AiClient`, `AiStreamingClient` and `EmbeddingClient` interfaces.

The API clients provide a structured, type-safe implementation for the Bedrock models, while the `AiClient`, `AiStreamingClient` and `EmbeddingClient` implementations provide chat and embedding clients compliant with the Spring AI API. The latter can be used interchangeably with the other model clients (e.g. OpenAI, Azure OpenAI, Ollama).

Spring AI also provides Spring auto-configurations and Boot starters for all clients, making it easy to bootstrap and configure the Bedrock models.

## Prerequisites

* AWS credentials.

If you don't have an AWS account and the AWS CLI configured yet, this video guide can help you set them up: [AWS CLI & SDK Setup in Less Than 4 Minutes!](https://youtu.be/gswVHTrRX8I?si=buaY7aeI0l3-bBVb).
You should be able to obtain your access and secret keys.

* Enable access to the Bedrock models you want to use.

Go to [Amazon Bedrock](https://us-east-1.console.aws.amazon.com/bedrock/home) and, from the [Model Access](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess) menu on the left, configure access to the models you are going to use.

## Quick start

Add the `spring-ai-bedrock-ai-spring-boot-starter` dependency to your project POM:

```xml
<dependency>
<artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
<groupId>org.springframework.ai</groupId>
<version>0.8.0-SNAPSHOT</version>
</dependency>
```

### Connect to AWS Bedrock

Use the `BedrockAwsConnectionProperties` to configure the AWS credentials and region:

```shell
spring.ai.bedrock.aws.region=us-east-1

spring.ai.bedrock.aws.access-key=YOUR_ACCESS_KEY
spring.ai.bedrock.aws.secret-key=YOUR_SECRET_KEY
```

The `region` property is required.
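If you prefer not to put credentials in Spring properties, the standard AWS environment variables work as well. A sketch for a POSIX shell, with placeholder values:

```shell
# Placeholder values - substitute your real credentials.
export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY
```

Note that `spring.ai.bedrock.aws.region` must still be set in your application properties.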

The AWS credentials are resolved in the following order:

* Spring-AI Bedrock `spring.ai.bedrock.aws.access-key` and `spring.ai.bedrock.aws.secret-key` properties.
* Java System Properties - `aws.accessKeyId` and `aws.secretAccessKey`
* Environment Variables - `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`
* Web Identity Token credentials from system properties or environment variables
* Credential profiles file at the default location (`~/.aws/credentials`) shared by all AWS SDKs and the AWS CLI
* Credentials delivered through the Amazon EC2 container service, if the `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` environment variable is set and the security manager has permission to access the variable
* Instance profile credentials delivered through the Amazon EC2 metadata service
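For the shared-credentials-file option, a minimal `~/.aws/credentials` looks like this (placeholder values):

```ini
[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
```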

### Enable selected Bedrock model

> **NOTE**: By default all models are disabled. You have to enable the chosen Bedrock models explicitly, using the `spring.ai.bedrock.<model>.<chat|embedding>.enabled=true` property.

Here are the supported `<model>` and `<chat|embedding>` combinations:

| Model | Chat | Chat Streaming | Embedding |
| ------------- | ------------- | ------------- | ------------- |
| llama2 | Yes | Yes | No |
| cohere | Yes | Yes | Yes |
| anthropic | Yes | Yes | No |
| jurassic2 | Yes | No | No |
| titan | Yes | Yes | Yes (no batch mode!) |

For example, to enable the Bedrock Llama2 chat client, set `spring.ai.bedrock.llama2.chat.enabled=true`.

Next, use the `spring.ai.bedrock.<model>.<chat|embedding>.*` properties to configure each model, as documented in:

* [Spring AI Bedrock Llama2 Chat](./README_LLAMA2_CHAT.md) - `spring.ai.bedrock.llama2.chat.enabled=true`
* [Spring AI Bedrock Cohere Chat](./README_COHERE_CHAT.md) - `spring.ai.bedrock.cohere.chat.enabled=true`
* [Spring AI Bedrock Cohere Embedding](./README_COHERE_EMBEDDING.md) - `spring.ai.bedrock.cohere.embedding.enabled=true`
* [Spring AI Bedrock Anthropic Chat](./README_ANTHROPIC_CHAT.md) - `spring.ai.bedrock.anthropic.chat.enabled=true`
* (WIP) [Spring AI Bedrock Titan Chat](./README_TITAN_CHAT.md) - `spring.ai.bedrock.titan.chat.enabled=true`
* (WIP) [Spring AI Bedrock Titan Embedding](./README_TITAN_EMBEDING.md) - `spring.ai.bedrock.titan.embedding.enabled=true`
* (WIP) [Spring AI Bedrock Ai21 Jurassic2 Chat](./README_JURASSIC2_CHAT.md) - `spring.ai.bedrock.jurassic2.chat.enabled=true`
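Putting the pieces together, a minimal configuration that connects to Bedrock and enables the Cohere chat client could look like this (key names follow the properties documented above; the temperature value is illustrative):

```shell
spring.ai.bedrock.aws.region=us-east-1
spring.ai.bedrock.aws.access-key=YOUR_ACCESS_KEY
spring.ai.bedrock.aws.secret-key=YOUR_SECRET_KEY

spring.ai.bedrock.cohere.chat.enabled=true
spring.ai.bedrock.cohere.chat.temperature=0.5
```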
90 changes: 90 additions & 0 deletions spring-ai-bedrock/README_ANTHROPIC_CHAT.md
@@ -0,0 +1,90 @@
# 1. Bedrock Anthropic

Provides the Bedrock Anthropic chat API and Spring AI chat clients.

## 1.1 AnthropicChatBedrockApi

[AnthropicChatBedrockApi](./src/main/java/org/springframework/ai/bedrock/anthropic/api/AnthropicChatBedrockApi.java) is a lightweight Java client on top of the AWS Bedrock [Anthropic Claude models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-claude.html).

The following class diagram illustrates the AnthropicChatBedrockApi interface and building blocks:

![AnthropicChatBedrockApi Class Diagram](./src/test/resources/doc/Bedrock-Anthropic-Chat-API.jpg)

The AnthropicChatBedrockApi supports the `anthropic.claude-instant-v1` and `anthropic.claude-v2` models.

The AnthropicChatBedrockApi supports both synchronous (e.g. `chatCompletion()`) and streaming (e.g. `chatCompletionStream()`) responses.

Here is a simple snippet showing how to use the API programmatically:

```java
AnthropicChatBedrockApi anthropicChatApi = new AnthropicChatBedrockApi(
        AnthropicModel.CLAUDE_V2.id(),
        Region.EU_CENTRAL_1.id());

AnthropicChatRequest request = AnthropicChatRequest
    .builder(String.format(AnthropicChatBedrockApi.PROMPT_TEMPLATE, "Name 3 famous pirates"))
    .withTemperature(0.8f)
    .withMaxTokensToSample(300)
    .withTopK(10)
    // .withStopSequences(List.of("\n\nHuman:"))
    .build();

// Synchronous response
AnthropicChatResponse response = anthropicChatApi.chatCompletion(request);

System.out.println(response.completion());

// Streaming response
Flux<AnthropicChatResponse> responseStream = anthropicChatApi.chatCompletionStream(request);

List<AnthropicChatResponse> responses = responseStream.collectList().block();

System.out.println(responses);
```

See the [AnthropicChatBedrockApi.java](./src/main/java/org/springframework/ai/bedrock/anthropic/api/AnthropicChatBedrockApi.java) JavaDoc for further information.

## 1.2 BedrockAnthropicChatClient

[BedrockAnthropicChatClient](./src/main/java/org/springframework/ai/bedrock/anthropic/BedrockAnthropicChatClient.java) implements the Spring AI `AiClient` and `AiStreamingClient` interfaces on top of the `AnthropicChatBedrockApi`.

You can use it like this:

```java
@Bean
public AnthropicChatBedrockApi anthropicApi() {
    return new AnthropicChatBedrockApi(
            AnthropicChatBedrockApi.AnthropicModel.CLAUDE_V2.id(),
            EnvironmentVariableCredentialsProvider.create(),
            Region.EU_CENTRAL_1.id(),
            new ObjectMapper());
}

@Bean
public BedrockAnthropicChatClient anthropicChatClient(AnthropicChatBedrockApi anthropicApi) {
    return new BedrockAnthropicChatClient(anthropicApi);
}
```
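With the beans above in place, the chat client can be injected and called like any other Spring AI `AiClient`. A minimal sketch, assuming the `AiResponse generate(Prompt prompt)` contract shown earlier; the service name and the `getGeneration().getText()` accessor are assumptions of this sketch, not part of the Bedrock client itself:

```java
// Sketch only: names and accessors below are illustrative assumptions.
@Service
public class PirateNamesService {

    private final BedrockAnthropicChatClient chatClient;

    public PirateNamesService(BedrockAnthropicChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String famousPirates() {
        AiResponse response = chatClient.generate(new Prompt("Name 3 famous pirates"));
        // Assumed accessors for the first generation's text
        return response.getGeneration().getText();
    }
}
```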

or you can leverage the `spring-ai-bedrock-ai-spring-boot-starter` Spring Boot starter:

```xml
<dependency>
<artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
<groupId>org.springframework.ai</groupId>
<version>0.8.0-SNAPSHOT</version>
</dependency>
```

And set `spring.ai.bedrock.anthropic.chat.enabled=true`.
By default the client is disabled.

Use the `BedrockAnthropicChatProperties` to configure the Bedrock Anthropic chat client:

| Property | Description | Default |
| ------------- | ------------- | ------------- |
| spring.ai.bedrock.anthropic.chat.enabled | Enable the Bedrock Anthropic chat client. Disabled by default | false |
| spring.ai.bedrock.anthropic.chat.awsRegion | AWS region to use. | us-east-1 |
| spring.ai.bedrock.anthropic.chat.temperature | Controls the randomness of the output. Values can range over [0.0,1.0] | 0.8 |
| spring.ai.bedrock.anthropic.chat.topP | The maximum cumulative probability of tokens to consider when sampling. | AWS Bedrock default |
| spring.ai.bedrock.anthropic.chat.maxTokensToSample | The maximum number of tokens to generate in the response. | 300 |
| spring.ai.bedrock.anthropic.chat.model | The model id to use. See the `AnthropicModel` enum for the supported models. | anthropic.claude-v2 |
178 changes: 178 additions & 0 deletions spring-ai-bedrock/README_COHERE_CHAT.md
@@ -0,0 +1,178 @@
# 1. Bedrock Cohere Chat

Provides Bedrock Cohere Chat clients.

## 1.1 CohereChatBedrockApi

[CohereChatBedrockApi](./src/main/java/org/springframework/ai/bedrock/cohere/api/CohereChatBedrockApi.java) is a lightweight Java client on top of the AWS Bedrock [Cohere Command models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-cohere-command.html).

The following class diagram illustrates the CohereChatBedrockApi interface and building blocks:

![CohereChatBedrockApi Class Diagram](./src/test/resources/doc/Bedrock%20Cohere%20Chat%20API.jpg)

The CohereChatBedrockApi supports the `cohere.command-light-text-v14` and `cohere.command-text-v14` models for both synchronous (e.g. `chatCompletion()`) and streaming (e.g. `chatCompletionStream()`) responses.

Here is a simple snippet showing how to use the API programmatically:

```java
CohereChatBedrockApi cohereChatApi = new CohereChatBedrockApi(
        CohereChatModel.COHERE_COMMAND_V14.id(),
        Region.US_EAST_1.id());

// Synchronous request
var request = CohereChatRequest
    .builder("What is the capital of Bulgaria and what is its size? What is the national anthem?")
    .withStream(false)
    .withTemperature(0.5f)
    .withTopP(0.8f)
    .withTopK(15)
    .withMaxTokens(100)
    .withStopSequences(List.of("END"))
    .withReturnLikelihoods(CohereChatRequest.ReturnLikelihoods.ALL)
    .withNumGenerations(3)
    .withLogitBias(null)
    .withTruncate(Truncate.NONE)
    .build();

CohereChatResponse response = cohereChatApi.chatCompletion(request);

// The same request with streaming enabled (distinct variable name to avoid redeclaration)
var streamRequest = CohereChatRequest
    .builder("What is the capital of Bulgaria and what is its size? What is the national anthem?")
    .withStream(true)
    .withTemperature(0.5f)
    .withTopP(0.8f)
    .withTopK(15)
    .withMaxTokens(100)
    .withStopSequences(List.of("END"))
    .withReturnLikelihoods(CohereChatRequest.ReturnLikelihoods.ALL)
    .withNumGenerations(3)
    .withLogitBias(null)
    .withTruncate(Truncate.NONE)
    .build();

Flux<CohereChatResponse.Generation> responseStream = cohereChatApi.chatCompletionStream(streamRequest);
List<CohereChatResponse.Generation> responses = responseStream.collectList().block();
```

## 1.2 BedrockCohereChatClient

[BedrockCohereChatClient](./src/main/java/org/springframework/ai/bedrock/cohere/BedrockCohereChatClient.java) implements the Spring AI `AiClient` and `AiStreamingClient` interfaces on top of the `CohereChatBedrockApi`.

You can use it like this:

```java
@Bean
public CohereChatBedrockApi cohereApi() {
    return new CohereChatBedrockApi(
            CohereChatModel.COHERE_COMMAND_V14.id(),
            EnvironmentVariableCredentialsProvider.create(),
            Region.US_EAST_1.id(),
            new ObjectMapper());
}

@Bean
public BedrockCohereChatClient cohereChatClient(CohereChatBedrockApi cohereApi) {
    return new BedrockCohereChatClient(cohereApi);
}
```

or you can leverage the `spring-ai-bedrock-ai-spring-boot-starter` Boot starter. To do so, add the following dependency:

```xml
<dependency>
<artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
<groupId>org.springframework.ai</groupId>
<version>0.8.0-SNAPSHOT</version>
</dependency>
```

**NOTE:** You have to enable the Bedrock Cohere chat client with `spring.ai.bedrock.cohere.chat.enabled=true`.
By default the client is disabled.

Use the `BedrockCohereChatProperties` to configure the Bedrock Cohere Chat client:

| Property | Description | Default |
| ------------- | ------------- | ------------- |
| spring.ai.bedrock.cohere.chat.enabled | Enable the Bedrock Cohere chat client. Disabled by default | false |
| spring.ai.bedrock.cohere.chat.awsRegion | AWS region to use. | us-east-1 |
| spring.ai.bedrock.cohere.chat.model | The model id to use. See the `CohereChatModel` for the supported models. | cohere.command-text-v14 |
| spring.ai.bedrock.cohere.chat.temperature | Controls the randomness of the output. Values can range over [0.0,1.0] | 0.7 |
| spring.ai.bedrock.cohere.chat.topP | The maximum cumulative probability of tokens to consider when sampling. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.topK | Specify the number of token choices the model uses to generate the next token | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.maxTokens | Specify the maximum number of tokens to use in the generated response. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.stopSequences | Configure up to four sequences that the model recognizes. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.returnLikelihoods | The token likelihoods are returned with the response. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.numGenerations | The maximum number of generations that the model should return. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.logitBiasToken | Prevents the model from generating unwanted tokens or incentivize the model to include desired tokens. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.logitBiasBias | Prevents the model from generating unwanted tokens or incentivize the model to include desired tokens. | AWS Bedrock default |
| spring.ai.bedrock.cohere.chat.truncate | Specifies how the API handles inputs longer than the maximum token length | AWS Bedrock default |

## 1.3 CohereEmbeddingBedrockApi

[CohereEmbeddingBedrockApi](./src/main/java/org/springframework/ai/bedrock/cohere/api/CohereEmbeddingBedrockApi.java) is a lightweight Java client on top of the AWS Bedrock [Cohere Embed models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html).

The following class diagram illustrates the CohereEmbeddingBedrockApi interface and building blocks:

![CohereEmbeddingBedrockApi Class Diagram](./src/test/resources/doc/Bedrock%20Cohere%20Embedding%20API.jpg)

The CohereEmbeddingBedrockApi supports the `cohere.embed-english-v3` and `cohere.embed-multilingual-v3` models for single and batch embedding computation.

Here is a simple snippet showing how to use the API programmatically:

```java
CohereEmbeddingBedrockApi api = new CohereEmbeddingBedrockApi(
        CohereEmbeddingModel.COHERE_EMBED_MULTILINGUAL_V3.id(),
        EnvironmentVariableCredentialsProvider.create(),
        Region.US_EAST_1.id(), new ObjectMapper());

CohereEmbeddingRequest request = new CohereEmbeddingRequest(
        List.of("I like to eat apples", "I like to eat oranges"),
        CohereEmbeddingRequest.InputType.search_document,
        CohereEmbeddingRequest.Truncate.NONE);

CohereEmbeddingResponse response = api.embedding(request);

assertThat(response.embeddings()).hasSize(2);
assertThat(response.embeddings().get(0)).hasSize(1024);
```

## 1.4 BedrockCohereEmbeddingClient

[BedrockCohereEmbeddingClient](./src/main/java/org/springframework/ai/bedrock/cohere/BedrockCohereEmbeddingClient.java) implements the Spring AI `EmbeddingClient` on top of the `CohereEmbeddingBedrockApi`.

You can use it like this:

```java
@Bean
public CohereEmbeddingBedrockApi cohereEmbeddingApi() {
    return new CohereEmbeddingBedrockApi(CohereEmbeddingModel.COHERE_EMBED_MULTILINGUAL_V3.id(),
            EnvironmentVariableCredentialsProvider.create(), Region.US_EAST_1.id(), new ObjectMapper());
}

@Bean
public BedrockCohereEmbeddingClient cohereAiEmbedding(CohereEmbeddingBedrockApi cohereEmbeddingApi) {
    return new BedrockCohereEmbeddingClient(cohereEmbeddingApi);
}
```
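Once wired, computing embeddings is a one-liner. A sketch, assuming the `embed(String)` and `embed(List<String>)` methods of the Spring AI `EmbeddingClient` interface:

```java
// Sketch only: assumes the Spring AI EmbeddingClient contract.
List<Double> embedding = cohereAiEmbedding.embed("I like to eat apples");

// Batch embedding of several documents
List<List<Double>> embeddings = cohereAiEmbedding.embed(
        List.of("I like to eat apples", "I like to eat oranges"));
```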

or you can leverage the `spring-ai-bedrock-ai-spring-boot-starter` Boot starter. To do so, add the following dependency:

```xml
<dependency>
<artifactId>spring-ai-bedrock-ai-spring-boot-starter</artifactId>
<groupId>org.springframework.ai</groupId>
<version>0.8.0-SNAPSHOT</version>
</dependency>
```

**NOTE:** You have to enable the Bedrock Cohere embedding client with `spring.ai.bedrock.cohere.embedding.enabled=true`.
By default the client is disabled.

Use the `BedrockCohereEmbeddingProperties` to configure the Bedrock Cohere embedding client:

| Property | Description | Default |
| ------------- | ------------- | ------------- |
| spring.ai.bedrock.cohere.embedding.enabled | Enable the Bedrock Cohere embedding client. Disabled by default | false |
| spring.ai.bedrock.cohere.embedding.awsRegion | AWS region to use. | us-east-1 |
| spring.ai.bedrock.cohere.embedding.model | The model id to use. See the `CohereEmbeddingModel` for the supported models. | cohere.embed-multilingual-v3 |
| spring.ai.bedrock.cohere.embedding.inputType | Prepends special tokens to differentiate each type from one another. You should not mix different types together, except when mixing types for search and retrieval. In this case, embed your corpus with the `search_document` type and embed queries with the `search_query` type. | search_document |
| spring.ai.bedrock.cohere.embedding.truncate | Specifies how the API handles inputs longer than the maximum token length. | NONE |