|
2 | 2 |
|
3 | 3 | This section will walk you through setting up the Chroma VectorStore to store document embeddings and perform similarity searches. |
4 | 4 |
|
5 | | -link:https://github.com/chroma-core/chroma/pkgs/container/chroma[Chroma Container] |
6 | | - |
7 | | -== What is Chroma? |
8 | | - |
9 | 5 | link:https://docs.trychroma.com/[Chroma] is the open-source embedding database. It gives you the tools to store document embeddings, content, and metadata and to search through those embeddings, including metadata filtering. |
10 | 6 |
|
11 | | -=== Prerequisites |
| 7 | +== Prerequisites |
12 | 8 |
|
13 | | -1. OpenAI Account: Create an account at link:https://platform.openai.com/signup[OpenAI Signup] and generate the token at link:https://platform.openai.com/account/api-keys[API Keys]. |
| 9 | +1. Access to ChromaDB. The <<Run Chroma Locally, setup local ChromaDB>> appendix shows how to set up a DB locally with a Docker container. |
14 | 10 |
|
15 | | -2. Access to ChromeDB. The <<Run Chroma Locally, setup local ChromaDB>> appendix shows how to set up a DB locally with a Docker container. |
| 11 | +2. `EmbeddingClient` instance to compute the document embeddings. Several options are available: |
| 12 | +- If required, an API key for the xref:api/embeddings.adoc#available-implementations[EmbeddingClient] to generate the embeddings stored by the `ChromaVectorStore`. |
16 | 13 |
|
17 | 14 | On startup, the `ChromaVectorStore` creates the required collection if one is not provisioned already. |
18 | 15 |
|
19 | | -== Configuration |
20 | | - |
21 | | -To set up ChromaVectorStore, you'll need to provide your OpenAI API Key. Set it as an environment variable like so: |
22 | | - |
23 | | -[source,bash] |
24 | | ----- |
25 | | -export SPRING_AI_OPENAI_API_KEY='Your_OpenAI_API_Key' |
26 | | ----- |
27 | | - |
28 | | -== Dependencies |
| 16 | +== Auto-configuration |
29 | 17 |
|
30 | | -Add these dependencies to your project: |
| 18 | +Spring AI provides Spring Boot auto-configuration for the Chroma Vector Store. |
| 19 | +To enable it, add the following dependency to your project's Maven `pom.xml` file: |
31 | 20 |
|
32 | | -* OpenAI: Required for calculating embeddings. |
33 | | - |
34 | | -[source,xml] |
| 21 | +[source, xml] |
35 | 22 | ---- |
36 | 23 | <dependency> |
37 | | - <groupId>org.springframework.ai</groupId> |
38 | | - <artifactId>spring-ai-openai-spring-boot-starter</artifactId> |
| 24 | + <groupId>org.springframework.ai</groupId> |
| 25 | + <artifactId>spring-ai-chroma-store-spring-boot-starter</artifactId> |
39 | 26 | </dependency> |
40 | 27 | ---- |
41 | 28 |
|
42 | | -* Chroma VectorStore. |
| 29 | +or to your Gradle `build.gradle` build file: |
43 | 30 |
|
44 | | -[source,xml] |
| 31 | +[source,groovy] |
45 | 32 | ---- |
46 | | -<dependency> |
47 | | - <groupId>org.springframework.ai</groupId> |
48 | | - <artifactId>spring-ai-chroma-store</artifactId> |
49 | | -</dependency> |
| 33 | +dependencies { |
| 34 | + implementation 'org.springframework.ai:spring-ai-chroma-store-spring-boot-starter' |
| 35 | +} |
50 | 36 | ---- |
51 | 37 |
|
52 | 38 | TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file. |
53 | 39 |
|
54 | | -== Sample Code |
| 40 | +TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file. |
55 | 41 |
|
56 | | -Create a `RestTemplate` instance with proper ChromaDB authorization configurations and Use it to create a `ChromaApi` instance: |
| 42 | +Additionally, you will need a configured `EmbeddingClient` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingClient] section for more information. |
| 43 | + |
| 44 | +Here is an example of the needed bean: |
57 | 45 |
|
58 | 46 | [source,java] |
59 | 47 | ---- |
60 | 48 | @Bean |
61 | | -public RestTemplate restTemplate() { |
62 | | - return new RestTemplate(); |
63 | | -} |
64 | | -
|
65 | | -@Bean |
66 | | -public ChromaApi chromaApi(RestTemplate restTemplate) { |
67 | | - String chromaUrl = "http://localhost:8000"; |
68 | | - ChromaApi chromaApi = new ChromaApi(chromaUrl, restTemplate); |
69 | | - return chromaApi; |
| 49 | +public EmbeddingClient embeddingClient() { |
| 50 | + // Can be any other EmbeddingClient implementation. |
| 51 | + return new OpenAiEmbeddingClient(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY"))); |
70 | 52 | } |
71 | 53 | ---- |
72 | 54 |
|
73 | | -[NOTE] |
74 | | -==== |
75 | | -For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#static-api-token-authentication[Static API Token Authentication] use the `ChromaApi#withKeyToken(<Your Token Credentials>)` method to set your credentials. Check the `ChromaWhereIT` for an example. |
| 55 | +To connect to Chroma you need to provide access details for your instance. |
| 56 | +A simple configuration can be provided via Spring Boot's _application.properties_: |
76 | 57 |
|
77 | | -For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#basic-authentication[Basic Authentication] use the `ChromaApi#withBasicAuth(<your user>, <your password>)` method to set your credentials. Check the `BasicAuthChromaWhereIT` for an example. |
78 | | -==== |
| 58 | +[source,properties] |
| 59 | +---- |
| 60 | +# Chroma Vector Store connection properties |
| 61 | +spring.ai.vectorstore.chroma.client.host=<your Chroma instance host> |
| 62 | +spring.ai.vectorstore.chroma.client.port=<your Chroma instance port> |
| 63 | +spring.ai.vectorstore.chroma.client.key-token=<your access token (if configured)> |
| 64 | +spring.ai.vectorstore.chroma.client.username=<your username (if configured)> |
| 65 | +spring.ai.vectorstore.chroma.client.password=<your password (if configured)> |
79 | 66 |
|
80 | | -Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to your project. This provides you with an implementation of the Embeddings client: |
| 67 | +# Chroma Vector Store collection properties |
| 68 | +spring.ai.vectorstore.chroma.store.collection-name=<your collection name> |
81 | 69 |
|
82 | | -[source,java] |
83 | | ----- |
84 | | -@Bean |
85 | | -public VectorStore chromaVectorStore(EmbeddingClient embeddingClient, ChromaApi chromaApi) { |
86 | | - return new ChromaVectorStore(embeddingClient, chromaApi, "TestCollection"); |
87 | | -} |
| 70 | +# Chroma Vector Store configuration properties |
| 71 | +
|
| 72 | +# OpenAI API key if the OpenAI auto-configuration is used. |
| 73 | +spring.ai.openai.api.key=<OpenAI Api-key> |
88 | 74 | ---- |
89 | 75 |
|
90 | | -In your main code, create some documents: |
| 76 | +Please have a look at the list of xref:#_configuration_properties[configuration parameters] for the vector store to learn about the default values and configuration options. |
| 77 | + |
| 78 | +Now you can autowire the Chroma Vector Store in your application and use it: |
91 | 79 |
|
92 | 80 | [source,java] |
93 | 81 | ---- |
94 | | -List<Document> documents = List.of( |
95 | | - new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")), |
96 | | - new Document("The World is Big and Salvation Lurks Around the Corner"), |
97 | | - new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))); |
98 | | ----- |
| 82 | +@Autowired VectorStore vectorStore; |
99 | 83 |
|
100 | | -Add the documents to your vector store: |
| 84 | +// ... |
101 | 85 |
|
102 | | -[source,java] |
103 | | ----- |
104 | | -vectorStore.add(documents); |
105 | | ----- |
| 86 | +List<Document> documents = List.of( |
| 87 | + new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")), |
| 88 | + new Document("The World is Big and Salvation Lurks Around the Corner"), |
| 89 | + new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))); |
106 | 90 |
|
107 | | -And finally, retrieve documents similar to a query: |
| 91 | +// Add the documents |
| 92 | +vectorStore.add(documents); |
108 | 93 |
|
109 | | -[source,java] |
110 | | ----- |
111 | | -List<Document> results = vectorStore.similaritySearch("Spring"); |
| 94 | +// Retrieve documents similar to a query |
| 95 | +List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5)); |
112 | 96 | ---- |
113 | 97 |
|
114 | | -If all goes well, you should retrieve the document containing the text "Spring AI rocks!!". |
| 98 | +=== Configuration properties |
| 99 | + |
| 100 | +You can use the following properties in your Spring Boot configuration to customize the vector store. |
115 | 101 |
|
116 | | -=== Metadata filtering |
| 102 | +|=== |
| 103 | +|Property| Description | Default value |
| 104 | + |
| 105 | +|`spring.ai.vectorstore.chroma.client.host`| Server connection host | `http://localhost` |
| 106 | +|`spring.ai.vectorstore.chroma.client.port`| Server connection port | `8000` |
| 107 | +|`spring.ai.vectorstore.chroma.client.key-token`| Access token (if configured) | - |
| 108 | +|`spring.ai.vectorstore.chroma.client.username`| Access username (if configured) | - |
| 109 | +|`spring.ai.vectorstore.chroma.client.password`| Access password (if configured) | - |
| 110 | +|`spring.ai.vectorstore.chroma.store.collection-name`| Collection name | `SpringAiCollection` |
| 111 | +|=== |
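The default `host` and `port` values in the table line up with the `http://localhost:8000` URL used elsewhere in this document when a `ChromaApi` is created manually. The sketch below illustrates that relationship in plain Java. It assumes the host value carries the URL scheme and that the two properties are joined with a colon; `ChromaUrlSketch` and `baseUrl` are hypothetical names for illustration, not part of Spring AI.

```java
// Hypothetical helper for illustration only; not part of Spring AI.
class ChromaUrlSketch {

    // Assumption: the host property includes the scheme (for example "http://localhost")
    // and the port is appended after a colon.
    static String baseUrl(String host, int port) {
        return String.format("%s:%d", host, port);
    }

    public static void main(String[] args) {
        // Uses the default values listed in the properties table.
        System.out.println(baseUrl("http://localhost", 8000));
    }
}
```

With the defaults from the table, the helper yields `http://localhost:8000`.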
| 112 | + |
| 113 | +[NOTE] |
| 114 | +==== |
| 115 | +For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#static-api-token-authentication[Static API Token Authentication] use the `ChromaApi#withKeyToken(<Your Token Credentials>)` method to set your credentials. Check the `ChromaWhereIT` for an example. |
| 116 | +
|
| 117 | +For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#basic-authentication[Basic Authentication] use the `ChromaApi#withBasicAuth(<your user>, <your password>)` method to set your credentials. Check the `BasicAuthChromaWhereIT` for an example. |
| 118 | +==== |
| 119 | + |
| 120 | +== Metadata filtering |
117 | 121 |
|
118 | 122 | You can leverage the generic, portable link:https://docs.spring.io/spring-ai/reference/api/vectordbs.html#_metadata_filters[metadata filters] with the `ChromaVectorStore` as well. |
119 | 123 |
|
@@ -161,6 +165,91 @@ is converted into the proprietary Chroma format |
161 | 165 | } |
162 | 166 | ``` |
163 | 167 |
|
| 168 | + |
| 169 | +== Manual Configuration |
| 170 | + |
| 171 | +If you prefer to configure the Chroma Vector Store manually, you can do so by creating a `ChromaVectorStore` bean in your Spring Boot application. |
| 172 | + |
| 173 | +Add these dependencies to your project: |
| 174 | +* Chroma VectorStore. |
| 175 | + |
| 176 | +[source,xml] |
| 177 | +---- |
| 178 | +<dependency> |
| 179 | + <groupId>org.springframework.ai</groupId> |
| 180 | + <artifactId>spring-ai-chroma-store</artifactId> |
| 181 | +</dependency> |
| 182 | +---- |
| 183 | + |
| 184 | +* OpenAI: Required for calculating embeddings. You can use any other embedding client implementation. |
| 185 | + |
| 186 | +[source,xml] |
| 187 | +---- |
| 188 | +<dependency> |
| 189 | + <groupId>org.springframework.ai</groupId> |
| 190 | + <artifactId>spring-ai-openai-spring-boot-starter</artifactId> |
| 191 | +</dependency> |
| 192 | +---- |
| 193 | + |
| 194 | + |
| 195 | +TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file. |
| 196 | + |
| 197 | +=== Sample Code |
| 198 | + |
| 199 | +Create a `RestTemplate` instance with the proper ChromaDB authorization configuration and use it to create a `ChromaApi` instance: |
| 200 | + |
| 201 | +[source,java] |
| 202 | +---- |
| 203 | +@Bean |
| 204 | +public RestTemplate restTemplate() { |
| 205 | + return new RestTemplate(); |
| 206 | +} |
| 207 | +
|
| 208 | +@Bean |
| 209 | +public ChromaApi chromaApi(RestTemplate restTemplate) { |
| 210 | + String chromaUrl = "http://localhost:8000"; |
| 211 | + ChromaApi chromaApi = new ChromaApi(chromaUrl, restTemplate); |
| 212 | + return chromaApi; |
| 213 | +} |
| 214 | +---- |
| 215 | + |
| 216 | +Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to your project. This provides you with an `EmbeddingClient` implementation that you can pass to the `ChromaVectorStore` bean: |
| 217 | + |
| 218 | +[source,java] |
| 219 | +---- |
| 220 | +@Bean |
| 221 | +public VectorStore chromaVectorStore(EmbeddingClient embeddingClient, ChromaApi chromaApi) { |
| 222 | + return new ChromaVectorStore(embeddingClient, chromaApi, "TestCollection"); |
| 223 | +} |
| 224 | +---- |
| 225 | + |
| 226 | +In your main code, create some documents: |
| 227 | + |
| 228 | +[source,java] |
| 229 | +---- |
| 230 | +List<Document> documents = List.of( |
| 231 | + new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")), |
| 232 | + new Document("The World is Big and Salvation Lurks Around the Corner"), |
| 233 | + new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))); |
| 234 | +---- |
| 235 | + |
| 236 | +Add the documents to your vector store: |
| 237 | + |
| 238 | +[source,java] |
| 239 | +---- |
| 240 | +vectorStore.add(documents); |
| 241 | +---- |
| 242 | + |
| 243 | +And finally, retrieve documents similar to a query: |
| 244 | + |
| 245 | +[source,java] |
| 246 | +---- |
| 247 | +List<Document> results = vectorStore.similaritySearch("Spring"); |
| 248 | +---- |
| 249 | + |
| 250 | +If all goes well, you should retrieve the document containing the text "Spring AI rocks!!". |
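To make the idea behind the similarity search more concrete, here is a minimal, self-contained sketch of ranking stored vectors against a query vector and keeping the top K. It is a conceptual illustration only: the class `SimilaritySearchSketch`, the three-dimensional vectors, and the document ids are all made up, and cosine similarity is chosen purely for the example. The metric Chroma actually applies depends on how the collection is configured, and real embeddings come from the `EmbeddingClient`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual sketch only; this is not Spring AI or Chroma code.
class SimilaritySearchSketch {

    // Cosine similarity between two non-zero vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the topK stored vectors most similar to the query, best match first.
    static List<String> topK(Map<String, double[]> store, double[] query, int topK) {
        List<String> ids = new ArrayList<>(store.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> cosine(store.get(id), query)).reversed());
        return ids.subList(0, Math.min(topK, ids.size()));
    }

    public static void main(String[] args) {
        // Made-up embeddings standing in for the three sample documents.
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("spring-ai", new double[] {0.9, 0.1, 0.0});
        store.put("world", new double[] {0.0, 1.0, 0.0});
        store.put("past-future", new double[] {0.1, 0.2, 0.9});

        // Made-up embedding standing in for the query "Spring".
        double[] query = {1.0, 0.0, 0.0};
        System.out.println(topK(store, query, 2));
    }
}
```

In this toy data set the `spring-ai` vector points in almost the same direction as the query, so it ranks first, which mirrors the expectation that the "Spring AI rocks!!" document is retrieved.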
| 251 | + |
| 252 | + |
164 | 253 | === Run Chroma Locally |
165 | 254 |
|
166 | 255 | ```shell |
|