Commit 37a20da

markpollack and leijendary authored and committed

Update VectorStore docs to use inner builder class

Updated the manual configuration examples in the following docs to show the correct usage of the inner builder class:

- azure.adoc: Show builder(searchIndexClient, embeddingModel) with all available options
- chroma.adoc: Show builder(chromaApi, embeddingModel) with collection config
- oracle.adoc: Show builder(jdbcTemplate, embeddingModel) with database options

The examples now reflect the current implementation where the builder takes both the client and embedding model as constructor arguments.

Signed-off-by: leijendary <[email protected]>
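
The pattern the commit message describes, where required collaborators are passed to the `builder(...)` factory and optional settings are chained afterwards, can be illustrated with a minimal, self-contained sketch. This is not Spring AI code: `SketchVectorStore`, `Client`, and the `EmbeddingModel` record below are hypothetical stand-ins invented for illustration, and the real builders may differ in detail.

```java
import java.util.Objects;

public class BuilderSketch {

    // Hypothetical stand-ins for a store client and an embedding model (not Spring AI types).
    record Client(String url) {}
    record EmbeddingModel(int dimensions) {}

    // Hypothetical vector store with an inner builder class.
    static final class SketchVectorStore {
        private final Client client;
        private final EmbeddingModel embeddingModel;
        private final String collectionName;
        private final boolean initializeSchema;

        private SketchVectorStore(Builder b) {
            this.client = b.client;
            this.embeddingModel = b.embeddingModel;
            this.collectionName = b.collectionName;
            this.initializeSchema = b.initializeSchema;
        }

        // Required dependencies are arguments of the factory method,
        // so a builder can never exist without them.
        static Builder builder(Client client, EmbeddingModel embeddingModel) {
            return new Builder(client, embeddingModel);
        }

        String describe() {
            return collectionName + "@" + client.url()
                    + " dims=" + embeddingModel.dimensions()
                    + " init=" + initializeSchema;
        }

        static final class Builder {
            private final Client client;
            private final EmbeddingModel embeddingModel;
            // Optional settings start with defaults and are overridden fluently.
            private String collectionName = "default";
            private boolean initializeSchema = false;

            private Builder(Client client, EmbeddingModel embeddingModel) {
                this.client = Objects.requireNonNull(client, "client");
                this.embeddingModel = Objects.requireNonNull(embeddingModel, "embeddingModel");
            }

            Builder collectionName(String name) {
                this.collectionName = name;
                return this;
            }

            Builder initializeSchema(boolean value) {
                this.initializeSchema = value;
                return this;
            }

            SketchVectorStore build() {
                return new SketchVectorStore(this);
            }
        }
    }

    public static void main(String[] args) {
        SketchVectorStore store = SketchVectorStore
                .builder(new Client("http://localhost:8000"), new EmbeddingModel(1536))
                .collectionName("TestCollection")
                .initializeSchema(true)
                .build();
        System.out.println(store.describe());
    }
}
```

Compared with a no-argument `builder()` followed by an `.embeddingModel(...)` setter, this shape lets a missing dependency fail at the factory call rather than at `build()` or at first use.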
1 parent d52c753 commit 37a20da

6 files changed: +128 -191 lines changed
spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/apache-cassandra.adoc

Lines changed: 61 additions & 42 deletions
@@ -97,90 +97,109 @@ You can use the following properties in your Spring Boot configuration to custom
 
 === Basic Usage
 
-Create a CassandraVectorStore instance using the builder pattern:
+Create a CassandraVectorStore instance as a Spring Bean:
 
 [source,java]
 ----
 @Bean
 public VectorStore vectorStore(CqlSession session, EmbeddingModel embeddingModel) {
-    return CassandraVectorStore.builder()
+    return CassandraVectorStore.builder(embeddingModel)
         .session(session)
-        .embeddingModel(embeddingModel)
         .keyspace("my_keyspace")
         .table("my_vectors")
         .build();
 }
 ----
 
-[NOTE]
-====
-The default configuration connects to Cassandra at `localhost:9042` and will automatically create a default schema in keyspace `springframework`, table `ai_vector_store`.
-====
-
-[NOTE]
-====
-The Cassandra Java Driver is easiest configured via an `application.conf` file on the classpath. More info https://github.com/apache/cassandra-java-driver/tree/4.x/manual/core/configuration[here].
-====
-
-Then in your main code, create and add some documents:
-
-[source,java]
-----
-List<Document> documents = List.of(
-    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!",
-        Map.of("country", "UK", "year", 2020)),
-    new Document("The World is Big and Salvation Lurks Around the Corner",
-        Map.of()),
-    new Document("You walk forward facing the past and you turn back toward the future.",
-        Map.of("country", "NL", "year", 2023)));
-
-vectorStore.add(documents);
-----
-
-And retrieve documents similar to a query:
+Once you have the vector store instance, you can add documents and perform searches:
 
 [source,java]
 ----
-List<Document> results = vectorStore.similaritySearch(
-    SearchRequest.builder().query("Spring").topK(5).build());
-----
+// Add documents
+vectorStore.add(List.of(
+    new Document("1", "content1", Map.of("key1", "value1")),
+    new Document("2", "content2", Map.of("key2", "value2"))
+));
 
-You can also limit results based on a similarity threshold:
-
-[source,java]
-----
+// Search with filters
 List<Document> results = vectorStore.similaritySearch(
-    SearchRequest.builder().query("Spring")
-        .topK(5)
-        .similarityThreshold(0.5d).build());
+    SearchRequest.query("search text")
+        .withTopK(5)
+        .withSimilarityThreshold(0.7f)
+        .withFilterExpression("metadata.key1 == 'value1'")
+);
 ----
 
 === Advanced Configuration
 
-For more complex scenarios, the builder pattern offers extensive configuration options:
+For more complex use cases, you can configure additional settings in your Spring Bean:
 
 [source,java]
 ----
 @Bean
 public VectorStore vectorStore(CqlSession session, EmbeddingModel embeddingModel) {
     return CassandraVectorStore.builder(embeddingModel)
+        .session(session)
         .keyspace("my_keyspace")
         .table("my_vectors")
-        .partitionKeys(List.of(new SchemaColumn("id", DataTypes.TEXT)))
-        .clusteringKeys(List.of(new SchemaColumn("timestamp", DataTypes.TIMESTAMP)))
+        // Configure primary keys
+        .partitionKeys(List.of(
+            new SchemaColumn("id", DataTypes.TEXT),
+            new SchemaColumn("category", DataTypes.TEXT)
+        ))
+        .clusteringKeys(List.of(
+            new SchemaColumn("timestamp", DataTypes.TIMESTAMP)
+        ))
+        // Add metadata columns with optional indexing
         .addMetadataColumns(
             new SchemaColumn("category", DataTypes.TEXT, SchemaColumnTags.INDEXED),
             new SchemaColumn("score", DataTypes.DOUBLE)
         )
+        // Customize column names
         .contentColumnName("text")
         .embeddingColumnName("vector")
+        // Performance tuning
         .fixedThreadPoolExecutorSize(32)
+        // Schema management
         .disallowSchemaChanges(false)
+        // Custom batching strategy
         .batchingStrategy(new TokenCountBatchingStrategy())
         .build();
 }
 ----
 
+=== Connection Configuration
+
+There are two ways to configure the connection to Cassandra:
+
+* Using an injected CqlSession (recommended):
+
+[source,java]
+----
+@Bean
+public VectorStore vectorStore(CqlSession session, EmbeddingModel embeddingModel) {
+    return CassandraVectorStore.builder(embeddingModel)
+        .session(session)
+        .keyspace("my_keyspace")
+        .table("my_vectors")
+        .build();
+}
+----
+
+* Using connection details directly in the builder:
+
+[source,java]
+----
+@Bean
+public VectorStore vectorStore(EmbeddingModel embeddingModel) {
+    return CassandraVectorStore.builder(embeddingModel)
+        .contactPoint(new InetSocketAddress("localhost", 9042))
+        .localDatacenter("datacenter1")
+        .keyspace("my_keyspace")
+        .build();
+}
+----
+
 === Metadata Filtering
 
 You can leverage the generic, portable metadata filters with the CassandraVectorStore. For metadata columns to be searchable they must be either primary keys or SAI indexed. To make non-primary-key columns indexed, configure the metadata column with the `SchemaColumnTags.INDEXED`.

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/azure-cosmos-db.adoc

Lines changed: 42 additions & 75 deletions
@@ -148,84 +148,51 @@ List<Document> results = vectorStore.similaritySearch(SearchRequest.builder().qu
 
 The following code demonstrates how to set up the `CosmosDBVectorStore` without relying on auto-configuration:
 
-```java
-package com.example.demo;
-
-import com.azure.cosmos.CosmosAsyncClient;
-import com.azure.cosmos.CosmosClientBuilder;
-import io.micrometer.observation.ObservationRegistry;
-import org.springframework.ai.document.Document;
-import org.springframework.ai.embedding.EmbeddingModel;
-import org.springframework.ai.transformers.TransformersEmbeddingModel;
-import org.springframework.ai.vectorstore.cosmosdb.CosmosDBVectorStore;
-import org.springframework.ai.vectorstore.CosmosDBVectorStoreConfig;
-import org.springframework.ai.vectorstore.VectorStore;
-import org.springframework.beans.factory.annotation.Autowired;
-import org.springframework.boot.CommandLineRunner;
-import org.springframework.boot.SpringApplication;
-import org.springframework.boot.autoconfigure.SpringBootApplication;
-import org.springframework.context.annotation.Bean;
-import org.springframework.context.annotation.Lazy;
-
-import java.util.List;
-import java.util.Map;
-import java.util.UUID;
-
-@SpringBootApplication
-public class DemoApplication implements CommandLineRunner {
-
-    @Lazy
-    @Autowired
-    private VectorStore vectorStore;
-
-    @Lazy
-    @Autowired
-    private EmbeddingModel embeddingModel;
-
-    public static void main(String[] args) {
-        SpringApplication.run(DemoApplication.class, args);
-    }
-
-    @Override
-    public void run(String... args) throws Exception {
-        Document document1 = new Document(UUID.randomUUID().toString(), "Sample content1", Map.of("key1", "value1"));
-        Document document2 = new Document(UUID.randomUUID().toString(), "Sample content2", Map.of("key2", "value2"));
-        this.vectorStore.add(List.of(document1, document2));
-
-        List<Document> results = this.vectorStore.similaritySearch(SearchRequest.builder().query("Sample content").topK(1).build());
-        log.info("Search results: {}", results);
-    }
+[source,java]
+----
+@Bean
+public VectorStore vectorStore(ObservationRegistry observationRegistry) {
+    // Create the Cosmos DB client
+    CosmosAsyncClient cosmosClient = new CosmosClientBuilder()
+        .endpoint(System.getenv("COSMOSDB_AI_ENDPOINT"))
+        .key(System.getenv("COSMOSDB_AI_KEY"))
+        .userAgentSuffix("SpringAI-CDBNoSQL-VectorStore")
+        .gatewayMode()
+        .buildAsyncClient();
+
+    // Create and configure the vector store
+    return CosmosDBVectorStore.builder(cosmosClient, embeddingModel)
+        .databaseName("test-database")
+        .containerName("test-container")
+        // Configure metadata fields for filtering
+        .metadataFields(List.of("country", "year", "city"))
+        // Set the partition key path (optional)
+        .partitionKeyPath("/id")
+        // Configure performance settings
+        .vectorStoreThroughput(1000)
+        .vectorDimensions(1536) // Match your embedding model's dimensions
+        // Add custom batching strategy (optional)
+        .batchingStrategy(new TokenCountBatchingStrategy())
+        // Add observation registry for metrics
+        .observationRegistry(observationRegistry)
+        .build();
+}
 
-    @Bean
-    public ObservationRegistry observationRegistry() {
-        return ObservationRegistry.create();
-    }
+@Bean
+public EmbeddingModel embeddingModel() {
+    return new TransformersEmbeddingModel();
+}
+----
 
-    @Bean
-    public VectorStore vectorStore(ObservationRegistry observationRegistry) {
-
-        CosmosAsyncClient cosmosClient = new CosmosClientBuilder()
-            .endpoint(System.getenv("COSMOSDB_AI_ENDPOINT"))
-            .userAgentSuffix("SpringAI-CDBNoSQL-VectorStore")
-            .key(System.getenv("COSMOSDB_AI_KEY"))
-            .gatewayMode()
-            .buildAsyncClient();
-
-        return CosmosDBVectorStore.builder(cosmosClient, this.embeddingModel)
-            .databaseName("test-database")
-            .containerName("test-container")
-            .metadataFields(List.of("country", "year", "city"))
-            .vectorStoreThroughput(1000)
-            .observationRegistry(observationRegistry)
-            .build();
-    }
+This configuration shows all the available builder options:
 
-    @Bean
-    public EmbeddingModel embeddingModel() {
-        return new TransformersEmbeddingModel();
-    }
-}
-```
+* `databaseName`: The name of your Cosmos DB database
+* `containerName`: The name of your container within the database
+* `partitionKeyPath`: The path for the partition key (e.g., "/id")
+* `metadataFields`: List of metadata fields that will be used for filtering
+* `vectorStoreThroughput`: The throughput (RU/s) for the vector store container
+* `vectorDimensions`: The number of dimensions for your vectors (should match your embedding model)
+* `batchingStrategy`: Strategy for batching document operations (optional)
 
 == Manual Dependency Setup
 
spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/azure.adoc

Lines changed: 3 additions & 0 deletions
@@ -138,6 +138,9 @@ public VectorStore vectorStore(SearchIndexClient searchIndexClient, EmbeddingMod
         // in the similarity search filters.
         .filterMetadataFields(List.of(MetadataField.text("country"), MetadataField.int64("year"),
             MetadataField.date("activationDate")))
+        .defaultTopK(5)
+        .defaultSimilarityThreshold(0.7)
+        .indexName("spring-ai-document-index")
         .build();
 }
 ----

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/chroma.adoc

Lines changed: 3 additions & 4 deletions
@@ -232,9 +232,9 @@ Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to y
 @Bean
 public VectorStore chromaVectorStore(EmbeddingModel embeddingModel, ChromaApi chromaApi) {
     return ChromaVectorStore.builder(chromaApi, embeddingModel)
-        .collectionName("TestCollection")
-        .initializeSchema(true)
-        .build();
+        .collectionName("TestCollection")
+        .initializeSchema(true)
+        .build();
 }
 ----
 
@@ -272,4 +272,3 @@ docker run -it --rm --name chroma -p 8000:8000 ghcr.io/chroma-core/chroma:0.5.20
 ```
 
 Starts a chroma store at <http://localhost:8000/api/v1>
-

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/oracle.adoc

Lines changed: 8 additions & 3 deletions
@@ -184,7 +184,14 @@ To configure the `OracleVectorStore` in your application, you can use the follow
 ----
 @Bean
 public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel embeddingModel) {
-    return new OracleVectorStore(jdbcTemplate, embeddingModel, true);
+    return OracleVectorStore.builder(jdbcTemplate, embeddingModel)
+        .tableName("my_vectors")
+        .indexType(OracleVectorStoreIndexType.IVF)
+        .distanceType(OracleVectorStoreDistanceType.COSINE)
+        .dimensions(1536)
+        .searchAccuracy(95)
+        .initializeSchema(true)
+        .build();
 }
 ----
 
@@ -199,5 +206,3 @@ You can then connect to the database using:
 ----
 sql mlops/mlops@localhost/freepdb1
 ----
-
-
