Commit 8877672

sobychacko authored and markpollack committed
Add builder pattern to TypesenseVectorStore and refactor package name
Introduces a builder pattern for configuring TypesenseVectorStore instances and moves the implementation to the org.springframework.ai.vectorstore.typesense package. This change:

- Makes configuration more flexible and type-safe through builder methods
- Improves code organization by moving to a dedicated vector store package
- Deprecates old constructors in favor of the builder pattern
- Adds comprehensive validation of configuration options
- Enhances documentation with clear usage examples
- Adds dedicated builder test class for better test coverage
- Add builder tests
- update reference docs

The builder pattern simplifies TypesenseVectorStore configuration while ensuring proper validation of all settings. The package move aligns with Spring AI's architectural patterns and improves maintainability by grouping related classes together.

review
1 parent b25d6e8 commit 8877672
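
For readers unfamiliar with the approach the commit message describes, the following is a minimal, self-contained sketch of a validating builder. It is not the `TypesenseVectorStore` code from this commit: `SketchVectorStore` and its accessors are hypothetical names, and the specific validation rules are assumptions. Only the default values (`vector_store`, `1536`, `false`) are taken from the documentation changed below.

```java
// Hypothetical stand-in used to illustrate the builder pattern;
// this is NOT the real TypesenseVectorStore.
final class SketchVectorStore {

    private final String collectionName;
    private final int embeddingDimension;
    private final boolean initializeSchema;

    // Instances are created only through the builder.
    private SketchVectorStore(Builder builder) {
        this.collectionName = builder.collectionName;
        this.embeddingDimension = builder.embeddingDimension;
        this.initializeSchema = builder.initializeSchema;
    }

    static Builder builder() {
        return new Builder();
    }

    String collectionName() {
        return this.collectionName;
    }

    int embeddingDimension() {
        return this.embeddingDimension;
    }

    boolean initializeSchema() {
        return this.initializeSchema;
    }

    static final class Builder {

        // Defaults match the values listed in the reference documentation.
        private String collectionName = "vector_store";
        private int embeddingDimension = 1536;
        private boolean initializeSchema = false;

        // Assumed rule: the collection name must contain visible characters.
        Builder collectionName(String collectionName) {
            if (collectionName == null || collectionName.trim().isEmpty()) {
                throw new IllegalArgumentException("collectionName must not be empty");
            }
            this.collectionName = collectionName;
            return this;
        }

        // Assumed rule: the dimension must be a positive number.
        Builder embeddingDimension(int embeddingDimension) {
            if (embeddingDimension <= 0) {
                throw new IllegalArgumentException("embeddingDimension must be positive");
            }
            this.embeddingDimension = embeddingDimension;
            return this;
        }

        Builder initializeSchema(boolean initializeSchema) {
            this.initializeSchema = initializeSchema;
            return this;
        }

        SketchVectorStore build() {
            return new SketchVectorStore(this);
        }
    }

    public static void main(String[] args) {
        SketchVectorStore store = SketchVectorStore.builder()
            .collectionName("custom_vectors")
            .initializeSchema(true)
            .build();
        System.out.println(store.collectionName() + " / " + store.embeddingDimension());
    }
}
```

Validating in the setter methods rejects a bad value at the call that supplied it, and any option left unset keeps its default.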

9 files changed, +515 -193 lines changed

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/typesense.adoc

Lines changed: 124 additions & 135 deletions
@@ -2,27 +2,25 @@
 
 This section walks you through setting up `TypesenseVectorStore` to store document embeddings and perform similarity searches.
 
-link:https://typesense.org[Typesense] Typesense is an open source, typo tolerant search engine that is optimized for instant sub-50ms searches, while providing an intuitive developer experience.
+link:https://typesense.org[Typesense] is an open source typo tolerant search engine that is optimized for instant sub-50ms searches while providing an intuitive developer experience. It provides vector search capabilities that allow you to store and query high-dimensional vectors alongside your regular search data.
 
 == Prerequisites
 
-1. A Typesense instance
-- link:https://typesense.org/docs/guide/install-typesense.html[Typesense Cloud] (recommended)
-- link:https://hub.docker.com/r/typesense/typesense/[Docker] image _typesense/typesense:latest_
-
-2. `EmbeddingModel` instance to compute the document embeddings. Several options are available:
-- If required, an API key for the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] to generate the embeddings stored by the `TypesenseVectorStore`.
+* A running Typesense instance. The following options are available:
+** link:https://typesense.org/docs/guide/install-typesense.html[Typesense Cloud] (recommended)
+** link:https://hub.docker.com/r/typesense/typesense/[Docker] image _typesense/typesense:latest_
+* If required, an API key for the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] to generate the embeddings stored by the `TypesenseVectorStore`.
 
 == Auto-configuration
 
-Spring AI provides Spring Boot auto-configuration for the Typesense Vector Sore.
-To enable it, add the following dependency to your project's Maven `pom.xml` file:
+Spring AI provides Spring Boot auto-configuration for the Typesense Vector Store.
+To enable it add the following dependency to your project's Maven `pom.xml` file:
 
-[source, xml]
+[source,xml]
 ----
 <dependency>
-	<groupId>org.springframework.ai</groupId>
-	<artifactId>spring-ai-typesense-spring-boot-starter</artifactId>
+    <groupId>org.springframework.ai</groupId>
+    <artifactId>spring-ai-typesense-spring-boot-starter</artifactId>
 </dependency>
 ----
 
@@ -37,50 +35,23 @@ dependencies {
 
 TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
 
-TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file.
-
-Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
-
-Here is an example of the needed bean:
-
-[source,java]
-----
-@Bean
-public EmbeddingModel embeddingModel() {
-    // Can be any other EmbeddingModel implementation.
-    return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY")));
-}
-----
+Please have a look at the list of xref:#_configuration_properties[configuration parameters] for the vector store to learn about the default values and configuration options.
 
-To connect to Typesense you need to provide access details for your instance.
-A simple configuration can either be provided via Spring Boot's _application.yml_,
+TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file.
 
-[source,yaml]
-----
-spring:
-  ai:
-    vectorstore:
-      typesense:
-        collectionName: "vector_store"
-        embeddingDimension: 1536
-        client:
-          protocl: http
-          host: localhost
-          port: 8108
-          apiKey: xyz
-----
+The vector store implementation can initialize the requisite schema for you but you must opt-in by setting `...initialize-schema=true` in the `application.properties` file.
 
-Please have a look at the list of xref:#_configuration_properties[configuration parameters] for the vector store to learn about the default values and configuration options.
+Additionally you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
 
-Now you can Auto-wire the Typesense Vector Store in your application and use it
+Now you can auto-wire the `TypesenseVectorStore` as a vector store in your application:
 
 [source,java]
 ----
 @Autowired VectorStore vectorStore;
 
 // ...
 
-List <Document> documents = List.of(
+List<Document> documents = List.of(
     new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
     new Document("The World is Big and Salvation Lurks Around the Corner"),
     new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
@@ -89,156 +60,174 @@ List <Document> documents = List.of(
 vectorStore.add(documents);
 
 // Retrieve documents similar to a query
-List<Document> results = this.vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
+List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
 ----
 
-=== Configuration properties
+=== Configuration Properties
 
-You can use the following properties in your Spring Boot configuration to customize the Typesense vector store.
+To connect to Typesense and use the `TypesenseVectorStore` you need to provide access details for your instance.
+A simple configuration can be provided via Spring Boot's `application.yml`:
 
-[stripes=even]
-|===
-|Property| Description | Default value
+[source,yaml]
+----
+spring:
+  ai:
+    vectorstore:
+      typesense:
+        initialize-schema: true
+        collection-name: vector_store
+        embedding-dimension: 1536
+        client:
+          protocol: http
+          host: localhost
+          port: 8108
+          api-key: xyz
+----
 
-|`spring.ai.vectorstore.typesense.client.protocol`| HTTP Protocol | `http`
-|`spring.ai.vectorstore.typesense.client.host`| Hostname | `localhost`
-|`spring.ai.vectorstore.typesense.client.port`| Port | `8108`
-|`spring.ai.vectorstore.typesense.client.apiKey`| ApiKey | `xyz`
-|`spring.ai.vectorstore.typesense.initialize-schema`| Whether to initialize the required schema | `false`
-|`spring.ai.vectorstore.typesense.collection-name`| Collection Name | `vector_store`
-|`spring.ai.vectorstore.typesense.embedding-dimension`| Embedding Dimension | `1536`
+Properties starting with `spring.ai.vectorstore.typesense.*` are used to configure the `TypesenseVectorStore`:
 
+[cols="2,5,1",stripes=even]
 |===
+|Property |Description |Default Value
 
-== Metadata filtering
+|`spring.ai.vectorstore.typesense.initialize-schema`
+|Whether to initialize the required schema
+|`false`
 
-You can leverage the generic, portable link:https://docs.spring.io/spring-ai/reference/api/vectordbs.html#_metadata_filters[metadata filters] with `TypesenseVectorStore` as well.
+|`spring.ai.vectorstore.typesense.collection-name`
+|The name of the collection to store vectors
+|`vector_store`
 
-For example, you can use either the text expression language:
+|`spring.ai.vectorstore.typesense.embedding-dimension`
+|The number of dimensions in the vector
+|`1536`
 
-[source,java]
-----
-vectorStore.similaritySearch(
-    SearchRequest
-        .query("The World")
-        .withTopK(TOP_K)
-        .withSimilarityThreshold(SIMILARITY_THRESHOLD)
-        .withFilterExpression("country in ['UK', 'NL'] && year >= 2020"));
-----
+|`spring.ai.vectorstore.typesense.client.protocol`
+|HTTP Protocol
+|`http`
 
-or programmatically using the expression DSL:
+|`spring.ai.vectorstore.typesense.client.host`
+|Hostname
+|`localhost`
 
-[source,java]
-----
-FilterExpressionBuilder b = new FilterExpressionBuilder();
+|`spring.ai.vectorstore.typesense.client.port`
+|Port
+|`8108`
 
-vectorStore.similaritySearch(
-    SearchRequest
-        .query("The World")
-        .withTopK(TOP_K)
-        .withSimilarityThreshold(SIMILARITY_THRESHOLD)
-        .withFilterExpression(b.and(
-            b.in("country", "UK", "NL"),
-            b.gte("year", 2020)).build()));
-----
+|`spring.ai.vectorstore.typesense.client.api-key`
+|API Key
+|`xyz`
+|===
 
-The portable filter expressions get automatically converted into link:https://typesense.org/docs/0.24.0/api/search.html#filter-parameters[Typesense Search Filters].
-For example, the following portable filter expression:
+== Manual Configuration
 
-[source,sql]
+Instead of using the Spring Boot auto-configuration you can manually configure the Typesense vector store. For this you need to add the `spring-ai-typesense-store` to your project:
+
+[source,xml]
 ----
-country in ['UK', 'NL'] && year >= 2020
+<dependency>
+    <groupId>org.springframework.ai</groupId>
+    <artifactId>spring-ai-typesense-store</artifactId>
+</dependency>
 ----
 
-is converted into Typesense filter:
+or to your Gradle `build.gradle` build file.
 
-[source]
+[source,groovy]
 ----
-country: ['UK', 'NL'] && year: >=2020
+dependencies {
+    implementation 'org.springframework.ai:spring-ai-typesense-store'
+}
 ----
 
-== Manual configuration
+TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
 
-If you prefer not to use the auto-configuration, you can manually configure the Typesense Vector Store.
-Add the Typesense Vector Store and Jedis dependencies
+Create a Typesense `Client` bean:
 
-[source,xml]
+[source,java]
 ----
-<dependency>
-    <groupId>org.springframework.ai</groupId>
-    <artifactId>spring-ai-typesense</artifactId>
-</dependency>
+@Bean
+public Client typesenseClient() {
+    List<Node> nodes = new ArrayList<>();
+    nodes.add(new Node("http", "localhost", "8108"));
+    Configuration configuration = new Configuration(nodes, Duration.ofSeconds(5), "xyz");
+    return new Client(configuration);
+}
 ----
 
-TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
-
-Then, create a `TypesenseVectorStore` bean in your Spring configuration:
+Then create the `TypesenseVectorStore` bean using the builder pattern:
 
 [source,java]
 ----
 @Bean
 public VectorStore vectorStore(Client client, EmbeddingModel embeddingModel) {
-
-    TypesenseVectorStoreConfig config = TypesenseVectorStoreConfig.builder()
-        .withCollectionName("test_vector_store")
-        .withEmbeddingDimension(embeddingModel.dimensions())
+    return TypesenseVectorStore.builder()
+        .client(client)
+        .embeddingModel(embeddingModel)
+        .collectionName("custom_vectors")     // Optional: defaults to "vector_store"
+        .embeddingDimension(1536)            // Optional: defaults to 1536
+        .initializeSchema(true)              // Optional: defaults to false
+        .batchingStrategy(new TokenCountBatchingStrategy()) // Optional: defaults to TokenCountBatchingStrategy
         .build();
-
-    return new TypesenseVectorStore(client, embeddingModel, config);
 }
 
+// This can be any EmbeddingModel implementation
 @Bean
-public Client typesenseClient() {
-    List<Node> nodes = new ArrayList<>();
-    nodes
-        .add(new Node("http", typesenseContainer.getHost(), typesenseContainer.getMappedPort(8108).toString()));
-
-    Configuration configuration = new Configuration(nodes, Duration.ofSeconds(5), "xyz");
-    return new Client(configuration);
+public EmbeddingModel embeddingModel() {
+    return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("OPENAI_API_KEY")));
 }
 ----
 
-[NOTE]
-====
-It is more convenient and preferred to create the `TypesenseVectorStore` as a Bean.
-But if you decide to create it manually, then you must call the `TypesenseVectorStore#afterPropertiesSet()` after setting the properties and before using the client.
-====
+== Metadata Filtering
 
+You can leverage the generic portable xref:api/vectordbs.adoc#metadata-filters[metadata filters] with Typesense store as well.
 
-Then in your main code, create some documents:
+For example you can use either the text expression language:
 
 [source,java]
 ----
-List<Document> documents = List.of(
-    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("country", "UK", "year", 2020)),
-    new Document("The World is Big and Salvation Lurks Around the Corner", Map.of()),
-    new Document("You walk forward facing the past and you turn back toward the future.", Map.of("country", "NL", "year", 2023)));
+vectorStore.similaritySearch(
+    SearchRequest.defaults()
+        .withQuery("The World")
+        .withTopK(TOP_K)
+        .withSimilarityThreshold(SIMILARITY_THRESHOLD)
+        .withFilterExpression("country in ['UK', 'NL'] && year >= 2020"));
 ----
 
-Now add the documents to your vector store:
-
+or programmatically using the `Filter.Expression` DSL:
 
 [source,java]
 ----
-vectorStore.add(documents);
+FilterExpressionBuilder b = new FilterExpressionBuilder();
+
+vectorStore.similaritySearch(SearchRequest.defaults()
+    .withQuery("The World")
+    .withTopK(TOP_K)
+    .withSimilarityThreshold(SIMILARITY_THRESHOLD)
+    .withFilterExpression(b.and(
+        b.in("country", "UK", "NL"),
+        b.gte("year", 2020)).build()));
 ----
 
-And finally, retrieve documents similar to a query:
+NOTE: Those (portable) filter expressions get automatically converted into link:https://typesense.org/docs/0.24.0/api/search.html#filter-parameters[Typesense Search Filters].
 
-[source,java]
+For example this portable filter expression:
+
+[source,sql]
 ----
-List<Document> results = vectorStore.similaritySearch(
-    SearchRequest
-        .query("Spring")
-        .withTopK(5));
+country in ['UK', 'NL'] && year >= 2020
 ----
 
-If all goes well, you should retrieve the document containing the text "Spring AI rocks!!".
+is converted into the proprietary Typesense filter format:
+
+[source,text]
+----
+country: ['UK', 'NL'] && year: >=2020
+----
 
 [NOTE]
 ====
 If you are not retrieving the documents in the expected order or the search results are not as expected, check the embedding model you are using.
 
 Embedding models can have a significant impact on the search results (i.e. make sure if your data is in Spanish to use a Spanish or multilingual embedding model).
 ====
-
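
The documentation in this diff states that the portable filter `country in ['UK', 'NL'] && year >= 2020` becomes the Typesense filter `country: ['UK', 'NL'] && year: >=2020`. The sketch below illustrates that mapping with a tiny expression tree. It is not Spring AI's converter: `FilterSketch`, `In`, `Gte` and `And` are hypothetical names, only the three operators from the example are handled, and escaping of values is deliberately ignored.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; Spring AI's real converter is a separate class.
final class FilterSketch {

    // Minimal expression tree: each node renders itself in Typesense syntax.
    interface Expr {
        String toTypesense();
    }

    // "key in [a, b]" is rendered as "key: ['a', 'b']".
    static final class In implements Expr {
        private final String key;
        private final List<String> values;

        In(String key, String... values) {
            this.key = key;
            this.values = List.of(values);
        }

        @Override
        public String toTypesense() {
            String joined = this.values.stream()
                .map(v -> "'" + v + "'")
                .collect(Collectors.joining(", "));
            return this.key + ": [" + joined + "]";
        }
    }

    // "key >= n" is rendered as "key: >=n".
    static final class Gte implements Expr {
        private final String key;
        private final int value;

        Gte(String key, int value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toTypesense() {
            return this.key + ": >=" + this.value;
        }
    }

    // "left && right" keeps the same operator between the rendered operands.
    static final class And implements Expr {
        private final Expr left;
        private final Expr right;

        And(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public String toTypesense() {
            return this.left.toTypesense() + " && " + this.right.toTypesense();
        }
    }

    public static void main(String[] args) {
        Expr filter = new And(new In("country", "UK", "NL"), new Gte("year", 2020));
        System.out.println(filter.toTypesense());
    }
}
```

Running `main` prints `country: ['UK', 'NL'] && year: >=2020`, the same string as the documented example.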

spring-ai-spring-boot-autoconfigure/src/main/java/org/springframework/ai/autoconfigure/vectorstore/typesense/TypesenseVectorStoreAutoConfiguration.java

Lines changed: 11 additions & 9 deletions
@@ -28,8 +28,8 @@
 import org.springframework.ai.embedding.BatchingStrategy;
 import org.springframework.ai.embedding.EmbeddingModel;
 import org.springframework.ai.embedding.TokenCountBatchingStrategy;
-import org.springframework.ai.vectorstore.TypesenseVectorStore;
-import org.springframework.ai.vectorstore.TypesenseVectorStore.TypesenseVectorStoreConfig;
+import org.springframework.ai.vectorstore.typesense.TypesenseVectorStore;
+import org.springframework.ai.vectorstore.typesense.TypesenseVectorStore.TypesenseVectorStoreConfig;
 import org.springframework.ai.vectorstore.observation.VectorStoreObservationConvention;
 import org.springframework.beans.factory.ObjectProvider;
 import org.springframework.boot.autoconfigure.AutoConfiguration;
@@ -70,14 +70,16 @@ public TypesenseVectorStore vectorStore(Client typesenseClient, EmbeddingModel e
             ObjectProvider<VectorStoreObservationConvention> customObservationConvention,
             BatchingStrategy batchingStrategy) {
 
-        TypesenseVectorStoreConfig config = TypesenseVectorStoreConfig.builder()
-            .withCollectionName(properties.getCollectionName())
-            .withEmbeddingDimension(properties.getEmbeddingDimension())
+        return TypesenseVectorStore.builder()
+            .client(typesenseClient)
+            .embeddingModel(embeddingModel)
+            .collectionName(properties.getCollectionName())
+            .embeddingDimension(properties.getEmbeddingDimension())
+            .initializeSchema(properties.isInitializeSchema())
+            .observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
+            .customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
+            .batchingStrategy(batchingStrategy)
             .build();
-
-        return new TypesenseVectorStore(typesenseClient, embeddingModel, config, properties.isInitializeSchema(),
-                observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP),
-                customObservationConvention.getIfAvailable(() -> null), batchingStrategy);
     }
 
     @Bean
