Commit 8edf1be

sobychacko authored and markpollack committed
Add builder pattern to MariaDBVectorStore and refactor package name
Introduces a builder pattern for configuring MariaDBVectorStore instances and improves the overall implementation. This change:

- Makes configuration more flexible and type-safe through builder methods
- Deprecates old constructors and builder in favor of the new builder pattern
- Adds comprehensive validation of configuration options
- Improves documentation with clear examples and better structure
- Updates all test classes to use the new builder pattern
- Adds comprehensive builder tests
- Updates the reference documentation

The builder pattern provides a more maintainable and user-friendly way to configure vector stores while ensuring configuration validity at compile time. This aligns with the project's move towards using builder patterns across all vector store implementations.
1 parent 8877672 commit 8edf1be
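
To make the commit message concrete, here is a minimal, self-contained sketch of the general technique it describes: a static `builder(...)` factory that takes the required dependency, fluent setters with defaults, and validation of option values. Everything in it is a hypothetical stand-in (`StoreConfig`, the `String` connection argument, the specific validation rules); it is not the Spring AI `MariaDBVectorStore` implementation. Only the default values (`1536`, `COSINE`, `vector_store`) are taken from the documentation changed in this commit.

```java
import java.util.Objects;

// Hypothetical stand-in for a vector store configuration; NOT the Spring AI class.
final class StoreConfig {

    enum DistanceType { COSINE, EUCLIDEAN }

    private final String connection;
    private final int dimensions;
    private final DistanceType distanceType;
    private final String tableName;

    private StoreConfig(Builder b) {
        this.connection = b.connection;
        this.dimensions = b.dimensions;
        this.distanceType = b.distanceType;
        this.tableName = b.tableName;
    }

    // The required dependency is a factory argument, so callers cannot omit it.
    static Builder builder(String connection) {
        return new Builder(connection);
    }

    String connection() { return connection; }
    int dimensions() { return dimensions; }
    DistanceType distanceType() { return distanceType; }
    String tableName() { return tableName; }

    static final class Builder {
        private final String connection;
        // Defaults mirror the documented property defaults.
        private int dimensions = 1536;
        private DistanceType distanceType = DistanceType.COSINE;
        private String tableName = "vector_store";

        private Builder(String connection) {
            this.connection = Objects.requireNonNull(connection, "connection must not be null");
        }

        // Assumed rule for illustration: dimensions must be positive.
        Builder dimensions(int dimensions) {
            if (dimensions <= 0) {
                throw new IllegalArgumentException("dimensions must be positive");
            }
            this.dimensions = dimensions;
            return this;
        }

        Builder distanceType(DistanceType distanceType) {
            this.distanceType = Objects.requireNonNull(distanceType, "distanceType must not be null");
            return this;
        }

        // Assumed rule for illustration: only plain SQL identifiers are accepted.
        Builder tableName(String tableName) {
            if (tableName == null || !tableName.matches("[A-Za-z_][A-Za-z0-9_]*")) {
                throw new IllegalArgumentException("invalid table name: " + tableName);
            }
            this.tableName = tableName;
            return this;
        }

        StoreConfig build() {
            return new StoreConfig(this);
        }
    }

    public static void main(String[] args) {
        StoreConfig config = StoreConfig.builder("jdbc:mariadb://localhost/db")
            .dimensions(768)
            .tableName("custom_vectors")
            .build();
        // prints "custom_vectors 768 COSINE"
        System.out.println(config.tableName() + " " + config.dimensions() + " " + config.distanceType());
    }
}
```

In this sketch the compiler checks argument types (for example, that `distanceType` receives an enum constant), while value checks such as the table-name pattern run when the setter is called.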

9 files changed, +731 -162 lines changed

Lines changed: 120 additions & 100 deletions
@@ -1,18 +1,28 @@
1-
= MariaDB Vector
1+
= MariaDB Vector Store
22

3-
This section walks you through setting up the MariaDB `VectorStore` to store document embeddings and perform similarity searches.
3+
This section walks you through setting up `MariaDBVectorStore` to store document embeddings and perform similarity searches.
44

5-
link:https://mariadb.org/projects/mariadb-vector/[MariaDB vector] is part of MariaDB 11.7 and enables storing and searching over machine learning-generated embeddings.
5+
link:https://mariadb.org/projects/mariadb-vector/[MariaDB Vector] is part of MariaDB 11.7 and enables storing and searching over machine learning-generated embeddings.
6+
It provides efficient vector similarity search capabilities using vector indexes, supporting both cosine similarity and Euclidean distance metrics.
7+
8+
== Prerequisites
9+
10+
* A running MariaDB (11.7+) instance. The following options are available:
11+
** link:https://hub.docker.com/_/mariadb[Docker] image
12+
** link:https://mariadb.org/download/[MariaDB Server]
13+
** link:https://mariadb.com/products/skysql/[MariaDB SkySQL]
14+
* If required, an API key for the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] to generate the embeddings stored by the `MariaDBVectorStore`.
615

716
== Auto-Configuration
817

9-
Add the MariaDBVectorStore boot starter dependency to your project:
18+
Spring AI provides Spring Boot auto-configuration for the MariaDB Vector Store.
19+
To enable it, add the following dependency to your project's Maven `pom.xml` file:
1020

1121
[source,xml]
1222
----
1323
<dependency>
14-
<groupId>org.springframework.ai</groupId>
15-
<artifactId>spring-ai-mariadb-store-spring-boot-starter</artifactId>
24+
<groupId>org.springframework.ai</groupId>
25+
<artifactId>spring-ai-mariadb-store-spring-boot-starter</artifactId>
1626
</dependency>
1727
----
1828

@@ -25,112 +35,158 @@ dependencies {
2535
}
2636
----
2737

38+
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
39+
2840
The vector store implementation can initialize the required schema for you, but you must opt-in by specifying the `initializeSchema` boolean in the appropriate constructor or by setting `...initialize-schema=true` in the `application.properties` file.
2941

30-
The Vector Store also requires an `EmbeddingModel` instance to calculate embeddings for the documents.
31-
You can pick one of the available xref:api/embeddings.adoc#available-implementations[EmbeddingModel Implementations].
42+
NOTE: This is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.
43+
44+
Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
3245

33-
For example, to use the xref:api/embeddings/openai-embeddings.adoc[OpenAI EmbeddingModel], add the following dependency to your project:
46+
For example, to use the xref:api/embeddings/openai-embeddings.adoc[OpenAI EmbeddingModel], add the following dependency:
3447

3548
[source,xml]
3649
----
3750
<dependency>
38-
<groupId>org.springframework.ai</groupId>
39-
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
51+
<groupId>org.springframework.ai</groupId>
52+
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
4053
</dependency>
4154
----
4255

43-
or to your Gradle `build.gradle` build file.
56+
TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file.
4457

45-
[source,groovy]
58+
Now you can auto-wire the `MariaDBVectorStore` in your application:
59+
60+
[source,java]
4661
----
47-
dependencies {
48-
implementation 'org.springframework.ai:spring-ai-openai-spring-boot-starter'
49-
}
62+
@Autowired VectorStore vectorStore;
63+
64+
// ...
65+
66+
List<Document> documents = List.of(
67+
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
68+
new Document("The World is Big and Salvation Lurks Around the Corner"),
69+
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
70+
71+
// Add the documents to MariaDB
72+
vectorStore.add(documents);
73+
74+
// Retrieve documents similar to a query
75+
List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
5076
----
5177

52-
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
53-
Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file.
78+
[[mariadbvector-properties]]
79+
=== Configuration Properties
5480

55-
To connect to and configure the `MariaDBVectorStore`, you need to provide access details for your instance.
56-
A simple configuration can be provided via Spring Boot's `application.yml`.
81+
To connect to MariaDB and use the `MariaDBVectorStore`, you need to provide access details for your instance.
82+
A simple configuration can be provided via Spring Boot's `application.yml`:
5783

58-
[yml]
84+
[source,yaml]
5985
----
6086
spring:
6187
datasource:
6288
url: jdbc:mariadb://localhost/db
6389
username: myUser
6490
password: myPassword
6591
ai:
66-
vectorstore:
67-
mariadbvector:
68-
distance-type: COSINE
69-
dimensions: 1536
92+
vectorstore:
93+
mariadb:
94+
initialize-schema: true
95+
distance-type: COSINE
96+
dimensions: 1536
7097
----
7198

72-
TIP: If you run MariaDBvector as a Spring Boot dev service via link:https://docs.spring.io/spring-boot/reference/features/dev-services.html#features.dev-services.docker-compose[Docker Compose]
99+
TIP: If you run MariaDB Vector as a Spring Boot dev service via link:https://docs.spring.io/spring-boot/reference/features/dev-services.html#features.dev-services.docker-compose[Docker Compose]
73100
or link:https://docs.spring.io/spring-boot/reference/features/dev-services.html#features.dev-services.testcontainers[Testcontainers],
74101
you don't need to configure URL, username and password since they are autoconfigured by Spring Boot.
75102

76-
TIP: Check the list of xref:#mariadbvector-properties[configuration parameters] to learn about the default values and configuration options.
103+
Properties starting with `spring.ai.vectorstore.mariadb.*` are used to configure the `MariaDBVectorStore`:
77104

78-
Now you can auto-wire the `MariaDBVectorStore` in your application and use it
105+
[cols="2,5,1",stripes=even]
106+
|===
107+
|Property | Description | Default Value
79108

80-
[source,java]
81-
----
82-
@Autowired VectorStore vectorStore;
109+
|`spring.ai.vectorstore.mariadb.initialize-schema`| Whether to initialize the required schema | `false`
110+
|`spring.ai.vectorstore.mariadb.distance-type`| Search distance type. Use `COSINE` (default) or `EUCLIDEAN`. If vectors are normalized to length 1, you can use `EUCLIDEAN` for best performance.| `COSINE`
111+
|`spring.ai.vectorstore.mariadb.dimensions`| Embeddings dimension. If not specified explicitly, will retrieve dimensions from the provided `EmbeddingModel`. | `1536`
112+
|`spring.ai.vectorstore.mariadb.remove-existing-vector-store-table` | Deletes the existing vector store table on startup. | `false`
113+
|`spring.ai.vectorstore.mariadb.schema-name` | Vector store schema name | `null`
114+
|`spring.ai.vectorstore.mariadb.table-name` | Vector store table name | `vector_store`
115+
|`spring.ai.vectorstore.mariadb.schema-validation` | Enables schema and table name validation to ensure they are valid and existing objects. | `false`
116+
|===
83117

84-
// ...
118+
TIP: If you configure a custom schema and/or table name, consider enabling schema validation by setting `spring.ai.vectorstore.mariadb.schema-validation=true`.
119+
This ensures the correctness of the names and reduces the risk of SQL injection attacks.
85120

86-
List<Document> documents = List.of(
87-
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
88-
new Document("The World is Big and Salvation Lurks Around the Corner"),
89-
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
121+
== Manual Configuration
90122

91-
// Add the documents to PGVector
92-
vectorStore.add(documents);
123+
Instead of using the Spring Boot auto-configuration, you can manually configure the MariaDB vector store. For this you need to add the following dependencies to your project:
93124

94-
// Retrieve documents similar to a query
95-
List<Document> results = this.vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
125+
[source,xml]
96126
----
127+
<dependency>
128+
<groupId>org.springframework.boot</groupId>
129+
<artifactId>spring-boot-starter-jdbc</artifactId>
130+
</dependency>
97131
98-
[[mariadbvector-properties]]
99-
=== Configuration properties
132+
<dependency>
133+
<groupId>org.mariadb.jdbc</groupId>
134+
<artifactId>mariadb-java-client</artifactId>
135+
<scope>runtime</scope>
136+
</dependency>
100137
101-
You can use the following properties in your Spring Boot configuration to customize the MariaDB vector store.
138+
<dependency>
139+
<groupId>org.springframework.ai</groupId>
140+
<artifactId>spring-ai-mariadb-store</artifactId>
141+
</dependency>
142+
----
102143

103-
[cols="2,5,1",stripes=even]
104-
|===
105-
|Property| Description | Default value
144+
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
106145

107-
|`spring.ai.vectorstore.mariadb.distance-type`| Search distance type. Defaults to `COSINE`. But if vectors are normalized to length 1, you can use `EUCLIDEAN` for best performance.| COSINE
108-
|`spring.ai.vectorstore.mariadb.dimensions`| Embeddings dimension. If not specified explicitly the PgVectorStore will retrieve the dimensions form the provided `EmbeddingModel`. Dimensions are set to the embedding column the on table creation. If you change the dimensions your would have to re-create the vector_store table as well. | -
109-
|`spring.ai.vectorstore.mariadb.remove-existing-vector-store-table` | Deletes the existing `vector_store` table on start up. | false
110-
|`spring.ai.vectorstore.mariadb.initialize-schema` | Whether to initialize the required schema | false
111-
|`spring.ai.vectorstore.mariadb.schema-name` | Vector store schema name | null
112-
|`spring.ai.vectorstore.mariadb.table-name` | Vector store table name | `vector_store`
113-
|`spring.ai.vectorstore.mariadb.schema-validation` | Enables schema and table name validation to ensure they are valid and existing objects. | false
146+
Then create the `MariaDBVectorStore` bean using the builder pattern:
114147

115-
|===
148+
[source,java]
149+
----
150+
@Bean
151+
public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel embeddingModel) {
152+
return MariaDBVectorStore.builder(jdbcTemplate)
153+
.embeddingModel(embeddingModel)
154+
.dimensions(1536) // Optional: defaults to 1536
155+
.distanceType(MariaDBDistanceType.COSINE) // Optional: defaults to COSINE
156+
.schemaName("mydb") // Optional: defaults to null
157+
.vectorTableName("custom_vectors") // Optional: defaults to "vector_store"
158+
.contentFieldName("text") // Optional: defaults to "content"
159+
.embeddingFieldName("embedding") // Optional: defaults to "embedding"
160+
.idFieldName("doc_id") // Optional: defaults to "id"
161+
.metadataFieldName("meta") // Optional: defaults to "metadata"
162+
.initializeSchema(true) // Optional: defaults to false
163+
.schemaValidation(true) // Optional: defaults to false
164+
.removeExistingVectorStoreTable(false) // Optional: defaults to false
165+
.maxDocumentBatchSize(10000) // Optional: defaults to 10000
166+
.build();
167+
}
116168
117-
TIP: If you configure a custom schema and/or table name, consider enabling schema validation by setting `spring.ai.vectorstore.mariadb.schema-validation=true`.
118-
This ensures the correctness of the names and reduces the risk of SQL injection attacks.
169+
// This can be any EmbeddingModel implementation
170+
@Bean
171+
public EmbeddingModel embeddingModel() {
172+
return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("OPENAI_API_KEY")));
173+
}
174+
----
119175

120-
== Metadata filtering
176+
== Metadata Filtering
121177

122-
You can leverage the generic, portable link:https://docs.spring.io/spring-ai/reference/api/vectordbs.html#_metadata_filters[metadata filters] with the MariaDB Vector store.
178+
You can leverage the generic, portable xref:api/vectordbs.adoc#metadata-filters[metadata filters] with MariaDB Vector store.
123179

124180
For example, you can use either the text expression language:
125181

126182
[source,java]
127183
----
128184
vectorStore.similaritySearch(
129185
SearchRequest.defaults()
130-
.withQuery("The World")
131-
.withTopK(TOP_K)
132-
.withSimilarityThreshold(SIMILARITY_THRESHOLD)
133-
.withFilterExpression("author in ['john', 'jill'] && article_type == 'blog'"));
186+
.withQuery("The World")
187+
.withTopK(TOP_K)
188+
.withSimilarityThreshold(SIMILARITY_THRESHOLD)
189+
.withFilterExpression("author in ['john', 'jill'] && article_type == 'blog'"));
134190
----
135191

136192
or programmatically using the `Filter.Expression` DSL:
@@ -144,44 +200,8 @@ vectorStore.similaritySearch(SearchRequest.defaults()
144200
.withTopK(TOP_K)
145201
.withSimilarityThreshold(SIMILARITY_THRESHOLD)
146202
.withFilterExpression(b.and(
147-
b.in("author","john", "jill"),
203+
b.in("author", "john", "jill"),
148204
b.eq("article_type", "blog")).build()));
149205
----
150206

151-
NOTE: These filter expressions are converted into the equivalent PgVector filters.
152-
153-
== Manual Configuration
154-
155-
Instead of using the Spring Boot auto-configuration, you can manually configure the `MariaDBVectorStore`.
156-
For this you need to add the MariaDB connector and `JdbcTemplate` auto-configuration dependencies to your project:
157-
158-
[source,xml]
159-
----
160-
<dependency>
161-
<groupId>org.springframework.boot</groupId>
162-
<artifactId>spring-boot-starter-jdbc</artifactId>
163-
</dependency>
164-
165-
<dependency>
166-
<groupId>org.mariadb.jdbc</groupId>
167-
<artifactId>mariadb-java-client</artifactId>
168-
<scope>runtime</scope>
169-
</dependency>
170-
171-
<dependency>
172-
<groupId>org.springframework.ai</groupId>
173-
<artifactId>spring-ai-mariadb-store</artifactId>
174-
</dependency>
175-
----
176-
177-
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
178-
179-
To configure MariaDB Vector in your application, you can use the following setup:
180-
181-
[source,java]
182-
----
183-
@Bean
184-
public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel embeddingModel) {
185-
return new MariaDBVectorStore(jdbcTemplate, embeddingModel);
186-
}
187-
----
207+
NOTE: These filter expressions are automatically converted into the equivalent MariaDB JSON path expressions.
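
The property table in the diff above notes that `EUCLIDEAN` can be used when vectors are normalized to length 1. One way to see why the two metrics are interchangeable in that case: for unit-length vectors, the squared Euclidean distance equals `2 - 2 * cosineSimilarity`, so both metrics rank neighbours in the same order. The following is a small plain-Java sketch of that identity. The class and helper names are hypothetical and are not part of Spring AI or MariaDB, and the sketch says nothing about the performance claim itself.

```java
// Hypothetical helper class for illustration; not a Spring AI or MariaDB API.
final class DistanceSketch {

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Scales a vector to length 1.
    static double[] normalize(double[] v) {
        double norm = Math.sqrt(dot(v, v));
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    static double cosineSimilarity(double[] a, double[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] query = normalize(new double[] {1.0, 2.0, 2.0});
        double[] doc = normalize(new double[] {2.0, 1.0, 2.0});

        double cos = cosineSimilarity(query, doc);
        double squared = squaredEuclidean(query, doc);

        // For unit vectors the two printed distance values agree up to rounding.
        System.out.printf("cosine=%.6f squaredEuclidean=%.6f twoMinusTwoCos=%.6f%n",
                cos, squared, 2.0 - 2.0 * cos);
    }
}
```

Because the relationship is monotonic, a higher cosine similarity always corresponds to a smaller Euclidean distance once the embeddings are normalized.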

spring-ai-spring-boot-autoconfigure/src/main/java/org/springframework/ai/autoconfigure/vectorstore/mariadb/MariaDbStoreAutoConfiguration.java

Lines changed: 17 additions & 15 deletions
@@ -57,21 +57,23 @@ public MariaDBVectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel
5757

5858
var initializeSchema = properties.isInitializeSchema();
5959

60-
return new MariaDBVectorStore.Builder(jdbcTemplate, embeddingModel).withSchemaName(properties.getSchemaName())
61-
.withVectorTableName(properties.getTableName())
62-
.withVectorTableValidationsEnabled(properties.isSchemaValidation())
63-
.withDimensions(properties.getDimensions())
64-
.withDistanceType(properties.getDistanceType())
65-
.withContentFieldName(properties.getContentFieldName())
66-
.withEmbeddingFieldName(properties.getEmbeddingFieldName())
67-
.withIdFieldName(properties.getIdFieldName())
68-
.withMetadataFieldName(properties.getMetadataFieldName())
69-
.withRemoveExistingVectorStoreTable(properties.isRemoveExistingVectorStoreTable())
70-
.withInitializeSchema(initializeSchema)
71-
.withObservationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
72-
.withSearchObservationConvention(customObservationConvention.getIfAvailable(() -> null))
73-
.withBatchingStrategy(batchingStrategy)
74-
.withMaxDocumentBatchSize(properties.getMaxDocumentBatchSize())
60+
return MariaDBVectorStore.builder(jdbcTemplate)
61+
.embeddingModel(embeddingModel)
62+
.schemaName(properties.getSchemaName())
63+
.vectorTableName(properties.getTableName())
64+
.schemaValidation(properties.isSchemaValidation())
65+
.dimensions(properties.getDimensions())
66+
.distanceType(properties.getDistanceType())
67+
.contentFieldName(properties.getContentFieldName())
68+
.embeddingFieldName(properties.getEmbeddingFieldName())
69+
.idFieldName(properties.getIdFieldName())
70+
.metadataFieldName(properties.getMetadataFieldName())
71+
.removeExistingVectorStoreTable(properties.isRemoveExistingVectorStoreTable())
72+
.initializeSchema(initializeSchema)
73+
.observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
74+
.customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
75+
.batchingStrategy(batchingStrategy)
76+
.maxDocumentBatchSize(properties.getMaxDocumentBatchSize())
7577
.build();
7678
}
7779
