Commit fc1f92d

sobychacko authored and markpollack committed
Add builder pattern and refactor Elasticsearch store package
The changes introduce a fluent builder pattern for ElasticsearchVectorStore configuration, making it easier to create and customize instances with optional parameters. All Elasticsearch-related classes are moved to a dedicated elasticsearch package for better organization.

Key changes:

* Add ElasticsearchVectorStore.builder() with comprehensive options
* Move classes to org.springframework.ai.vectorstore.elasticsearch package
* Deprecate old constructors in favor of builder pattern
* Add support for configurable batching strategies
* Enhance documentation with usage examples and best practices
1 parent 677a18e commit fc1f92d
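
For readers unfamiliar with the pattern named in the commit message, the following is a minimal, self-contained sketch of a fluent builder with optional parameters. Every name in it (`SketchVectorStore`, `client`, the accessor methods) is a hypothetical stand-in and not the actual `ElasticsearchVectorStore` implementation; only the two defaults (`spring-ai-document-index` and `initializeSchema = false`) are taken from the documentation changed in this commit.

```java
import java.util.Objects;

// Hypothetical stand-in for a store configured through a fluent builder.
// This is NOT the real ElasticsearchVectorStore; it only mirrors the shape of the API.
final class SketchVectorStore {

    private final String client;            // stands in for the required RestClient
    private final String indexName;         // optional, has a default
    private final boolean initializeSchema; // optional, has a default

    private SketchVectorStore(Builder builder) {
        this.client = builder.client;
        this.indexName = builder.indexName;
        this.initializeSchema = builder.initializeSchema;
    }

    static Builder builder() {
        return new Builder();
    }

    String client() {
        return client;
    }

    String indexName() {
        return indexName;
    }

    boolean initializeSchema() {
        return initializeSchema;
    }

    static final class Builder {

        private String client;
        // Defaults mirror the values documented in this commit.
        private String indexName = "spring-ai-document-index";
        private boolean initializeSchema = false;

        // Each setter returns the builder so that calls can be chained.
        Builder client(String client) {
            this.client = client;
            return this;
        }

        Builder indexName(String indexName) {
            this.indexName = indexName;
            return this;
        }

        Builder initializeSchema(boolean initializeSchema) {
            this.initializeSchema = initializeSchema;
            return this;
        }

        // Required parameters are validated once, at build time.
        SketchVectorStore build() {
            Objects.requireNonNull(client, "client must be set");
            return new SketchVectorStore(this);
        }
    }
}
```

With this shape, `SketchVectorStore.builder().client("c").build()` picks up every default, while a caller that needs a custom index adds `.indexName("custom-index")` without having to choose among overloaded constructors.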

12 files changed, with 285 additions and 83 deletions.

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/elasticsearch.adoc

Lines changed: 36 additions & 44 deletions
@@ -76,14 +76,11 @@ Alternatively you can opt-out the initialization and create the index manually u
 
 NOTE: this is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.
 
-
-
 Please have a look at the list of <<elasticsearchvector-properties,configuration parameters>> for the vector store to learn about the default values and configuration options.
 These properties can be also set by configuring the `ElasticsearchVectorStoreOptions` bean.
 
 Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
 
-
 Now you can auto-wire the `ElasticsearchVectorStore` as a vector store in your application.
 
 [source,java]
@@ -97,7 +94,7 @@ List <Document> documents = List.of(
     new Document("The World is Big and Salvation Lurks Around the Corner"),
     new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
 
-// Add the documents to Qdrant
+// Add the documents to Elasticsearch
 vectorStore.add(documents);
 
 // Retrieve documents similar to a query
@@ -117,34 +114,19 @@ spring:
     uris: <elasticsearch instance URIs>
     username: <elasticsearch username>
     password: <elasticsearch password>
-# API key if needed, e.g. OpenAI
   ai:
-    openai:
-      api:
-        key: <api-key>
-----
-
-environment variables,
-
-[source,bash]
-----
-export SPRING_ELASTICSEARCH_URIS=<elasticsearch instance URIs>
-export SPRING_ELASTICSEARCH_USERNAME=<elasticsearch username>
-export SPRING_ELASTICSEARCH_PASSWORD=<elasticsearch password>
-# API key if needed, e.g. OpenAI
-export SPRING_AI_OPENAI_API_KEY=<api-key>
+    vectorstore:
+      elasticsearch:
+        initialize-schema: true
+        index-name: custom-index
+        dimensions: 1536
+        similarity: cosine
+        batching-strategy: TOKEN_COUNT # Optional: Controls how documents are batched for embedding
 ----
 
-or can be a mix of those.
-For example, if you want to store your password as an environment variable but keep the rest in the plain `application.yml` file.
-
-NOTE: If you choose to create a shell script for ease in future work, be sure to run it prior to starting your application by "sourcing" the file, i.e. `source <your_script_name>.sh`.
-
-Spring Boot's auto-configuration feature for the Elasticsearch RestClient will create a bean instance that will be used by the `ElasticsearchVectorStore`.
-
 The Spring Boot properties starting with `spring.elasticsearch.*` are used to configure the Elasticsearch client:
 
-[stripes=even]
+[cols="2,5,1",stripes=even]
 |===
 |Property | Description | Default Value
 
@@ -160,23 +142,24 @@ The Spring Boot properties starting with `spring.elasticsearch.*` are used to co
 | `spring.elasticsearch.socket-timeout` | Socket timeout used when communicating with Elasticsearch. | `30s`
 |===
 
-Properties starting with the `spring.ai.vectorstore.elasticsearch.*` prefix are used to configure `ElasticsearchVectorStore`.
+Properties starting with `spring.ai.vectorstore.elasticsearch.*` are used to configure the `ElasticsearchVectorStore`:
 
-[stripes=even]
+[cols="2,5,1",stripes=even]
 |===
 |Property | Description | Default Value
 
-|`spring.ai.vectorstore.elasticsearch.initialize-schema`| Whether to initialize the required schema | `false`
-|`spring.ai.vectorstore.elasticsearch.index-name` | The name of the index to store the vectors. | spring-ai-document-index
-|`spring.ai.vectorstore.elasticsearch.dimensions` | The number of dimensions in the vector. | 1536
-|`spring.ai.vectorstore.elasticsearch.similarity` | The similarity function to use. | `cosine`
+|`spring.ai.vectorstore.elasticsearch.initialize-schema`| Whether to initialize the required schema | `false`
+|`spring.ai.vectorstore.elasticsearch.index-name` | The name of the index to store the vectors | `spring-ai-document-index`
+|`spring.ai.vectorstore.elasticsearch.dimensions` | The number of dimensions in the vector | `1536`
+|`spring.ai.vectorstore.elasticsearch.similarity` | The similarity function to use | `cosine`
+|`spring.ai.vectorstore.elasticsearch.batching-strategy` | Strategy for batching documents when calculating embeddings. Options are `TOKEN_COUNT` or `FIXED_SIZE` | `TOKEN_COUNT`
 |===
 
 The following similarity functions are available:
 
-* cosine
-* l2_norm
-* dot_product
+* `cosine` - Default, suitable for most use cases. Measures cosine similarity between vectors.
+* `l2_norm` - Euclidean distance between vectors. Lower values indicate higher similarity.
+* `dot_product` - Best performance for normalized vectors (e.g., OpenAI embeddings).
 
 More details about each in the https://www.elastic.co/guide/en/elasticsearch/reference/master/dense-vector.html#dense-vector-params[Elasticsearch Documentation] on dense vectors.
 
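
As a reviewer's aside on the three similarity functions documented in the hunk above, here is a small sketch of the underlying math. It is plain Java with no Elasticsearch dependency, the class and method names (`SimilaritySketch`, `dot`, `cosine`, `l2Norm`) are hypothetical, and it computes only the raw measures. How Elasticsearch turns those values into a relevance score is described in the linked dense-vector documentation and is not reproduced here.

```java
// Hypothetical helper illustrating the raw measures behind the documented options.
final class SimilaritySketch {

    // dot_product: sum of element-wise products.
    static double dot(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // cosine: dot product divided by the product of the vector magnitudes.
    static double cosine(double[] a, double[] b) {
        double magnitudes = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
        if (magnitudes == 0.0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dot(a, b) / magnitudes;
    }

    // l2_norm: Euclidean distance; lower values mean the vectors are closer.
    static double l2Norm(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same number of dimensions");
        }
    }
}
```

For unit-length vectors the cosine and the dot product coincide, which is why `dot_product` is described as a good fit for normalized embeddings: it skips the magnitude computation.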

@@ -206,7 +189,7 @@ vectorStore.similaritySearch(SearchRequest.defaults()
     .withTopK(TOP_K)
     .withSimilarityThreshold(SIMILARITY_THRESHOLD)
     .withFilterExpression(b.and(
-        b.in("john", "jill"),
+        b.in("author", "john", "jill"),
         b.eq("article_type", "blog")).build()));
 ----
 
@@ -247,35 +230,44 @@ dependencies {
 }
 ----
 
-
 Create an Elasticsearch `RestClient` bean.
 Read the link:https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/current/java-rest-low-usage-initialization.html[Elasticsearch Documentation] for more in-depth information about the configuration of a custom RestClient.
 
 [source,java]
 ----
 @Bean
 public RestClient restClient() {
-    RestClient.builder(new HttpHost("<host>", 9200, "http"))
+    return RestClient.builder(new HttpHost("<host>", 9200, "http"))
         .setDefaultHeaders(new Header[]{
             new BasicHeader("Authorization", "Basic <encoded username and password>")
         })
         .build();
 }
 ----
 
-and then create the `ElasticsearchVectorStore` bean:
+Then create the `ElasticsearchVectorStore` bean using the builder pattern:
 
 [source,java]
 ----
 @Bean
-public ElasticsearchVectorStore vectorStore(EmbeddingModel embeddingModel, RestClient restClient) {
-    return new ElasticsearchVectorStore( restClient, embeddingModel);
+public VectorStore vectorStore(RestClient restClient, EmbeddingModel embeddingModel) {
+    ElasticsearchVectorStoreOptions options = new ElasticsearchVectorStoreOptions();
+    options.setIndexName("custom-index"); // Optional: defaults to "spring-ai-document-index"
+    options.setSimilarity(COSINE); // Optional: defaults to COSINE
+    options.setDimensions(1536); // Optional: defaults to model dimensions or 1536
+
+    return ElasticsearchVectorStore.builder()
+        .restClient(restClient)
+        .embeddingModel(embeddingModel)
+        .options(options) // Optional: use custom options
+        .initializeSchema(true) // Optional: defaults to false
+        .batchingStrategy(new TokenCountBatchingStrategy()) // Optional: defaults to TokenCountBatchingStrategy
+        .build();
 }
 
-// This can be any EmbeddingModel implementation.
+// This can be any EmbeddingModel implementation
 @Bean
 public EmbeddingModel embeddingModel() {
     return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("OPENAI_API_KEY")));
 }
 ----
-

spring-ai-spring-boot-autoconfigure/src/main/java/org/springframework/ai/autoconfigure/vectorstore/elasticsearch/ElasticsearchVectorStoreAutoConfiguration.java

Lines changed: 11 additions & 5 deletions
@@ -22,8 +22,8 @@
 import org.springframework.ai.embedding.BatchingStrategy;
 import org.springframework.ai.embedding.EmbeddingModel;
 import org.springframework.ai.embedding.TokenCountBatchingStrategy;
-import org.springframework.ai.vectorstore.ElasticsearchVectorStore;
-import org.springframework.ai.vectorstore.ElasticsearchVectorStoreOptions;
+import org.springframework.ai.vectorstore.elasticsearch.ElasticsearchVectorStore;
+import org.springframework.ai.vectorstore.elasticsearch.ElasticsearchVectorStoreOptions;
 import org.springframework.ai.vectorstore.observation.VectorStoreObservationConvention;
 import org.springframework.beans.factory.ObjectProvider;
 import org.springframework.boot.autoconfigure.AutoConfiguration;
@@ -73,9 +73,15 @@ ElasticsearchVectorStore vectorStore(ElasticsearchVectorStoreProperties properti
             elasticsearchVectorStoreOptions.setSimilarity(properties.getSimilarity());
         }
 
-        return new ElasticsearchVectorStore(elasticsearchVectorStoreOptions, restClient, embeddingModel,
-                properties.isInitializeSchema(), observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP),
-                customObservationConvention.getIfAvailable(() -> null), batchingStrategy);
+        return ElasticsearchVectorStore.builder()
+            .restClient(restClient)
+            .options(elasticsearchVectorStoreOptions)
+            .embeddingModel(embeddingModel)
+            .initializeSchema(properties.isInitializeSchema())
+            .observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
+            .customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
+            .batchingStrategy(batchingStrategy)
+            .build();
     }
 
 }
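
The `batchingStrategy` handed to the builder in the hunk above decides how documents are grouped before their embeddings are calculated. As a rough illustration of the token-count idea only, here is a self-contained sketch that greedily packs documents into batches under a token budget. It is an assumption-laden stand-in, not Spring AI's `TokenCountBatchingStrategy`: the class name `TokenBudgetBatcher`, the whitespace-based token estimate, and the `maxTokensPerBatch` parameter are all hypothetical, and a real strategy would use a proper tokenizer and the embedding model's actual input limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of token-budget batching; not the Spring AI implementation.
final class TokenBudgetBatcher {

    // Crude token estimate: counts whitespace-separated words (assumption for the sketch).
    static int estimateTokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Greedily fills each batch, in input order, until the next document would exceed the budget.
    static List<List<String>> batch(List<String> documents, int maxTokensPerBatch) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentTokens = 0;
        for (String document : documents) {
            int tokens = estimateTokens(document);
            if (tokens > maxTokensPerBatch) {
                throw new IllegalArgumentException("document exceeds the token budget of a single batch");
            }
            if (currentTokens + tokens > maxTokensPerBatch) {
                // Close the current batch and start a new one with this document.
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(document);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

A fixed-size strategy would instead cut the list every N documents regardless of their length, which is simpler but can exceed a model's input limit when documents are long.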

spring-ai-spring-boot-autoconfigure/src/main/java/org/springframework/ai/autoconfigure/vectorstore/elasticsearch/ElasticsearchVectorStoreProperties.java

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
 package org.springframework.ai.autoconfigure.vectorstore.elasticsearch;
 
 import org.springframework.ai.autoconfigure.vectorstore.CommonVectorStoreProperties;
-import org.springframework.ai.vectorstore.SimilarityFunction;
+import org.springframework.ai.vectorstore.elasticsearch.SimilarityFunction;
 import org.springframework.boot.context.properties.ConfigurationProperties;
 
 /**

spring-ai-spring-boot-autoconfigure/src/test/java/org/springframework/ai/autoconfigure/vectorstore/elasticsearch/ElasticsearchVectorStoreAutoConfigurationIT.java

Lines changed: 2 additions & 2 deletions
@@ -33,9 +33,9 @@
 import org.springframework.ai.autoconfigure.retry.SpringAiRetryAutoConfiguration;
 import org.springframework.ai.document.Document;
 import org.springframework.ai.observation.conventions.VectorStoreProvider;
-import org.springframework.ai.vectorstore.ElasticsearchVectorStore;
+import org.springframework.ai.vectorstore.elasticsearch.ElasticsearchVectorStore;
 import org.springframework.ai.vectorstore.SearchRequest;
-import org.springframework.ai.vectorstore.SimilarityFunction;
+import org.springframework.ai.vectorstore.elasticsearch.SimilarityFunction;
 import org.springframework.ai.vectorstore.observation.VectorStoreObservationContext;
 import org.springframework.boot.autoconfigure.AutoConfigurations;
 import org.springframework.boot.autoconfigure.elasticsearch.ElasticsearchRestClientAutoConfiguration;
Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
  * limitations under the License.
  */
 
-package org.springframework.ai.vectorstore;
+package org.springframework.ai.vectorstore.elasticsearch;
 
 import java.text.ParseException;
 import java.text.SimpleDateFormat;
