Commit d3d34c9

sobychacko authored and markpollack committed
Add builder pattern to OpenSearchVectorStore and refactor package name
Add builder pattern to OpenSearchVectorStore

Introduces a builder pattern for OpenSearchVectorStore configuration and refactors the package structure to org.springframework.ai.vectorstore.opensearch for better organization and consistency with other vector stores.

The builder pattern improves usability by:

* Providing a fluent API for configuring store instances
* Making configuration options more discoverable through method names
* Enabling better validation of configuration parameters
* Supporting optional parameters with sensible defaults

The package refactoring aligns with the project's standard package naming conventions and improves code organization.

All constructors are deprecated in favor of the new builder pattern to guide users toward the preferred configuration approach.
1 parent f69d879 commit d3d34c9
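
The usability points in the commit message (fluent configuration, validation, optional parameters with defaults) can be illustrated with a minimal, self-contained sketch. `DemoVectorStore`, its `endpoint` field, and `describe()` are hypothetical names used only for illustration; this is not the `OpenSearchVectorStore` implementation from this commit. The default values mirror the ones documented in the diff below.

```java
import java.util.Objects;

// Hypothetical stand-in used to illustrate the builder pattern; NOT the Spring AI class.
public final class DemoVectorStore {

    private final String endpoint;            // required (assumed for this sketch)
    private final String index;               // optional, has a default
    private final String similarityFunction;  // optional, has a default
    private final boolean initializeSchema;   // optional, has a default

    private DemoVectorStore(Builder builder) {
        this.endpoint = builder.endpoint;
        this.index = builder.index;
        this.similarityFunction = builder.similarityFunction;
        this.initializeSchema = builder.initializeSchema;
    }

    public static Builder builder() {
        return new Builder();
    }

    public String describe() {
        return endpoint + "/" + index + " [" + similarityFunction
                + ", initializeSchema=" + initializeSchema + "]";
    }

    public static final class Builder {

        private String endpoint;
        private String index = "spring-ai-document-index";
        private String similarityFunction = "cosinesimil";
        private boolean initializeSchema = false;

        private Builder() {
        }

        public Builder endpoint(String endpoint) {
            this.endpoint = endpoint;
            return this;
        }

        public Builder index(String index) {
            // Validation happens at the point where the option is set.
            if (index == null || index.isBlank()) {
                throw new IllegalArgumentException("index must not be blank");
            }
            this.index = index;
            return this;
        }

        public Builder similarityFunction(String similarityFunction) {
            this.similarityFunction = Objects.requireNonNull(similarityFunction);
            return this;
        }

        public Builder initializeSchema(boolean initializeSchema) {
            this.initializeSchema = initializeSchema;
            return this;
        }

        public DemoVectorStore build() {
            // Required parameters are checked once, when the object is created.
            Objects.requireNonNull(endpoint, "endpoint must be set");
            return new DemoVectorStore(this);
        }
    }

    public static void main(String[] args) {
        DemoVectorStore store = DemoVectorStore.builder()
                .endpoint("http://localhost:9200")
                .similarityFunction("l2")
                .build();
        System.out.println(store.describe());
    }
}
```

Callers set only the options they care about, and a missing required value fails at `build()` rather than somewhere later at runtime.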

11 files changed, +485 -190 lines changed

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/opensearch.adoc

Lines changed: 129 additions & 135 deletions
@@ -1,50 +1,89 @@
 = OpenSearch

-This section guides you through setting up the OpenSearch `VectorStore` to store document embeddings and perform similarity searches.
+This section walks you through setting up `OpenSearchVectorStore` to store document embeddings and perform similarity searches.

-link:https://opensearch.org[OpenSearch] is an open-source search and analytics engine originally forked from Elasticsearch, distributed under the Apache License 2.0. It enhances AI application development by simplifying the integration and management of AI-generated assets. OpenSearch supports vector, lexical, and hybrid search capabilities, leveraging advanced vector database functionalities to facilitate low-latency queries and similarity searches as detailed on the link:https://opensearch.org/platform/search/vector-database.html[vector database page]. This platform is ideal for building scalable AI-driven applications and offers robust tools for data management, fault tolerance, and resource access controls.
+link:https://opensearch.org[OpenSearch] is an open-source search and analytics engine originally forked from Elasticsearch, distributed under the Apache License 2.0. It enhances AI application development by simplifying the integration and management of AI-generated assets. OpenSearch supports vector, lexical, and hybrid search capabilities, leveraging advanced vector database functionalities to facilitate low-latency queries and similarity searches as detailed on the link:https://opensearch.org/platform/search/vector-database.html[vector database page].
+
+The link:https://opensearch.org/docs/latest/search-plugins/knn/index/[OpenSearch k-NN] functionality allows users to query vector embeddings from large datasets. An embedding is a numerical representation of a data object, such as text, image, audio, or document. Embeddings can be stored in the index and queried using various similarity functions.

 == Prerequisites

 * A running OpenSearch instance. The following options are available:
 ** link:https://opensearch.org/docs/latest/opensearch/install/index/[Self-Managed OpenSearch]
 ** link:https://docs.aws.amazon.com/opensearch-service/[Amazon OpenSearch Service]
-* `EmbeddingModel` instance to compute the document embeddings. Several options are available:
-- If required, an API key for the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] to generate the
-embeddings stored by the `OpenSearchVectorStore`.
+* If required, an API key for the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] to generate the embeddings stored by the `OpenSearchVectorStore`.

-== Dependencies
+== Auto-configuration

-Add the OpenSearch Vector Store dependency to your project:
+Spring AI provides Spring Boot auto-configuration for the OpenSearch Vector Store.
+To enable it, add the following dependency to your project's Maven `pom.xml` file:

-[tabs]
-======
-Maven::
-+
 [source,xml]
 ----
 <dependency>
     <groupId>org.springframework.ai</groupId>
-    <artifactId>spring-ai-opensearch-store</artifactId>
+    <artifactId>spring-ai-opensearch-store-spring-boot-starter</artifactId>
 </dependency>
 ----

-Gradle::
-+
+or to your Gradle `build.gradle` build file:
+
 [source,groovy]
 ----
 dependencies {
-    implementation 'org.springframework.ai:spring-ai-opensearch-store'
+    implementation 'org.springframework.ai:spring-ai-opensearch-store-spring-boot-starter'
 }
 ----
-======

 TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.

-== Configuration
+For Amazon OpenSearch Service, use these dependencies instead:
+
+[source,xml]
+----
+<dependency>
+    <groupId>org.springframework.ai</groupId>
+    <artifactId>spring-ai-aws-opensearch-store-spring-boot-starter</artifactId>
+</dependency>
+----
+
+or for Gradle:
+
+[source,groovy]
+----
+dependencies {
+    implementation 'org.springframework.ai:spring-ai-aws-opensearch-store-spring-boot-starter'
+}
+----
+
+Please have a look at the list of xref:#_configuration_properties[configuration parameters] for the vector store to learn about the default values and configuration options.
+
+Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
+
+Now you can auto-wire the `OpenSearchVectorStore` as a vector store in your application:
+
+[source,java]
+----
+@Autowired VectorStore vectorStore;
+
+// ...
+
+List<Document> documents = List.of(
+    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
+    new Document("The World is Big and Salvation Lurks Around the Corner"),
+    new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
+
+// Add the documents to OpenSearch
+vectorStore.add(documents);
+
+// Retrieve documents similar to a query
+List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
+----
+
+=== Configuration Properties

 To connect to OpenSearch and use the `OpenSearchVectorStore`, you need to provide access details for your instance.
-A simple configuration can either be provided via Spring Boot's `application.yml`,
+A simple configuration can be provided via Spring Boot's `application.yml`:

 [source,yaml]
 ----
@@ -55,181 +94,136 @@ spring:
         uris: <opensearch instance URIs>
         username: <opensearch username>
         password: <opensearch password>
-        indexName: <opensearch index name>
-        mappingJson: <JSON mapping for opensearch index>
-        aws:
+        index-name: spring-ai-document-index
+        initialize-schema: true
+        similarity-function: cosinesimil
+        batching-strategy: TOKEN_COUNT
+        aws: # Only for Amazon OpenSearch Service
           host: <aws opensearch host>
-          serviceName: <aws service name>
-          accessKey: <aws access key>
-          secretKey: <aws secret key>
+          service-name: <aws service name>
+          access-key: <aws access key>
+          secret-key: <aws secret key>
           region: <aws region>
-    # API key if needed, e.g. OpenAI
-    openai:
-      apiKey: <api-key>
 ----
-TIP: Check the list of xref:#_configuration_properties[configuration parameters] to learn about the default values and configuration options.
-
-== Auto-configuration
-
-=== Self-Managed OpenSearch

-Spring AI provides Spring Boot auto-configuration for the OpenSearch Vector Store.
-To enable it, add the following dependency to your project's Maven `pom.xml` or Gradle `build.gradle` build files:
+Properties starting with `spring.ai.vectorstore.opensearch.*` are used to configure the `OpenSearchVectorStore`:

-[tabs]
-======
-Maven::
-+
-[source,xml]
-----
-<dependency>
-    <groupId>org.springframework.ai</groupId>
-    <artifactId>spring-ai-opensearch-store-spring-boot-starter</artifactId>
-</dependency>
-----
+[cols="2,5,1",stripes=even]
+|===
+|Property | Description | Default Value
+
+|`spring.ai.vectorstore.opensearch.uris`| URIs of the OpenSearch cluster endpoints | -
+|`spring.ai.vectorstore.opensearch.username`| Username for accessing the OpenSearch cluster | -
+|`spring.ai.vectorstore.opensearch.password`| Password for the specified username | -
+|`spring.ai.vectorstore.opensearch.index-name`| Name of the index to store vectors | `spring-ai-document-index`
+|`spring.ai.vectorstore.opensearch.initialize-schema`| Whether to initialize the required schema | `false`
+|`spring.ai.vectorstore.opensearch.similarity-function`| The similarity function to use | `cosinesimil`
+|`spring.ai.vectorstore.opensearch.batching-strategy`| Strategy for batching documents when calculating embeddings. Options are `TOKEN_COUNT` or `FIXED_SIZE` | `TOKEN_COUNT`
+|`spring.ai.vectorstore.opensearch.aws.host`| Hostname of the OpenSearch instance | -
+|`spring.ai.vectorstore.opensearch.aws.service-name`| AWS service name | -
+|`spring.ai.vectorstore.opensearch.aws.access-key`| AWS access key | -
+|`spring.ai.vectorstore.opensearch.aws.secret-key`| AWS secret key | -
+|`spring.ai.vectorstore.opensearch.aws.region`| AWS region | -
+|===

-Gradle::
-+
-[source,groovy]
-----
-dependencies {
-    implementation 'org.springframework.ai:spring-ai-opensearch-store-spring-boot-starter'
-}
-----
-======
+The following similarity functions are available:

-Then use the `spring.ai.vectorstore.opensearch.*` properties to configure the connection to the self-managed OpenSearch instance.
+* `cosinesimil` - Default, suitable for most use cases. Measures cosine similarity between vectors.
+* `l1` - Manhattan distance between vectors.
+* `l2` - Euclidean distance between vectors.
+* `linf` - Chebyshev distance between vectors.

-=== Amazon OpenSearch Service
+== Manual Configuration

-To enable Amazon OpenSearch Service., add the following dependency to your project's Maven `pom.xml` or Gradle `build.gradle` build files:
+Instead of using the Spring Boot auto-configuration, you can manually configure the OpenSearch vector store. For this you need to add the `spring-ai-opensearch-store` to your project:

-[tabs]
-======
-Maven::
-+
 [source,xml]
 ----
 <dependency>
     <groupId>org.springframework.ai</groupId>
-    <artifactId>spring-ai-aws-opensearch-store-spring-boot-starter</artifactId>
+    <artifactId>spring-ai-opensearch-store</artifactId>
 </dependency>
 ----

-Gradle::
-+
+or to your Gradle `build.gradle` build file:
+
 [source,groovy]
 ----
 dependencies {
-    implementation 'org.springframework.ai:spring-ai-aws-opensearch-store-spring-boot-starter'
+    implementation 'org.springframework.ai:spring-ai-opensearch-store'
 }
 ----
-======
-
-Then use the `spring.ai.vectorstore.opensearch.aws.*` properties to configure the connection to the Amazon OpenSearch Service.

 TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.

-Here is an example of the needed bean:
+Create an OpenSearch client bean:

 [source,java]
 ----
 @Bean
-public EmbeddingModel embeddingModel() {
-    // Can be any other EmbeddingModel implementation
-    return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY")));
+public OpenSearchClient openSearchClient() {
+    RestClient restClient = RestClient.builder(
+        HttpHost.create("http://localhost:9200"))
+        .build();
+
+    return new OpenSearchClient(new RestClientTransport(
+        restClient, new JacksonJsonpMapper()));
 }
 ----

-Now you can auto-wire the `OpenSearchVectorStore` as a vector store in your application.
+Then create the `OpenSearchVectorStore` bean using the builder pattern:

 [source,java]
 ----
-@Autowired VectorStore vectorStore;
-
-// ...
-
-List <Document> documents = List.of(
-    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
-    new Document("The World is Big and Salvation Lurks Around the Corner"),
-    new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
-
-// Add the documents to OpenSearch
-vectorStore.add(List.of(document));
-
-// Retrieve documents similar to a query
-List<Document> results = this.vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
-----
-
-=== Configuration properties
-
-You can use the following properties in your Spring Boot configuration to customize the OpenSearch vector store.
-
-[cols="2,5,1",stripes=even]
-|===
-|Property| Description | Default value
-
-|`spring.ai.vectorstore.opensearch.uris`| URIs of the OpenSearch cluster endpoints. | -
-|`spring.ai.vectorstore.opensearch.username`| Username for accessing the OpenSearch cluster. | -
-|`spring.ai.vectorstore.opensearch.password`| Password for the specified username. | -
-|`spring.ai.vectorstore.opensearch.indexName`| Name of the default index to be used within the OpenSearch cluster. | `spring-ai-document-index`
-|`spring.ai.vectorstore.opensearch.mappingJson`| JSON string defining the mapping for the index; specifies how documents and their
-fields are stored and indexed. Refer link:https://opensearch.org/docs/latest/search-plugins/vector-search/[here] for some sample configurations |
-{
-    "properties":{
-        "embedding":{
-            "type":"knn_vector",
-            "dimension":1536
-        }
-    }
+@Bean
+public VectorStore vectorStore(OpenSearchClient openSearchClient, EmbeddingModel embeddingModel) {
+    return OpenSearchVectorStore.builder()
+        .openSearchClient(openSearchClient)
+        .embeddingModel(embeddingModel)
+        .index("custom-index") // Optional: defaults to "spring-ai-document-index"
+        .similarityFunction("l2") // Optional: defaults to "cosinesimil"
+        .initializeSchema(true) // Optional: defaults to false
+        .batchingStrategy(new TokenCountBatchingStrategy()) // Optional: defaults to TokenCountBatchingStrategy
+        .build();
 }
-|`spring.ai.vectorstore.opensearch.aws.host`| Hostname of the OpenSearch instance. | -
-|`spring.ai.vectorstore.opensearch.aws.serviceName`| AWS service name for the OpenSearch instance. | -
-|`spring.ai.vectorstore.opensearch.aws.accessKey`| AWS access key for the OpenSearch instance. | -
-|`spring.ai.vectorstore.opensearch.aws.secretKey`| AWS secret key for the OpenSearch instance. | -
-|`spring.ai.vectorstore.opensearch.aws.region`| AWS region for the OpenSearch instance. | -
-|===
-
-=== Customizing OpenSearch Client Configuration

-In cases where the Spring Boot auto-configured OpenSearchClient with `Apache HttpClient 5 Transport` bean is not what
-you want or need, you can still define your own bean.
-Please read the link:https://opensearch.org/docs/latest/clients/java/[OpenSearch Java Client Documentation]
+// This can be any EmbeddingModel implementation
+@Bean
+public EmbeddingModel embeddingModel() {
+    return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("OPENAI_API_KEY")));
+}
+----

 == Metadata Filtering

 You can leverage the generic, portable xref:api/vectordbs.adoc#metadata-filters[metadata filters] with OpenSearch as well.

 For example, you can use either the text expression language:

-[tabs]
-======
-SQL filter syntax::
-+
 [source,java]
 ----
-vectorStore.similaritySearch(SearchRequest.defaults()
+vectorStore.similaritySearch(
+    SearchRequest.defaults()
     .withQuery("The World")
     .withTopK(TOP_K)
     .withSimilarityThreshold(SIMILARITY_THRESHOLD)
     .withFilterExpression("author in ['john', 'jill'] && 'article_type' == 'blog'"));
 ----

-`Filter.Expression` DSL::
-+
+or programmatically using the `Filter.Expression` DSL:
+
 [source,java]
 ----
 FilterExpressionBuilder b = new FilterExpressionBuilder();

 vectorStore.similaritySearch(SearchRequest.defaults()
-    .withQuery("The World")
-    .withTopK(TOP_K)
-    .withSimilarityThreshold(SIMILARITY_THRESHOLD)
-    .withFilterExpression(b.and(
-        b.in("john", "jill"),
-        b.eq("article_type", "blog")).build()));
+    .withQuery("The World")
+    .withTopK(TOP_K)
+    .withSimilarityThreshold(SIMILARITY_THRESHOLD)
+    .withFilterExpression(b.and(
+        b.in("author", "john", "jill"),
+        b.eq("article_type", "blog")).build()));
 ----
-======

 NOTE: Those (portable) filter expressions get automatically converted into the proprietary OpenSearch link:https://opensearch.org/docs/latest/query-dsl/full-text/query-string/[Query string query].

spring-ai-spring-boot-autoconfigure/src/main/java/org/springframework/ai/autoconfigure/vectorstore/opensearch/OpenSearchVectorStoreAutoConfiguration.java

Lines changed: 12 additions & 4 deletions
@@ -40,7 +40,7 @@
 import org.springframework.ai.embedding.BatchingStrategy;
 import org.springframework.ai.embedding.EmbeddingModel;
 import org.springframework.ai.embedding.TokenCountBatchingStrategy;
-import org.springframework.ai.vectorstore.OpenSearchVectorStore;
+import org.springframework.ai.vectorstore.opensearch.OpenSearchVectorStore;
 import org.springframework.ai.vectorstore.observation.VectorStoreObservationConvention;
 import org.springframework.beans.factory.ObjectProvider;
 import org.springframework.boot.autoconfigure.AutoConfiguration;
@@ -78,9 +78,17 @@ OpenSearchVectorStore vectorStore(OpenSearchVectorStoreProperties properties, Op
         var indexName = Optional.ofNullable(properties.getIndexName()).orElse(OpenSearchVectorStore.DEFAULT_INDEX_NAME);
         var mappingJson = Optional.ofNullable(properties.getMappingJson())
             .orElse(OpenSearchVectorStore.DEFAULT_MAPPING_EMBEDDING_TYPE_KNN_VECTOR_DIMENSION);
-        return new OpenSearchVectorStore(indexName, openSearchClient, embeddingModel, mappingJson,
-                properties.isInitializeSchema(), observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP),
-                customObservationConvention.getIfAvailable(() -> null), batchingStrategy);
+
+        return OpenSearchVectorStore.builder()
+            .index(indexName)
+            .openSearchClient(openSearchClient)
+            .embeddingModel(embeddingModel)
+            .mappingJson(mappingJson)
+            .initializeSchema(properties.isInitializeSchema())
+            .observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
+            .customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
+            .batchingStrategy(batchingStrategy)
+            .build();
     }

     @Configuration(proxyBeanMethods = false)
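
The four similarity functions listed in the `opensearch.adoc` changes (`cosinesimil`, `l1`, `l2`, `linf`) correspond to standard vector measures. The sketch below computes the raw mathematical quantities in plain Java, as an illustration only. `SimilaritySketch` and its method names are hypothetical, and OpenSearch itself may transform these values into relevance scores differently, so treat this as an aid to intuition and not as the engine's scoring code.

```java
// Hypothetical illustration of the measures behind the documented similarity function names.
public final class SimilaritySketch {

    private SimilaritySketch() {
    }

    // Cosine similarity: dot product divided by the product of the vector norms.
    public static double cosine(double[] a, double[] b) {
        checkSameLength(a, b);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // l1 (Manhattan) distance: sum of absolute coordinate differences.
    public static double l1(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    // l2 (Euclidean) distance: square root of the sum of squared differences.
    public static double l2(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // linf (Chebyshev) distance: largest absolute coordinate difference.
    public static double linf(double[] a, double[] b) {
        checkSameLength(a, b);
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
    }

    public static void main(String[] args) {
        double[] a = {3.0, 4.0};
        double[] b = {0.0, 0.0};
        System.out.println("l1=" + l1(a, b) + " l2=" + l2(a, b) + " linf=" + linf(a, b));
        System.out.println("cosine=" + cosine(new double[] {1.0, 0.0}, new double[] {0.0, 1.0}));
    }
}
```

Cosine similarity depends only on the angle between vectors, while the three distance measures also depend on their magnitudes, which is one reason a cosine-based default suits embeddings whose lengths are not normalized.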
