Commit 9d19181

MariaDBVectorStore similarity score reference docs
Auto-cherry-pick to 1.0.x Signed-off-by: Soby Chacko <[email protected]>
1 parent 4e486c1 commit 9d19181

spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs/mariadb.adoc

Lines changed: 72 additions & 2 deletions
@@ -47,7 +47,8 @@ The vector store implementation can initialize the required schema for you, but

 NOTE: This is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.

-Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
+Additionally, you will need a configured `EmbeddingModel` bean.
+Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.

 For example, to use the xref:api/embeddings/openai-embeddings.adoc[OpenAI EmbeddingModel], add the following dependency:

@@ -126,7 +127,8 @@ This ensures the correctness of the names and reduces the risk of SQL injection

 == Manual Configuration

-Instead of using the Spring Boot auto-configuration, you can manually configure the MariaDB vector store. For this you need to add the following dependencies to your project:
+Instead of using the Spring Boot auto-configuration, you can manually configure the MariaDB vector store.
+For this you need to add the following dependencies to your project:

 [source,xml]
 ----
@@ -211,6 +213,74 @@ vectorStore.similaritySearch(SearchRequest.builder()

 NOTE: These filter expressions are automatically converted into the equivalent MariaDB JSON path expressions.

+== Similarity Scores
+
+The MariaDB Vector Store automatically calculates similarity scores for documents returned from similarity searches.
+These scores provide a normalized measure of how closely each document matches your search query.
+
+=== Score Calculation
+
+Similarity scores are calculated using the formula `score = 1.0 - distance`, where:
+
+* Score: A value between `0.0` and `1.0`, where `1.0` indicates perfect similarity and `0.0` indicates no similarity
+* Distance: The raw distance value calculated using the configured distance type (`COSINE` or `EUCLIDEAN`)
+
+This means that documents with smaller distances (more similar) will have higher scores, making the results more intuitive to interpret.
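
To illustrate the formula in the added text, here is a minimal sketch in plain Java. The class and method names (`ScoreSketch`, `toScore`) are hypothetical and are not part of the Spring AI API. The sketch only restates `score = 1.0 - distance` as the documentation describes it, and does not claim to reproduce the store's actual implementation.

```java
public class ScoreSketch {

    // Hypothetical helper: applies the documented formula score = 1.0 - distance.
    public static double toScore(double distance) {
        return 1.0 - distance;
    }

    public static void main(String[] args) {
        System.out.println(toScore(0.0));  // identical vectors: prints 1.0
        System.out.println(toScore(0.25)); // prints 0.75
        System.out.println(toScore(1.0));  // prints 0.0
    }
}
```

Under this formula alone, a distance greater than `1.0` would produce a negative value. The text shown here does not say how the store handles that case.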
+
+=== Accessing Scores
+
+You can access the similarity score for each document through the `getScore()` method:
+
+[source,java]
+----
+List<Document> results = vectorStore.similaritySearch(
+    SearchRequest.builder()
+        .query("Spring AI")
+        .topK(5)
+        .build());
+
+for (Document doc : results) {
+    double score = doc.getScore(); // Value between 0.0 and 1.0
+    System.out.println("Document: " + doc.getText());
+    System.out.println("Similarity Score: " + score);
+}
+----
+
+=== Search Results Ordering
+
+Search results are automatically ordered by similarity score in descending order (highest score first).
+This ensures that the most relevant documents appear at the top of your results.
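
The ordering rule can be sketched outside the store as follows. This is an assumption-laden illustration: `OrderingSketch`, `toScore`, and `orderByScoreDescending` are hypothetical names, and the distances are made-up sample values. It shows only that sorting by `1.0 - distance` in descending order places the smallest distance first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OrderingSketch {

    // Hypothetical helper: the documented formula score = 1.0 - distance.
    public static double toScore(double distance) {
        return 1.0 - distance;
    }

    // Returns a copy of the distances ordered by score, highest score first.
    public static List<Double> orderByScoreDescending(List<Double> distances) {
        List<Double> sorted = new ArrayList<>(distances);
        sorted.sort(Comparator.comparingDouble((Double d) -> toScore(d)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sample distances, not real query results.
        List<Double> distances = Arrays.asList(0.5, 0.125, 0.25);
        System.out.println(orderByScoreDescending(distances)); // prints [0.125, 0.25, 0.5]
    }
}
```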
+
+=== Distance Metadata
+
+In addition to the similarity score, the raw distance value is still available in the document metadata:
+
+[source,java]
+----
+for (Document doc : results) {
+    double score = doc.getScore();
+    float distance = (Float) doc.getMetadata().get("distance");
+
+    System.out.println("Score: " + score + ", Distance: " + distance);
+}
+----
+
+=== Similarity Threshold
+
+When using similarity thresholds in your search requests, specify the threshold as a score value (`0.0` to `1.0`) rather than a distance:
+
+[source,java]
+----
+List<Document> results = vectorStore.similaritySearch(
+    SearchRequest.builder()
+        .query("Spring AI")
+        .topK(10)
+        .similarityThreshold(0.8) // Only return documents with score >= 0.8
+        .build());
+----
+
+This makes threshold values consistent and intuitive - higher values mean more restrictive searches that only return highly similar documents.
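
As a rough illustration of what a score-based threshold means in terms of raw distances, the sketch below filters made-up distances by the condition `1.0 - distance >= threshold`. The names `ThresholdSketch` and `filterByThreshold` are hypothetical, and the filter runs in memory here. How the store applies the threshold internally is not shown in this diff, so treat this as an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ThresholdSketch {

    // Keeps only the distances whose score (1.0 - distance) meets the threshold.
    public static List<Double> filterByThreshold(List<Double> distances, double threshold) {
        List<Double> kept = new ArrayList<>();
        for (double distance : distances) {
            double score = 1.0 - distance; // documented formula
            if (score >= threshold) {
                kept.add(distance);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample distances; their scores are 0.875, 0.75 and 0.5.
        List<Double> distances = Arrays.asList(0.125, 0.25, 0.5);
        System.out.println(filterByThreshold(distances, 0.75)); // prints [0.125, 0.25]
    }
}
```

In this sketch, raising the threshold shrinks the set of accepted distances, which matches the statement that higher values are more restrictive.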
+
 == Accessing the Native Client

 The MariaDB Vector Store implementation provides access to the underlying native JDBC client (`JdbcTemplate`) through the `getNativeClient()` method:
