- Notice that among the results, there are some animation family movies, such as _Curious George_ and _Bambi_, which makes sense, since the target vector was created with two other animation family movies (_Finding Nemo_ and _Bee Movie_).
+
- We also notice that among the results there are two movies that the person already watched. In the next example we will filter them out.
+
+
**KNN query with Filter Query** - Search for the top 10 movies most similar to the resulting vector, excluding the movies already watched:
- There are 3 "cinderella" movies in the index, but only 1 is among the top 50 most similar to the target vector (_Cinderella III: A Twist in Time_).
-
* Notice that among the results, there are some animation family movies, such as _Curious George_ and _Bambi_, which makes sense, since the target vector was created with two other animation family movies (_Finding Nemo_ and _Bee Movie_).
-
* We also notice that among the results there are two movies that the person already watched. In the next example we will filter them out.
+
*KNN with SeededQuery* - Search for the top 10 movies most similar to the target vector, guided by a seed lexical query on the `genre` field, which provides the initial entry points in the vector graph search:
-
Search for the top 10 movies most similar to the resulting vector, excluding the movies already watched (KNN query with Filter Query):
- This allows the KNN algorithm to start the similarity exploration from documents that already match the lexical criteria, potentially improving relevance and reducing search time.
-
- Search for movies with "cinderella" in the name among the top 50 movies most similar to the target vector (KNN as Filter Query):
+
*KNN with EarlyTermination* - Search for the top 10 movies most similar to the target vector, allowing the KNN search to stop early for lower latency:
* There are 3 "cinderella" movies in the index, but only 1 is among the top 50 most similar to the target vector (_Cinderella III: A Twist in Time_).
+
- This allows Solr to return results faster by stopping the graph search once a good enough set of neighbors is found, instead of exploring all nodes in the vector index.
-
- Search for movies with "animation" in the genre, and rerank the top 5 documents by combining (sum) the original query score with twice (2x) the similarity to the target vector (KNN with ReRanking):
+
**KNN with ReRanking** - Search for movies with "animation" in the genre, and rerank the top 5 documents by combining (sum) the original query score with twice (2x) the similarity to the target vector:
* To guarantee we calculate the vector similarity score for all the movies, we set `topK=10000`, a number higher than the total number of documents (`1100`).
+
- To guarantee we calculate the vector similarity score for all the movies, we set `topK=10000`, a number higher than the total number of documents (`1100`).
-
* It's possible to combine the vector similarity scores with other scores, by using Sub-query,
-
xref:query-guide:function-queries.adoc[Function Queries] and xref:query-guide:local-params.adoc#parameter-dereferencing[Parameter Dereferencing] Solr features:
+
It's possible to combine the vector similarity scores with other scores by using Solr's Sub-query, xref:query-guide:function-queries.adoc[Function Queries], and xref:query-guide:local-params.adoc#parameter-dereferencing[Parameter Dereferencing] features:
- Search for "harry potter" movies, ranking the results by similarity to the target vector instead of by the lexical query score. Besides the `q` parameter, we define a "sub-query" named `q_vector` that calculates the similarity score between the target vector and every movie (since we set `topK=10000`). We then use the sub-query parameter name as input for `sort`, specifying that results are ranked in descending order of vector similarity score (`sort=$q_vector desc`):
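As a rough illustration of the request described above, the parameters could be assembled in Python as follows. This is only a sketch: the `film_vector` field name, the `name:"harry potter"` query syntax, and the three-element `target_vector` are placeholders assumed for the example (the real target vector is much longer). Only `q_vector`, `topK=10000`, and `sort=$q_vector desc` come from the text.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; the real target vector has many more dimensions.
target_vector = [0.1, -0.2, 0.3]


def knn_subquery(vector, field="film_vector", top_k=10000):
    """Build a {!knn} query string for the given vector.

    The field name is an assumption made for this sketch.
    """
    values = ", ".join(str(v) for v in vector)
    return f"{{!knn f={field} topK={top_k}}}[{values}]"


params = {
    "q": 'name:"harry potter"',               # lexical main query (assumed field name)
    "q_vector": knn_subquery(target_vector),  # sub-query holding the vector similarity
    "sort": "$q_vector desc",                 # rank by the sub-query score
}

# URL-encoded form, ready to append to a select/query endpoint.
query_string = urlencode(params)
```

Setting `topK` above the number of indexed documents matters here: any document outside the top K would have no similarity score to sort by.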
- Search for movies with "the" in the name, keeping the original lexical query ranking but returning only movies whose similarity to the target vector is 0.8 or higher. As before, we define the sub-query `q_vector`, but this time we use it as input for the `frange` filter, specifying that we want documents with a vector similarity score of at least 0.8:
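A minimal sketch of how this filtered request might look when built in Python. The `film_vector` field name and the short `target_vector` are assumptions for illustration. The `l=0.8` local parameter sets the lower bound of the `frange` filter, which is inclusive by default, matching "0.8 or higher".

```python
# Hypothetical placeholder; the real target vector has many more dimensions.
target_vector = [0.1, -0.2, 0.3]
vector_literal = "[" + ", ".join(str(v) for v in target_vector) + "]"

params = {
    # Lexical main query; its score still determines the ranking.
    "q": "name:the",
    # Sub-query that scores every movie against the target vector.
    "q_vector": "{!knn f=film_vector topK=10000}" + vector_literal,
    # Keep only documents whose q_vector score is at least 0.8.
    "fq": "{!frange l=0.8}$q_vector",
}
```

Because the vector score is used only in `fq`, it restricts which documents are returned without changing their order.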
- Search for "batman" movies, ranking the results by combining 70% of the original lexical query score and 30% of the similarity to the target vector. Besides the `q` main query and the `q_vector` sub-query, we also specify the `q_lexical` query, which holds the lexical score of the main `q` query. Then we specify a parameter variable called `score_combined`, which scales the lexical and similarity scores, applies the 0.7 and 0.3 weights, then sums the results. We set the `sort` parameter to order according to the combined score, and also set the `fl` parameter so that we can view the intermediate and combined score values in the response:
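To make the arithmetic behind `score_combined` concrete, here is a small Python sketch that mimics it outside Solr. It assumes both scores are min-max scaled to the range 0 to 1 before weighting. The score values are invented for illustration and do not come from the films dataset.

```python
def scale(values, lo=0.0, hi=1.0):
    """Min-max scale a list of scores into [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All scores are equal, so map everything to the lower bound.
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def combine(lexical, vector, w_lexical=0.7, w_vector=0.3):
    """Weighted sum of the scaled lexical and vector scores."""
    return [
        w_lexical * l + w_vector * v
        for l, v in zip(scale(lexical), scale(vector))
    ]


# Hypothetical scores for three documents.
lexical_scores = [4.0, 2.0, 3.0]
vector_scores = [0.6, 0.9, 0.7]

combined = combine(lexical_scores, vector_scores)

# Document indices ordered by combined score, highest first.
ranking = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
```

Scaling first puts the two scores on a comparable range, so the 0.7 and 0.3 weights reflect the intended balance rather than the raw magnitude of each score.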