Commit 4d117c5

[DOCS] Adds semantic search section to kNN search page (#93782)
Co-authored-by: Abdon Pijpelink <[email protected]>
1 parent 92c533e commit 4d117c5

2 files changed: +70 -6 lines changed

docs/reference/search/search-your-data/knn-search.asciidoc

Lines changed: 67 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -407,10 +407,73 @@ each score in the sum. In the example above, the scores will be calculated as
407407
score = 0.9 * match_score + 0.1 * knn_score
408408
```
409409

410-
The `knn` option can also be used with <<search-aggregations, `aggregations`>>. In general, {es} computes aggregations
411-
over all documents that match the search. So for approximate kNN search, aggregations are calculated on the top `k`
412-
nearest documents. If the search also includes a `query`, then aggregations are calculated on the combined set of `knn`
413-
and `query` matches.
410+
The `knn` option can also be used with <<search-aggregations, `aggregations`>>.
411+
In general, {es} computes aggregations over all documents that match the search.
412+
So for approximate kNN search, aggregations are calculated on the top `k`
413+
nearest documents. If the search also includes a `query`, then aggregations are
414+
calculated on the combined set of `knn` and `query` matches.
415+
416+
[discrete]
417+
[[semantic-search]]
418+
==== Perform semantic search
419+
420+
kNN search enables you to perform semantic search by using a previously deployed
421+
{ml-docs}/ml-nlp-search-compare.html#ml-nlp-text-embedding[text embedding model].
422+
Instead of literal matching on search terms, semantic search retrieves results
423+
based on the intent and the contextual meaning of a search query.
424+
425+
Under the hood, the text embedding NLP model generates a dense vector from the
426+
query string that you provide as `model_text`. That vector is then searched
427+
against an index containing dense vectors created with the same text embedding
428+
{ml} model. The results are semantically similar to the query, as learned by the model.
429+
430+
[IMPORTANT]
431+
=====================
432+
To perform semantic search:
433+
434+
* you need an index that contains the dense vector representation of the input
435+
data to search against,
436+
437+
* you must use the same text embedding model for search that you used to create
438+
the dense vectors from the input data,
439+
440+
* the text embedding NLP model deployment must be started.
441+
=====================
442+
443+
Reference the deployed text embedding model in the `query_vector_builder` object
444+
and provide the search query as `model_text`:
445+
446+
[source,js]
447+
----
448+
(...)
449+
{
450+
"knn": {
451+
"field": "dense-vector-field",
452+
"k": 10,
453+
"num_candidates": 100,
454+
"query_vector_builder": {
455+
"text_embedding": { <1>
456+
"model_id": "my-text-embedding-model", <2>
457+
"model_text": "The opposite of blue" <3>
458+
}
459+
}
460+
}
461+
}
462+
(...)
463+
----
464+
// NOTCONSOLE
465+
466+
<1> The {nlp} task to perform. It must be `text_embedding`.
467+
<2> The ID of the text embedding model to use to generate the dense vectors from
468+
the query string. Use the same model that generated the embeddings from the
469+
input text in the index you search against.
470+
<3> The query string from which the model generates the dense vector
471+
representation.
472+
473+
For more information on how to deploy a trained model and use it to create text
474+
embeddings, refer to this
475+
{ml-docs}/ml-nlp-text-emb-vector-search-example.html[end-to-end example].
476+
414477

415478
[discrete]
416479
==== Search multiple kNN fields
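
To make the request shape in the added example concrete, here is a minimal Python sketch that assembles the same `knn` body. The helper name `build_semantic_knn_body` is hypothetical and is not part of Elasticsearch or its clients. The field name, model ID, and query text are the placeholder values from the example in the diff. The sketch only builds and prints JSON; it does not contact a cluster.

```python
import json


def build_semantic_knn_body(field, model_id, model_text, k=10, num_candidates=100):
    """Assemble a search body whose query vector is built from text.

    Hypothetical helper for illustration only; it mirrors the JSON
    structure shown in the documentation example above.
    """
    return {
        "knn": {
            "field": field,
            "k": k,
            "num_candidates": num_candidates,
            "query_vector_builder": {
                # The NLP task; the docs state it must be `text_embedding`.
                "text_embedding": {
                    # Same model that produced the indexed dense vectors.
                    "model_id": model_id,
                    # Query string the model turns into a dense vector.
                    "model_text": model_text,
                }
            },
        }
    }


body = build_semantic_knn_body(
    "dense-vector-field", "my-text-embedding-model", "The opposite of blue"
)
print(json.dumps(body, indent=2))
```

Keeping the builder in one function makes it harder to pair a query with a different model than the one used at index time, which the added section flags as a requirement.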

docs/reference/search/search.asciidoc

Lines changed: 3 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -511,8 +511,9 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=knn-query-vector]
511511
512512
`query_vector_builder`::
513513
(Optional, object)
514-
A configuration object indicating how to build a query_vector before executing the request. You must provide
515-
a `query_vector_builder` or `query_vector`, but not both.
514+
A configuration object indicating how to build a `query_vector` before executing
515+
the request. You must provide a `query_vector_builder` or `query_vector`, but
516+
not both. Refer to <<semantic-search>> to learn more.
516517
517518
====
518519
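
The rule in the parameter description (provide `query_vector_builder` or `query_vector`, but not both) can be checked on the client before a request is sent. A minimal sketch, assuming a hypothetical helper `knn_vector_source` that is not part of Elasticsearch or its clients:

```python
def knn_vector_source(knn):
    """Return which key supplies the query vector in a `knn` object.

    Hypothetical client-side check: raises ValueError unless exactly one
    of `query_vector` or `query_vector_builder` is present.
    """
    has_vector = "query_vector" in knn
    has_builder = "query_vector_builder" in knn
    if has_vector == has_builder:
        # Either both keys are present or neither is.
        raise ValueError(
            "provide exactly one of `query_vector` or `query_vector_builder`"
        )
    return "query_vector" if has_vector else "query_vector_builder"


# A precomputed vector and a text-embedding builder are both valid on their own.
print(knn_vector_source({"field": "dense-vector-field", "query_vector": [0.1, 0.2]}))
print(
    knn_vector_source(
        {
            "field": "dense-vector-field",
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": "my-text-embedding-model",
                    "model_text": "The opposite of blue",
                }
            },
        }
    )
)
```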
