@@ -407,10 +407,73 @@ each score in the sum. In the example above, the scores will be calculated as
score = 0.9 * match_score + 0.1 * knn_score
```

410- The `knn` option can also be used with <<search-aggregations, `aggregations`>>. In general, {es} computes aggregations
411- over all documents that match the search. So for approximate kNN search, aggregations are calculated on the top `k`
412- nearest documents. If the search also includes a `query`, then aggregations are calculated on the combined set of `knn`
413- and `query` matches.
410+ The `knn` option can also be used with <<search-aggregations, `aggregations`>>.
411+ In general, {es} computes aggregations over all documents that match the search.
412+ So for approximate kNN search, aggregations are calculated on the top `k`
413+ nearest documents. If the search also includes a `query`, then aggregations are
414+ calculated on the combined set of `knn` and `query` matches.
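
For example, a `terms` aggregation can be combined with an approximate kNN
search as in the sketch below. The index name `image-index` and the fields
`image-vector` and `file-type` are placeholders for illustration; because the
request contains no `query`, the aggregation is computed over the top `k`
nearest documents only:

[source,js]
----
POST image-index/_search
{
  "knn": {
    "field": "image-vector",
    "query_vector": [54, 10, -2],
    "k": 10,
    "num_candidates": 100
  },
  "aggs": {
    "file-types": {
      "terms": {
        "field": "file-type"
      }
    }
  }
}
----
// NOTCONSOLE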
415+
416+ [discrete]
417+ [[semantic-search]]
418+ ==== Perform semantic search
419+
420+ kNN search enables you to perform semantic search by using a previously deployed
421+ {ml-docs}/ml-nlp-search-compare.html#ml-nlp-text-embedding[text embedding model].
422+ Instead of literal matching on search terms, semantic search retrieves results
423+ based on the intent and the contextual meaning of a search query.
424+
425+ Under the hood, the text embedding NLP model generates a dense vector from
426+ the query string that you provide as `model_text`. This vector is then searched
427+ against an index containing dense vectors created with the same text embedding
428+ {ml} model. The results are semantically similar to the query, as learned by the model.
429+
430+ [IMPORTANT]
431+ =====================
432+ To perform semantic search:
433+
434+ * you need an index that contains the dense vector representation of the input
435+ data to search against (see the example mapping after this note),
436+
437+ * you must use the same text embedding model for search that you used to create
438+ the dense vectors from the input data,
439+
440+ * the text embedding NLP model deployment must be started.
441+ =====================
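
As a minimal sketch of the first requirement, the index to search against could
be created with a mapping like the one below. It reuses the `dense-vector-field`
name from the search example that follows; the index name, the `dims` value, and
the `similarity` function are illustrative assumptions. `dims` must match the
number of dimensions produced by your text embedding model, and the field must
be indexed for kNN search:

[source,js]
----
PUT my-semantic-search-index
{
  "mappings": {
    "properties": {
      "dense-vector-field": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}
----
// NOTCONSOLE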
442+
443+ Reference the deployed text embedding model in the `query_vector_builder` object
444+ and provide the search query as `model_text`:
445+
446+ [source,js]
447+ ----
448+ (...)
449+ {
450+ "knn": {
451+ "field": "dense-vector-field",
452+ "k": 10,
453+ "num_candidates": 100,
454+ "query_vector_builder": {
455+ "text_embedding": { <1>
456+ "model_id": "my-text-embedding-model", <2>
457+ "model_text": "The opposite of blue" <3>
458+ }
459+ }
460+ }
461+ }
462+ (...)
463+ ----
464+ // NOTCONSOLE
465+
466+ <1> The {nlp} task to perform. It must be `text_embedding`.
467+ <2> The ID of the text embedding model to use to generate the dense vectors from
468+ the query string. Use the same model that generated the embeddings from the
469+ input text in the index you search against.
470+ <3> The query string from which the model generates the dense vector
471+ representation.
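
Before the search can run, the model referenced by `model_id` must be deployed
and its deployment started. As a minimal sketch, using the placeholder model ID
from the example above, the deployment can be started with the start trained
model deployment API:

[source,js]
----
POST _ml/trained_models/my-text-embedding-model/deployment/_start
----
// NOTCONSOLE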
472+
473+ For more information on how to deploy a trained model and use it to create text
474+ embeddings, refer to this
475+ {ml-docs}/ml-nlp-text-emb-vector-search-example.html[end-to-end example].
476+
[discrete]
==== Search multiple kNN fields