From cfe8350ed5c34c6bbbe099f98986cf00abb1eaec Mon Sep 17 00:00:00 2001
From: David Kyle
Date: Mon, 6 Jan 2025 13:12:16 +0000
Subject: [PATCH 1/4] Document the text expansion task type

---
 docs/reference/ml/ml-shared.asciidoc          |  6 +++++
 .../apis/put-trained-models.asciidoc          | 23 +++++++++++++++++--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/docs/reference/ml/ml-shared.asciidoc b/docs/reference/ml/ml-shared.asciidoc
index 4948db48664ed..2121320a6f176 100644
--- a/docs/reference/ml/ml-shared.asciidoc
+++ b/docs/reference/ml/ml-shared.asciidoc
@@ -1167,6 +1167,12 @@ tag::inference-config-text-embedding-size[]
 The number of dimensions in the embedding vector produced by the model.
 end::inference-config-text-embedding-size[]

+tag::inference-config-text-expansion[]
+The Text expansion task works with Sparse Embedding models to transform an input sequence
+into a vector of weighted tokens. These embeddings capture semantic meanings and
+context and can be used in a <> field for powerful insights.
+end::inference-config-text-expansion[]
+
 tag::inference-config-text-similarity[]
 Text similarity takes an input sequence and compares it with another input sequence. This is commonly referred to
 as cross-encoding. This task is useful for ranking document text when comparing it to another provided text input.
diff --git a/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc b/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
index 256c0d29f8d2f..c024cf22d3027 100644
--- a/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
+++ b/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
@@ -395,10 +395,10 @@ the model definition is not supplied.
 (Required, object)
 The default configuration for inference. This can be: `regression`,
 `classification`, `fill_mask`, `ner`, `question_answering`,
-`text_classification`, `text_embedding` or `zero_shot_classification`.
+`text_classification`, `text_embedding`, `text_expansion` or `zero_shot_classification`.
 If `regression` or `classification`, it must match the `target_type` of the
 underlying `definition.trained_model`. If `fill_mask`, `ner`,
-`question_answering`, `text_classification`, or `text_embedding`; the
+`question_answering`, `text_classification`, `text_embedding` or `text_expansion`; the
 `model_type` must be `pytorch`.
 +
 .Properties of `inference_config`
@@ -592,6 +592,25 @@ Refer to <> to review the properties of the
 `tokenization` object.
 =====

+`text_expansion`:::
+(Object, optional)
+include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-text_expansion]
++
+.Properties of text_expansion inference
+[%collapsible%open]
+=====
+`results_field`::::
+(Optional, string)
+include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]
+
+`tokenization`::::
+(Optional, object)
+include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
++
+Refer to <> to review the properties of the
+`tokenization` object.
+=====
+
 `text_similarity`:::
 (Object, optional)
 include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-similarity]

From f193aea43a737008cfb70fbead8d1e6f95582566 Mon Sep 17 00:00:00 2001
From: David Kyle
Date: Tue, 7 Jan 2025 21:49:44 +0000
Subject: [PATCH 2/4] Update docs/reference/ml/ml-shared.asciidoc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: István Zoltán Szabó
---
 docs/reference/ml/ml-shared.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/ml/ml-shared.asciidoc b/docs/reference/ml/ml-shared.asciidoc
index 2121320a6f176..af384c2c90011 100644
--- a/docs/reference/ml/ml-shared.asciidoc
+++ b/docs/reference/ml/ml-shared.asciidoc
@@ -1168,7 +1168,7 @@ The number of dimensions in the embedding vector produced by the model.
 end::inference-config-text-embedding-size[]

 tag::inference-config-text-expansion[]
-The Text expansion task works with Sparse Embedding models to transform an input sequence
+The text expansion task works with sparse embedding models to transform an input sequence
 into a vector of weighted tokens. These embeddings capture semantic meanings and
 context and can be used in a <> field for powerful insights.
 end::inference-config-text-expansion[]

From 6a4ce96d549f452b658c8c41493c99d20bbb5c65 Mon Sep 17 00:00:00 2001
From: David Kyle
Date: Wed, 8 Jan 2025 12:20:13 +0000
Subject: [PATCH 3/4] Update docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: István Zoltán Szabó
---
 .../ml/trained-models/apis/put-trained-models.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc b/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
index c024cf22d3027..61aa4ff29c55a 100644
--- a/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
+++ b/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
@@ -594,7 +594,7 @@ Refer to <> to review the properties of the

 `text_expansion`:::
 (Object, optional)
-include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-text_expansion]
+include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-text-expansion]
 +
 .Properties of text_expansion inference
 [%collapsible%open]

From 9f5c3b5a3ea9c484adf6d208831ac40acb40d9ab Mon Sep 17 00:00:00 2001
From: David Kyle
Date: Thu, 9 Jan 2025 14:46:34 +0000
Subject: [PATCH 4/4] Update docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: István Zoltán Szabó
---
 .../ml/trained-models/apis/put-trained-models.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc b/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
index 61aa4ff29c55a..ccd76b7095762 100644
--- a/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
+++ b/docs/reference/ml/trained-models/apis/put-trained-models.asciidoc
@@ -594,7 +594,7 @@ Refer to <> to review the properties of the

 `text_expansion`:::
 (Object, optional)
-include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-text-expansion]
+include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-expansion]
 +
 .Properties of text_expansion inference
 [%collapsible%open]