From f786c5a5b0d6031a769a66b013b6560df933ecc7 Mon Sep 17 00:00:00 2001
From: Kathleen DeRusso <kathleen.derusso@elastic.co>
Date: Wed, 25 Jun 2025 13:34:37 -0400
Subject: [PATCH 1/3] Update docs for 8.19

---
 .../mapping/types/semantic-text.asciidoc      | 49 +++++++++++++++----
 1 file changed, 40 insertions(+), 9 deletions(-)

diff --git a/docs/reference/mapping/types/semantic-text.asciidoc b/docs/reference/mapping/types/semantic-text.asciidoc
index 3325b1dc32d34..c0eb76668fc18 100644
--- a/docs/reference/mapping/types/semantic-text.asciidoc
+++ b/docs/reference/mapping/types/semantic-text.asciidoc
@@ -96,6 +96,11 @@ You can update this parameter by using the <<indices-put-mapping, Update mapping
 Use the <<put-inference-api>> to create the endpoint.
 If not specified, the {infer} endpoint defined by `inference_id` will be used at both index and query time.
 
+`index_options`::
+(Optional, object) Specifies the index options to override default values for the field.
+Currently, `dense_vector` index options are supported.
+For text embeddings, `index_options` accepts any valid <<dense-vector-index-options,dense vector index options>>.
+
 `chunking_settings`::
 (Optional, object) Settings for chunking text into smaller passages.
 If specified, these will override the chunking settings set in the {infer-cap} endpoint associated with `inference_id`.
@@ -124,9 +129,8 @@ The number of overlapping words allowed in chunks.
 Valid values are `0` or `1`.
 Required for `sentence` type chunking settings.
 
-WARNING: If the input exceeds the maximum token limit of the underlying model,  some services (such as OpenAI) may return an
-error. In contrast, the `elastic` and `elasticsearch` services  will automatically truncate the input to fit within the
-model's limit.
+WARNING: If the input exceeds the maximum token limit of the underlying model, some services (such as OpenAI) may return an error.
+In contrast, the `elastic` and `elasticsearch` services will automatically truncate the input to fit within the model's limit.
 
 ====
 
@@ -258,12 +262,39 @@ PUT test-index
 `semantic_text` uses defaults for indexing data based on the {infer} endpoint specified.
 It enables you to quickstart your semantic search by providing automatic {infer} and a dedicated query so you don't need to provide further details.
 
-In case you want to customize data indexing, use the
-<<sparse-vector,`sparse_vector`>> or <<dense-vector,`dense_vector`>> field types and create an ingest pipeline with an
-<<inference-processor, {infer} processor>> to generate the embeddings.
-<<semantic-search-inference,This tutorial>> walks you through the process.
-In these cases - when you use `sparse_vector` or `dense_vector` field types instead of the `semantic_text` field type to customize indexing - using the
-<<query-dsl-semantic-query,`semantic_query`>> is not supported for querying the field data.
+If you want to override those defaults and customize the embeddings that
+`semantic_text` stores, you can do so by modifying <<semantic-text-params, parameters>>:
+
+- Use `index_options` to specify alternate index options such as specific
+`dense_vector` quantization methods
+- Use `chunking_settings` to override the chunking strategy associated with the
+{infer} endpoint, or completely disable chunking using the `none` type
+
+Here is an example of how to set these parameters for a text embedding endpoint:
+
+[source,console]
+------------------------------------------------------------
+PUT my-index-000004
+{
+  "mappings": {
+    "properties": {
+      "inference_field": {
+        "type": "semantic_text",
+        "inference_id": "my-text-embedding-endpoint",
+        "index_options": {
+          "dense_vector": {
+            "type": "int4_flat"
+          }
+        },
+        "chunking_settings": {
+          "type": "none"
+        }
+      }
+    }
+  }
+}
+------------------------------------------------------------
+// TEST[skip:Requires inference endpoint]
 
 [discrete]
 [[update-script]]

From f52d95450e3ace049a483e05a4ece2e07bf58a5e Mon Sep 17 00:00:00 2001
From: Kathleen DeRusso <kathleen.derusso@elastic.co>
Date: Wed, 25 Jun 2025 15:00:07 -0400
Subject: [PATCH 2/3] Update
 docs/reference/mapping/types/semantic-text.asciidoc

Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
---
 docs/reference/mapping/types/semantic-text.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/mapping/types/semantic-text.asciidoc b/docs/reference/mapping/types/semantic-text.asciidoc
index c0eb76668fc18..9c7d2e34eb930 100644
--- a/docs/reference/mapping/types/semantic-text.asciidoc
+++ b/docs/reference/mapping/types/semantic-text.asciidoc
@@ -129,7 +129,7 @@ The number of overlapping words allowed in chunks.
 Valid values are `0` or `1`.
 Required for `sentence` type chunking settings.
 
-WARNING: If the input exceeds the maximum token limit of the underlying model, some services (such as OpenAI) may return an error.
+WARNING: When using the `none` chunking strategy, if the input exceeds the maximum token limit of the underlying model, some services (such as OpenAI) may return an error.
 In contrast, the `elastic` and `elasticsearch` services will automatically truncate the input to fit within the model's limit.
 
 ====

From 5ea0e0642528ef81a8c57521e51be4246633fc7f Mon Sep 17 00:00:00 2001
From: Kathleen DeRusso <kathleen.derusso@elastic.co>
Date: Wed, 25 Jun 2025 15:04:36 -0400
Subject: [PATCH 3/3] PR feedback

---
 docs/reference/mapping/types/semantic-text.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/reference/mapping/types/semantic-text.asciidoc b/docs/reference/mapping/types/semantic-text.asciidoc
index 9c7d2e34eb930..4eb5814274382 100644
--- a/docs/reference/mapping/types/semantic-text.asciidoc
+++ b/docs/reference/mapping/types/semantic-text.asciidoc
@@ -263,7 +263,7 @@ PUT test-index
 It enables you to quickstart your semantic search by providing automatic {infer} and a dedicated query so you don't need to provide further details.
 
 If you want to override those defaults and customize the embeddings that
-`semantic_text` stores, you can do so by modifying <<semantic-text-params, parameters>>:
+`semantic_text` indexes, you can do so by modifying <<semantic-text-params, parameters>>:
 
 - Use `index_options` to specify alternate index options such as specific
 `dense_vector` quantization methods