diff --git a/docs/changelog/132387.yaml b/docs/changelog/132387.yaml
Lines changed: 6 additions & 0 deletions
diff --git a/docs/changelog/132410.yaml b/docs/changelog/132410.yaml
Lines changed: 5 additions & 0 deletions
diff --git a/docs/changelog/132497.yaml b/docs/changelog/132497.yaml
Lines changed: 5 additions & 0 deletions
diff --git a/docs/reference/elasticsearch/mapping-reference/semantic-text.md b/docs/reference/elasticsearch/mapping-reference/semantic-text.md
Lines changed: 71 additions & 39 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/commands/examples/rerank.csv-spec/combine.md b/docs/reference/query-languages/esql/_snippets/commands/examples/rerank.csv-spec/combine.md
Lines changed: 19 additions & 0 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/commands/examples/rerank.csv-spec/simple-query.md b/docs/reference/query-languages/esql/_snippets/commands/examples/rerank.csv-spec/simple-query.md
Lines changed: 17 additions & 0 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/commands/examples/rerank.csv-spec/two-queries.md b/docs/reference/query-languages/esql/_snippets/commands/examples/rerank.csv-spec/two-queries.md
Lines changed: 18 additions & 0 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/commands/layout/rerank.md b/docs/reference/query-languages/esql/_snippets/commands/layout/rerank.md
Lines changed: 6 additions & 50 deletions
diff --git a/docs/reference/query-languages/esql/_snippets/functions/layout/copy_sign.md b/docs/reference/query-languages/esql/_snippets/functions/layout/copy_sign.md
Lines changed: 3 additions & 0 deletions
diff --git a/docs/release-notes/breaking-changes.md b/docs/release-notes/breaking-changes.md
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 132387
+summary: "[ExtraHop & QualysGAV] Add `manage`, `create_index`, `read`, `index`, `write`, `delete` permissions for third-party agent indices to `kibana_system`"
+area: Authorization
+type: enhancement
+issues:
+  - 131825
@@ -0,0 +1,5 @@
+pr: 132410
+summary: Add support for retrieving semantic_text's indexed chunks via fields API
+area: Vector Search
+type: feature
+issues: []
@@ -0,0 +1,5 @@
+pr: 132497
+summary: Add cache miss and read metrics
+area: Searchable Snapshots
+type: enhancement
+issues: []
@@ -282,6 +282,34 @@ PUT test-index/_doc/1
     * Others (such as `elastic` and `elasticsearch`) will automatically truncate
       the input.
 
+## Retrieving indexed chunks
+```{applies_to}
+stack: ga 9.2
+serverless: ga
+```
+
+You can retrieve the individual chunks generated by your `semantic_text` field’s chunking
+strategy using the [fields parameter](/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param):
+
+```console
+POST test-index/_search
+{
+  "query": {
+    "ids" : {
+      "values" : ["1"]
+    }
+  },
+  "fields": [
+    {
+      "field": "semantic_text_field",
+      "format": "chunks"      <1>
+    }
+  ]
+}
+```
+
+1. Use `"format": "chunks"` to return the field’s text as the original text chunks that were indexed.
+
 ## Extracting relevant fragments from semantic text [semantic-text-highlighting]
 
 You can extract the most relevant fragments from a semantic text field by using
@@ -311,27 +339,6 @@ POST test-index/_search
 2. Sorts the most relevant highlighted fragments by score when set to `score`. By default,
    fragments will be output in the order they appear in the field (order: none).
 
-To use the `semantic` highlighter to view chunks in the order which they were indexed with no scoring,
-use the `match_all` query to retrieve them in the order they appear in the document:
-
-```console
-POST test-index/_search
-{
-    "query": {
-        "match_all": {}
-    },
-    "highlight": {
-        "fields": {
-            "my_semantic_field": {
-                "number_of_fragments": 5  <1>
-            }
-        }
-    }
-}
-```
-
-1. This will return the first 5 chunks, set this number higher to retrieve more chunks.
-
 Highlighting is supported on fields other than semantic_text. However, if you
 want to restrict highlighting to the semantic highlighter and return no
 fragments when the field is not of type semantic_text, you can explicitly
@@ -359,6 +366,49 @@ PUT test-index
 
 1. Ensures that highlighting is applied exclusively to semantic_text fields.
 
+To retrieve all fragments from the `semantic` highlighter in their original indexing order
+without scoring, use a `match_all` query as the `highlight_query`.
+This ensures fragments are returned in the order they appear in the document:
+
+```console
+POST test-index/_search
+{
+  "query": {
+    "ids": {
+      "values": ["1"]
+    }
+  },
+  "highlight": {
+    "fields": {
+      "my_semantic_field": {
+        "number_of_fragments": 5,        <1>
+        "highlight_query": { "match_all": {} }
+      }
+    }
+  }
+}
+```
+
+1. Returns the first 5 fragments. Increase this value to retrieve additional fragments.
+
+## Updates and partial updates for `semantic_text` fields [semantic-text-updates]
+
+When updating documents that contain `semantic_text` fields, it’s important to understand how inference is triggered:
+
+* **Full document updates**
+  When you perform a full document update, **all `semantic_text` fields will re-run inference** even if their values did not change. This ensures that the embeddings are always consistent with the current document state but can increase ingestion costs.
+
+* **Partial updates using the Bulk API**
+  Partial updates that **omit `semantic_text` fields** and are submitted through the [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) will **reuse the existing embeddings** stored in the index. In this case, inference is **not triggered** for fields that were not updated, which can significantly reduce processing time and cost.
+
+* **Partial updates using the Update API**
+  When using the [Update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) with a `doc` object that **omits `semantic_text` fields**, inference **will still run** on all `semantic_text` fields. This means that even if the field values are not changed, embeddings will be re-generated.
+
+If you want to avoid unnecessary inference and keep existing embeddings:
+
+* Use **partial updates through the Bulk API**.
+* Omit any `semantic_text` fields that did not change from the `doc` object in your request.
+
 ## Customizing `semantic_text` indexing [custom-indexing]
 
 `semantic_text` uses defaults for indexing data based on the {{infer}} endpoint
@@ -404,24 +454,6 @@ PUT my-index-000004
 }
 ```
 
-### Customizing using ingest pipelines [custom-by-pipelines]
-```{applies_to}
-stack: ga 9.0
-```
-
-In case you want to customize data indexing, use the
-[`sparse_vector`](/reference/elasticsearch/mapping-reference/sparse-vector.md)
-or [`dense_vector`](/reference/elasticsearch/mapping-reference/dense-vector.md)
-field types and create an ingest pipeline with an
-[{{infer}} processor](/reference/enrich-processor/inference-processor.md) to
-generate the embeddings.
-[This tutorial](docs-content://solutions/search/semantic-search/semantic-search-inference.md)
-walks you through the process. In these cases - when you use `sparse_vector` or
-`dense_vector` field types instead of the `semantic_text` field type to
-customize indexing - using the
-[`semantic_query`](/reference/query-languages/query-dsl/query-dsl-semantic-query.md)
-is not supported for querying the field data.
-
 ## Updates to `semantic_text` fields [update-script]
 
 For indices containing `semantic_text` fields, updates that use scripts have the
 
@@ -0,0 +1,19 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do not edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM books METADATA _score
+| WHERE MATCH(description, "hobbit") OR MATCH(author, "Tolkien")
+| SORT _score DESC
+| LIMIT 100
+| RERANK rerank_score = "hobbit" ON description, author WITH { "inference_id" : "test_reranker" }
+| EVAL original_score = _score, _score = rerank_score + original_score
+| SORT _score
+| LIMIT 3
+| KEEP title, original_score, rerank_score, _score
+```
+
+| title:text | original_score:double | rerank_score:double | _score:double |
+| --- | --- | --- | --- |
+| Poems from the Hobbit | 4.012462615966797 | 0.001396648003719747 | 0.001396648003719747 |
+| The Lord of the Rings - Boxed Set | 3.768855094909668 | 0.0010020040208473802 | 0.001396648003719747 |
+| Return of the King Being the Third Part of The Lord of the Rings | 3.6248698234558105 | 9.000900317914784E-4 | 0.001396648003719747 |
@@ -0,0 +1,17 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do not edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM books METADATA _score
+| WHERE MATCH(description, "hobbit")
+| SORT _score DESC
+| LIMIT 100
+| RERANK "hobbit" ON description WITH { "inference_id" : "test_reranker" }
+| LIMIT 3
+| KEEP title, _score
+```
+
+| title:text | _score:double |
+| --- | --- |
+| Poems from the Hobbit | 0.0015673980815336108 |
+| A Tolkien Compass: Including J. R. R. Tolkien's Guide to the Names in The Lord of the Rings | 0.007936508394777775 |
+| Return of the King Being the Third Part of The Lord of the Rings | 9.960159659385681E-4 |
@@ -0,0 +1,18 @@
+% This is generated by ESQL's AbstractFunctionTestCase. Do not edit it. See ../README.md for how to regenerate it.
+
+```esql
+FROM books METADATA _score
+| WHERE MATCH(description, "hobbit") OR MATCH(author, "Tolkien")
+| SORT _score DESC
+| LIMIT 100
+| RERANK rerank_score = "hobbit" ON description, author WITH { "inference_id" : "test_reranker" }
+| SORT rerank_score
+| LIMIT 3
+| KEEP title, _score, rerank_score
+```
+
+| title:text | _score:double | rerank_score:double |
+| --- | --- | --- |
+| Return of the Shadow | 2.8181066513061523 | 5.740527994930744E-4 |
+| Return of the King Being the Third Part of The Lord of the Rings | 3.6248698234558105 | 9.000900317914784E-4 |
+| The Lays of Beleriand | 1.3002015352249146 | 9.36329597607255E-4 |
@@ -100,61 +100,17 @@ If you don't want to increase the timeout limit, try the following:
 
 Rerank search results using a simple query and a single field:
 
-```esql
-FROM books
-| WHERE MATCH(title, "science fiction")
-| SORT _score DESC
-| LIMIT 100
-| RERANK "science fiction" ON (title) WITH { "inference_id" : "my_reranker" }
-| LIMIT 3
-| KEEP title, _score
-```
 
-| title:keyword | _score:double |
-|---------------|---------------|
-| Neuromancer   | 0.98          |
-| Dune          | 0.95          |
-| Foundation    | 0.92          |
+:::{include} ../examples/rerank.csv-spec/simple-query.md
+:::
 
 Rerank search results using a query and multiple fields, and store the new score
 in a column named `rerank_score`:
 
-```esql
-FROM movies
-| WHERE MATCH(title, "dystopian future") OR MATCH(synopsis, "dystopian future")
-| SORT _score DESC
-| LIMIT 100
-| RERANK rerank_score = "dystopian future" ON (title, synopsis) WITH { "inference_id" : "my_reranker" }
-| SORT rerank_score DESC
-| LIMIT 5
-| KEEP title, _score, rerank_score
-```
-
-| title:keyword   | _score:double | rerank_score:double |
-|-----------------|---------------|---------------------|
-| Blade Runner    | 8.75          | 0.99                |
-| The Matrix      | 9.12          | 0.97                |
-| Children of Men | 8.50          | 0.96                |
-| Akira           | 8.99          | 0.94                |
-| Gattaca         | 8.65          | 0.91                |
+:::{include} ../examples/rerank.csv-spec/two-queries.md
+:::
 
 Combine the original score with the reranked score:
 
-```esql
-FROM movies
-| WHERE MATCH(title, "dystopian future") OR MATCH(synopsis, "dystopian future")
-| SORT _score DESC
-| LIMIT 100
-| RERANK rerank_score = "dystopian future" ON (title, synopsis) WITH { "inference_id" : "my_reranker" }
-| EVAL original_score = _score, _score = rerank_score + original_score
-| SORT _score DESC
-| LIMIT 2
-| KEEP title, original_score, rerank_score, _score
-```
-
-| title:keyword | original_score:double | rerank_score:double | _score:double |
-|---------------|-----------------------|---------------------|---------------|
-| The Matrix    | 9.12                  | 0.97                | 10.09         |
-| Akira         | 8.99                  | 0.94                | 9.93          |
-
-
+:::{include} ../examples/rerank.csv-spec/combine.md
+:::
@@ -12,6 +12,13 @@ If you are migrating from a version prior to version 9.0, you must first upgrade
 
 % ## Next version [elasticsearch-nextversion-breaking-changes]
 
+```{applies_to}
+stack: coming 9.1.1
+```
+## 9.1.1 [elasticsearch-9.1.1-breaking-changes]
+
+There are no breaking changes associated with this release.
+
 ## 9.1.0 [elasticsearch-9.1.0-breaking-changes]
 
 Discovery-Plugins: