Clarify partial updates for semantic text (#132485)
This commit clarifies the behaviour of the semantic text field with partial updates.
It also removes the reference to ingest pipelines, since semantic text is now fully customizable.
docs/reference/elasticsearch/mapping-reference/semantic-text.md
18 additions & 18 deletions
@@ -359,6 +359,24 @@ PUT test-index
 1. Ensures that highlighting is applied exclusively to semantic_text fields.

+## Updates and partial updates for `semantic_text` fields [semantic-text-updates]
+
+When updating documents that contain `semantic_text` fields, it’s important to understand how inference is triggered:
+
+* **Full document updates**
+  When you perform a full document update, **all `semantic_text` fields will re-run inference** even if their values did not change. This ensures that the embeddings are always consistent with the current document state but can increase ingestion costs.
+
+* **Partial updates using the Bulk API**
+  Partial updates that **omit `semantic_text` fields** and are submitted through the [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) will **reuse the existing embeddings** stored in the index. In this case, inference is **not triggered** for fields that were not updated, which can significantly reduce processing time and cost.
+
+* **Partial updates using the Update API**
+  When using the [Update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) with a `doc` object that **omits `semantic_text` fields**, inference **will still run** on all `semantic_text` fields. This means that even if the field values are not changed, embeddings will be re-generated.
+
+If you want to avoid unnecessary inference and keep existing embeddings:
+
+* Use **partial updates through the Bulk API**.
+* Omit any `semantic_text` fields that did not change from the `doc` object in your request.
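
For readers of this change, the recommendation in the added section (partial updates through the Bulk API, with unchanged `semantic_text` fields left out of `doc`) can be sketched as a request body. This is a minimal illustration and not part of the commit: the helper name `bulk_partial_update`, the document ID `"1"`, and the `title` field are hypothetical, and the sketch assumes the index has a `semantic_text` field that is deliberately not sent. It only builds the newline-delimited JSON body and does not contact a cluster.

```python
import json


def bulk_partial_update(index, doc_id, changed_fields):
    """Build the NDJSON body for one partial update sent through the Bulk API.

    Only the fields in `changed_fields` are included in `doc`. Any
    semantic_text field left out is expected to keep its existing
    embeddings, per the behaviour described in the section above.
    """
    action = {"update": {"_index": index, "_id": doc_id}}
    payload = {"doc": changed_fields}
    # The Bulk API expects one JSON object per line and a trailing newline.
    return json.dumps(action) + "\n" + json.dumps(payload) + "\n"


# Hypothetical example: only `title` changed, so only `title` is sent.
body = bulk_partial_update("test-index", "1", {"title": "Updated title"})
print(body, end="")
```

The resulting string would be sent as the body of a `POST /_bulk` request with the `Content-Type: application/x-ndjson` header. Sending the same `doc` object through the Update API would, according to the section above, still re-run inference on all `semantic_text` fields.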