Commit 4af241b

kosabogi and leemthompo authored
Adds note on reindexing existing data for semantic_text usage (#113590)
* Adds note on reindexing existing data for semantic_text usage
* Adds note about full crawl and full sync
* Style guide related fix
* Update docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc

Co-authored-by: Liam Thompson <[email protected]>

---------

Co-authored-by: Liam Thompson <[email protected]>
1 parent bb9d612 commit 4af241b

1 file changed: +17 -0 lines changed

docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc

Lines changed: 17 additions & 0 deletions
@@ -89,6 +89,16 @@ PUT semantic-embeddings
 It will be used to generate the embeddings based on the input text.
 Every time you ingest data into the related `semantic_text` field, this endpoint will be used for creating the vector representation of the text.

+[NOTE]
+====
+If you're using web crawlers or connectors to generate indices, you have to
+<<indices-put-mapping,update the index mappings>> for these indices to
+include the `semantic_text` field. Once the mapping is updated, you'll need to run
+a full web crawl or a full connector sync. This ensures that all existing
+documents are reprocessed and updated with the new semantic embeddings,
+enabling semantic search on the updated data.
+====
+

 [discrete]
 [[semantic-text-load-data]]
@@ -118,6 +128,13 @@ Create the embeddings from the text by reindexing the data from the `test-data`
 The data in the `content` field will be reindexed into the `content` semantic text field of the destination index.
 The reindexed data will be processed by the {infer} endpoint associated with the `content` semantic text field.

+[NOTE]
+====
+This step uses the reindex API to simulate data ingestion. If you are working with data that has already been indexed,
+rather than using the test-data set, reindexing is required to ensure that the data is processed by the {infer} endpoint
+and the necessary embeddings are generated.
+====
+
 [source,console]
 ------------------------------------------------------------
 POST _reindex?wait_for_completion=false
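
Reviewer note (not part of the commit): the two notes added above describe a mapping update followed by reprocessing of existing documents. As a rough sketch of what those request bodies might look like, the Python below only builds the JSON for `PUT <index>/_mapping` and `POST _reindex`; it does not contact a cluster. The field name `semantic_content` and the inference endpoint ID `my-inference-endpoint` are hypothetical placeholders, while `test-data` and `semantic-embeddings` are the index names mentioned in the diff. It assumes the `semantic_text` field is configured with an `inference_id` parameter.

```python
import json


def semantic_text_mapping(field_name: str, inference_id: str) -> dict:
    """Build a body for PUT <index>/_mapping that adds a semantic_text field.

    Assumption: the field is tied to an inference endpoint via `inference_id`.
    """
    return {
        "properties": {
            field_name: {
                "type": "semantic_text",
                "inference_id": inference_id,
            }
        }
    }


def reindex_body(source_index: str, dest_index: str) -> dict:
    """Build a body for POST _reindex that copies documents between indices."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


if __name__ == "__main__":
    # Hypothetical field name and endpoint ID, for illustration only.
    mapping = semantic_text_mapping("semantic_content", "my-inference-endpoint")
    print(json.dumps(mapping, indent=2))

    # Index names taken from the diff context above.
    print(json.dumps(reindex_body("test-data", "semantic-embeddings"), indent=2))
```

For crawler- or connector-managed indices, the first body corresponds to the mapping update the note asks for; per the note, a full crawl or full sync would then follow instead of a manual reindex.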
