Changed file: `_vector-search/getting-started/auto-generated-embeddings.md` (13 additions and 13 deletions)
@@ -101,7 +101,7 @@ You'll need the model ID in order to use this model for several of the following
 
 ### Step 2: Create an ingest pipeline
 
-First, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains one processor: a task that transforms document fields before documents are ingested into an index. You'll set up a `text_embedding` processor that creates vector embeddings from text. You'll need the `model_id` of the model you set up in the previous section and a `field_map`, which specifies the name of the field from which to take the text (`text`) and the name of the field in which to record embeddings (`passage_embedding`):
+First, you need to create an [ingest pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/index/) that contains one processor: a task that transforms document fields before documents are ingested into an index. You'll set up a `text_embedding` processor that creates vector embeddings from text. You'll need the `model_id` of the model you set up in the previous section and a `field_map`, which specifies the name of the field from which to take the text (`passage`) and the name of the field in which to record embeddings (`passage_embedding`):
 
 ```json
 PUT /_ingest/pipeline/nlp-ingest-pipeline
@@ -112,7 +112,7 @@ PUT /_ingest/pipeline/nlp-ingest-pipeline
       "text_embedding": {
         "model_id": "aVeif4oB5Vm0Tdw8zYO2",
         "field_map": {
-          "text": "passage_embedding"
+          "passage": "passage_embedding"
         }
       }
     }
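For reference, the full ingest pipeline request after this change would read as follows. This is a sketch assembled from the two hunks above; the `description` value is illustrative, since the diff does not show that line:

```json
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
  "description": "An NLP ingest pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "aVeif4oB5Vm0Tdw8zYO2",
        "field_map": {
          "passage": "passage_embedding"
        }
      }
    }
  ]
}
```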
@@ -123,7 +123,7 @@ PUT /_ingest/pipeline/nlp-ingest-pipeline
 
 ### Step 3: Create a vector index
 
-Now you'll create a vector index by setting `index.knn` to `true`. In the index, the field named `text` contains an image description, and a [`knn_vector`]({{site.url}}{{site.baseurl}}/mappings/supported-field-types/knn-vector/) field named `passage_embedding` contains the vector embedding of the text. The vector field `dimension` must match the dimensionality of the model you configured in Step 2. Additionally, set the default ingest pipeline to the `nlp-ingest-pipeline` you created in the previous step:
+Now you'll create a vector index by setting `index.knn` to `true`. In the index, the field named `passage` contains an image description, and a [`knn_vector`]({{site.url}}{{site.baseurl}}/mappings/supported-field-types/knn-vector/) field named `passage_embedding` contains the vector embedding of the text. The vector field `dimension` must match the dimensionality of the model you configured in Step 2. Additionally, set the default ingest pipeline to the `nlp-ingest-pipeline` you created in the previous step:
 
 
 ```json
@@ -140,7 +140,7 @@ PUT /my-nlp-index
         "dimension": 768,
         "space_type": "l2"
       },
-      "text": {
+      "passage": {
         "type": "text"
       }
     }
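Putting the hunk above in context, the complete index creation request after this change would look roughly like the following. The `settings` block and the `id` field are assumptions based on the surrounding prose (`index.knn` set to `true`, default pipeline set to `nlp-ingest-pipeline`), since the hunk only shows the mapping fragment:

```json
PUT /my-nlp-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "space_type": "l2"
      },
      "passage": {
        "type": "text"
      }
    }
  }
}
```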
@@ -153,28 +153,28 @@ Setting up a vector index allows you to later perform a vector search on the `pa
 
 ### Step 4: Ingest documents into the index
 
-In this step, you'll ingest several sample documents into the index. The sample data is taken from the [Flickr image dataset](https://www.kaggle.com/datasets/hsankesara/flickr-image-dataset). Each document contains a `text` field corresponding to the image description and an `id` field corresponding to the image ID:
+In this step, you'll ingest several sample documents into the index. The sample data is taken from the [Flickr image dataset](https://www.kaggle.com/datasets/hsankesara/flickr-image-dataset). Each document contains a `passage` field corresponding to the image description and an `id` field corresponding to the image ID:
 
 ```json
 PUT /my-nlp-index/_doc/1
 {
-  "text": "A man who is riding a wild horse in the rodeo is very near to falling off ."
+  "passage": "A man who is riding a wild horse in the rodeo is very near to falling off ."
 }
 ```
 {% include copy-curl.html %}
 
 ```json
 PUT /my-nlp-index/_doc/2
 {
-  "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse ."
+  "passage": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse ."
 }
 ```
 {% include copy-curl.html %}
 
 ```json
 PUT /my-nlp-index/_doc/3
 {
-  "text": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco ."
+  "passage": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco ."
 }
 ```
 {% include copy-curl.html %}
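The three single-document requests in this hunk could equivalently be sent as one `_bulk` request. This is not part of the changed tutorial text, just an equivalent sketch of the same ingestion using the renamed `passage` field:

```json
POST /_bulk
{ "index": { "_index": "my-nlp-index", "_id": "1" } }
{ "passage": "A man who is riding a wild horse in the rodeo is very near to falling off ." }
{ "index": { "_index": "my-nlp-index", "_id": "2" } }
{ "passage": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse ." }
{ "index": { "_index": "my-nlp-index", "_id": "3" } }
{ "passage": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco ." }
```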
@@ -228,23 +228,23 @@ The response contains the matching documents:
         "_id": "1",
         "_score": 0.015851952,
         "_source": {
-          "text": "A man who is riding a wild horse in the rodeo is very near to falling off ."
+          "passage": "A man who is riding a wild horse in the rodeo is very near to falling off ."
         }
       },
       {
         "_index": "my-nlp-index",
         "_id": "2",
         "_score": 0.015177963,
         "_source": {
-          "text": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse ."
+          "passage": "A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse ."
         }
       },
       {
         "_index": "my-nlp-index",
         "_id": "3",
         "_score": 0.011347729,
         "_source": {
-          "text": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco ."
+          "passage": "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco ."
         }
       }
     ]
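The hunk above shows only the search response, not the query that produced it. A query against the renamed field would use OpenSearch's `neural` query clause, roughly as follows; the `query_text` value and `k` are illustrative, and the `model_id` is the one registered earlier in the tutorial:

```json
GET /my-nlp-index/_search
{
  "_source": {
    "excludes": [
      "passage_embedding"
    ]
  },
  "query": {
    "neural": {
      "passage_embedding": {
        "query_text": "wild west",
        "model_id": "aVeif4oB5Vm0Tdw8zYO2",
        "k": 5
      }
    }
  }
}
```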
@@ -264,14 +264,14 @@ To register and deploy a model, select the built-in workflow template for the mo
 
 ### Step 2: Configure a workflow
 
-Create and provision a semantic search workflow. You must provide the model ID for the model deployed in the previous step. Review your selected workflow template [defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-defaults.json) to determine whether you need to update any of the parameters. For example, if the model dimensionality is different from the default (`1024`), specify the dimensionality of your model in the `output_dimension` parameter. Change the workflow template default text field from `passage_text` to `text` in order to match the manual example:
+Create and provision a semantic search workflow. You must provide the model ID for the model deployed in the previous step. Review your selected workflow template [defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-defaults.json) to determine whether you need to update any of the parameters. For example, if the model dimensionality is different from the default (`1024`), specify the dimensionality of your model in the `output_dimension` parameter. Change the workflow template default text field from `passage_text` to `passage` in order to match the manual example:
 
 ```json
 POST /_plugins/_flow_framework/workflow?use_case=semantic_search&provision=true