
Commit 05b947f

Merge pull request #5233 from MicrosoftDocs/main
Publish to live, Wednesday 4AM PST, 5/28
2 parents 51eec00 + 3043676 commit 05b947f

9 files changed (+92 lines, -53 lines)

articles/search/cognitive-search-concept-annotations-syntax.md

Lines changed: 10 additions & 10 deletions
@@ -8,24 +8,24 @@ ms.service: azure-ai-search
ms.custom:
  - ignite-2023
ms.topic: how-to
- ms.date: 12/10/2024
+ ms.date: 05/27/2025
---

# Reference a path to enriched nodes using context and source properties in an Azure AI Search skillset

During skillset execution, the engine builds an in-memory [enrichment tree](cognitive-search-working-with-skillsets.md#enrichment-tree) that captures each enrichment, such as recognized entities or translated text. In this article, learn how to reference an enrichment node in the enrichment tree so that you can pass output to downstream skills or specify an output field mapping for a search index field.

- This article uses examples to illustrate various scenarios. For the full syntax, see [Skill context and input annotation language language](cognitive-search-skill-annotation-language.md).
+ This article uses examples to illustrate various scenarios. For the full syntax, see [Skill context and input annotation language](cognitive-search-skill-annotation-language.md).

## Background concepts

Before reviewing the syntax, let's revisit a few important concepts to better understand the examples provided later in this article.

| Term | Description |
|------|-------------|
- | "enriched document" | An enriched document is an in-memory structure that collects skill output as it's created and it holds all enrichments related to a document. Think of an enriched document as a tree. Generally, the tree starts at the root document level, and each new enrichment is created from a previous as its child. |
- | "node" | Within an enriched document, a node (sometimes referred to as an "annotation") is created and populated by a skill, such as "text" and "layoutText" in the OCR skill. An enriched document is populated with both enrichments and original source field values or metadata copied from the source. |
- | "context" | The scope of enrichment, which is either the entire document, a portion of a document, or if you're working with images, the extracted images from a document. By default, the enrichment context is at the `"/document"` level, scoped to individual documents contained in the data source. When a skill runs, the outputs of that skill become [properties of the defined context](#example-2). |
+ | "enriched document" | An enriched document is an in-memory structure that collects skill output as it's created and it holds all enrichments related to a document. Think of an enriched document as a tree. Generally, the tree starts at the root document level, and each new enrichment is created from a previous node as its child. |
+ | "node" | Within an enriched document, a node (sometimes referred to as an "annotation") is specific output such as the "text" or "layoutText" of the OCR skill, or an original source field value such as the content of a product ID field, or metadata copied from the source such as metadata_storage_path from blobs in Azure Storage. |
+ | "context" | The scope of enrichment, which is either the entire document, a portion of a document (pages or sentences), or if you're working with images, the extracted images from a document. By default, the enrichment context is at the `"/document"` level, scoped to individual documents contained in the data source. When a skill runs, the outputs of that skill become [properties of the defined context](#example-2). |

## Paths for different scenarios

@@ -37,7 +37,7 @@ The example in the screenshot illustrates the path for an item in an Azure Cosmo

+ `context` path is `/document/HotelId` because the collection is partitioned into documents by the `/HotelId` field.

- + `source` path is `/document/Description` because the skill is a translation skill, and the field that you'll want the skill to translate is the `Description` field in each document.
+ + `source` path is `/document/Description` because the skill is a translation skill, and the field that you want to translate is the `Description` field in each document.

All paths start with `/document`. An enriched document is created in the "document cracking" stage of indexer execution, when the indexer opens a document or reads in a row from the data source. Initially, the only node in an enriched document is the [root node (`/document`)](cognitive-search-skill-annotation-language.md#document-root), and it's the node from which all other enrichments occur.

@@ -47,7 +47,7 @@ The following list includes several common examples:
+ `/document/{key}` is the syntax for a document or item in an Azure Cosmos DB collection, where `{key}` is the actual key, such as `/document/HotelId` in the previous example.
+ `/document/content` specifies the "content" property of a JSON blob.
+ `/document/{field}` is the syntax for an operation performed on a specific field, such as translating the `/document/Description` field, seen in the previous example.
- + `/document/pages/*` or `/document/sentences/*` become the context if you're breaking a large document into smaller chunks for processing. If "context" is `/document/pages/*`, the skill executes once over each page in the document. Because there might be more than one page or sentence, you'll append `/*` to catch them all.
+ + `/document/pages/*` or `/document/sentences/*` become the context if you're breaking a large document into smaller chunks for processing. If "context" is `/document/pages/*`, the skill executes once over each page in the document. Because there might be more than one page or sentence, you can append `/*` to catch them all.
+ `/document/normalized_images/*` is created during document cracking if the document contains images. All paths to images start with normalized_images. Since there are often multiple images embedded in a document, append `/*`.

Examples in the remainder of this article are based on the "content" field generated automatically by [Azure blob indexers](search-howto-indexing-azure-blob-storage.md) as part of the [document cracking](search-indexer-overview.md#document-cracking) phase. When referring to documents from a Blob container, use a format such as `"/document/content"`, where the "content" field is part of the "document".
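To make the `/document/pages/*` context in the list above concrete, here's a minimal sketch of a Text Split skill that chunks `/document/content` into pages; the `maximumPageLength` value and the `pages` target name are illustrative choices, not requirements. A downstream skill can then set `"context": "/document/pages/*"` to run once per chunk.

```json
{
  "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
  "description": "Chunk /document/content so that downstream skills can use /document/pages/* as their context",
  "context": "/document",
  "textSplitMode": "pages",
  "maximumPageLength": 2000,
  "inputs": [
    { "name": "text", "source": "/document/content" }
  ],
  "outputs": [
    { "name": "textItems", "targetName": "pages" }
  ]
}
```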
@@ -56,7 +56,7 @@ Examples in the remainder of this article are based on the "content" field gener

## Example 1: Simple annotation reference

- In Azure Blob Storage, suppose you have a variety of files containing references to people's names that you want to extract using entity recognition. In the following skill definition, `"/document/content"` is the textual representation of the entire document, and "people" is an extraction of full names for entities identified as persons.
+ In Azure Blob Storage, suppose you have various files containing references to people's names that you want to extract using entity recognition. In the following skill definition, `"/document/content"` is the textual representation of the entire document, and "people" is an extraction of full names for entities identified as persons.

Because the default context is `"/document"`, the list of people can now be referenced as `"/document/people"`. In this specific case `"/document/people"` is an annotation, which could now be mapped to a field in an index, or used in another skill in the same skillset.
6262

@@ -110,15 +110,15 @@ To invoke the right number of iterations, set the context as `"/document/people/
}
```

- When annotations are arrays or collections of strings, you might want to target specific members rather than the array as a whole. The above example generates an annotation called `"last"` under each node represented by the context. If you want to refer to this family of annotations, you could use the syntax `"/document/people/*/last"`. If you want to refer to a particular annotation, you could use an explicit index: `"/document/people/1/last`" to reference the last name of the first person identified in the document. Notice that in this syntax arrays are "0 indexed".
+ When annotations are arrays or collections of strings, you might want to target specific members rather than the array as a whole. The previous example generates an annotation called `"last"` under each node represented by the context. If you want to refer to this family of annotations, you could use the syntax `"/document/people/*/last"`. If you want to refer to a particular annotation, you could use an explicit index: `"/document/people/1/last"` to reference the last name of the second person identified in the document. Notice that in this syntax, arrays are zero-indexed.
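As a usage sketch, the same family path can also drive an output field mapping in the indexer definition; `lastNames` is a hypothetical string collection field in the target index.

```json
"outputFieldMappings": [
  {
    "sourceFieldName": "/document/people/*/last",
    "targetFieldName": "lastNames"
  }
]
```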

<a name="example-3"></a>

## Example 3: Reference members within an array

Sometimes you need to group all annotations of a particular type to pass them to a specific skill. Consider a hypothetical custom skill that identifies the most common last name from all the last names extracted in Example 2. To provide just the last names to the custom skill, specify the context as `"/document"` and the input as `"/document/people/*/lastname"`.

- Notice that the cardinality of `"/document/people/*/lastname"` is larger than that of document. There may be 10 lastname nodes while there's only one document node for this document. In that case, the system will automatically create an array of `"/document/people/*/lastname"` containing all of the elements in the document.
+ Notice that the cardinality of `"/document/people/*/lastname"` is larger than that of document. There might be 10 lastname nodes while there's only one document node for this document. In that case, the system automatically creates an array of `"/document/people/*/lastname"` containing all of the elements in the document.

```json
{

articles/search/cognitive-search-skill-document-extraction.md

Lines changed: 11 additions & 3 deletions
@@ -10,15 +10,23 @@ ms.service: azure-ai-search
ms.custom:
  - ignite-2023
ms.topic: reference
- ms.date: 12/12/2021
+ ms.date: 05/27/2025
---
+
# Document Extraction cognitive skill

- The **Document Extraction** skill extracts content from a file within the enrichment pipeline. This allows you to take advantage of the document extraction step that normally happens before the skillset execution with files that may be generated by other skills.
+ The **Document Extraction** skill extracts content from a file within the enrichment pipeline. By default, content extraction or retrieval is built into the indexer pipeline. However, by using the Document Extraction skill, you can control how parameters are set, and how extracted content is named in the enrichment tree.
+
+ For [vector](vector-search-overview.md) and [multimodal search](multimodal-search-overview.md), Document Extraction combined with the [Text Split skill](cognitive-search-skill-textsplit.md) is more affordable than other [data chunking approaches](vector-search-how-to-chunk-documents.md). The following tutorials demonstrate skill usage for different scenarios:
+
+ + [Tutorial: Index mixed content using multimodal embeddings and the Document Extraction skill](tutorial-multimodal-indexing-with-embedding-and-doc-extraction.md)
+
+ + [Tutorial: Index mixed content using image verbalizations and the Document Extraction skill](tutorial-multimodal-indexing-with-image-verbalization-and-doc-extraction.md)

> [!NOTE]
> This skill isn't bound to Azure AI services and has no Azure AI services key requirement.
- > This skill extracts text and images. Text extraction is free. Image extraction is [metered by Azure AI Search](https://azure.microsoft.com/pricing/details/search/). On a free search service, the cost of 20 transactions per indexer per day is absorbed so that you can complete quickstarts, tutorials, and small projects at no charge. For Basic, Standard, and above, image extraction is billable.
+ >
+ > This skill extracts text and images. Text extraction is free. Image extraction is [billable by Azure AI Search](https://azure.microsoft.com/pricing/details/search/). On a free search service, the cost of 20 transactions per indexer per day is absorbed so that you can complete quickstarts, tutorials, and small projects at no charge. For basic and higher tiers, image extraction is billable.
>
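For orientation before the reference details that follow, here's a minimal sketch of a Document Extraction skill definition; the parameter values and target names are illustrative defaults rather than requirements.

```json
{
  "@odata.type": "#Microsoft.Skills.Util.DocumentExtractionSkill",
  "description": "Extract content and images from the file data supplied by the indexer",
  "context": "/document",
  "parsingMode": "default",
  "dataToExtract": "contentAndMetadata",
  "configuration": {
    "imageAction": "generateNormalizedImages"
  },
  "inputs": [
    { "name": "file_data", "source": "/document/file_data" }
  ],
  "outputs": [
    { "name": "content", "targetName": "extracted_content" },
    { "name": "normalized_images", "targetName": "extracted_normalized_images" }
  ]
}
```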
## @odata.type
