articles/search/search-indexer-overview.md
Lines changed: 7 additions & 2 deletions
@@ -10,7 +10,7 @@ ms.service: azure-ai-search
 ms.custom:
 - ignite-2023
 ms.topic: conceptual
-ms.date: 12/19/2024
+ms.date: 04/09/2025
 ---
 
 # Indexers in Azure AI Search
# Indexers in Azure AI Search
@@ -81,7 +81,9 @@ For each document it receives, an indexer implements or coordinates multiple ste
 
 ### Stage 1: Document cracking
 
-Document cracking is the process of opening files and extracting content. Text-based content can be extracted from files on a service, rows in a table, or items in container or collection. If you add a skillset and [image skills](cognitive-search-concept-image-scenarios.md), document cracking can also extract images and queue them for image processing.
+Document cracking is the process of opening files and extracting content. Text-based content can be extracted from files on a service, rows in a table, or items in container or collection.
+
+You can also enable image extraction during document cracking for an [extra fee](https://azure.microsoft.com/en-us/pricing/details/search/). This is disabled by default and can be enabled via the `imageAction` property in the [indexer parameters configuration](/rest/api/searchservice/indexers/create-or-update).
 
 Depending on the data source, the indexer will try different operations to extract potentially indexable content:
 
@@ -91,6 +93,9 @@ Depending on the data source, the indexer will try different operations to extra
 
 + When the document is a record in [Azure Cosmos DB](search-howto-index-cosmosdb.md), the indexer will extract non-binary content from fields and subfields from the Azure Cosmos DB document.
 
+Note that the document cracking process can also be triggered later during the optional [skillset execution](cognitive-search-concept-intro.md) stage, using skillsets, for data transformation. Adding a skillset with [image skills](cognitive-search-concept-image-scenarios.md) allows document cracking to extract images and queue them for processing.
+
+
 ### Stage 2: Field mappings
 
 An indexer extracts text from a source field and sends it to a destination field in an index or knowledge store. When field names and data types coincide, the path is clear. However, you might want different names or types in the output, in which case you need to tell the indexer how to map the field.
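
Note on the `imageAction` addition in Stage 1: the new paragraph says image extraction is off by default and is turned on through the indexer parameters configuration. A minimal sketch of what such an indexer definition might look like follows, built as a plain Python dictionary that could be serialized into the body of a Create or Update Indexer request. The helper name `build_indexer_definition` and the resource names (`blob-indexer`, `blob-datasource`, `docs-index`) are hypothetical and used only for illustration. The `imageAction` values listed are the ones I understand the REST API to accept; treat them as an assumption to verify against the linked reference.

```python
import json

# Assumed set of accepted imageAction values; "none" leaves image extraction disabled.
IMAGE_ACTIONS = {"none", "generateNormalizedImages", "generateNormalizedImagePerPage"}


def build_indexer_definition(name, data_source, index, image_action="none"):
    """Return an indexer definition with imageAction under parameters.configuration.

    Hypothetical helper for illustration; it only builds the payload and
    does not call the search service.
    """
    if image_action not in IMAGE_ACTIONS:
        raise ValueError(f"unsupported imageAction: {image_action!r}")
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "parameters": {
            "configuration": {
                # Extract both content and metadata so embedded images are reachable.
                "dataToExtract": "contentAndMetadata",
                "imageAction": image_action,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical resource names; replace with real ones before sending a request.
    definition = build_indexer_definition(
        "blob-indexer", "blob-datasource", "docs-index", "generateNormalizedImages"
    )
    print(json.dumps(definition, indent=2))
```

Keeping the default at `"none"` mirrors the "disabled by default" wording in the added paragraph, so a caller has to opt in explicitly before any image extraction charges could apply.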
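
Note on the Stage 2 context line: the paragraph says that when source and destination names or types differ, the indexer has to be told how to map the field. As a rough sketch, and not part of this change, a field mapping entry in an indexer definition could be assembled as below. The helper `build_field_mapping` and the field names are hypothetical; the `sourceFieldName`, `targetFieldName`, and `mappingFunction` keys follow the indexer `fieldMappings` shape as I understand it from the REST reference.

```python
def build_field_mapping(source_field, target_field, mapping_function=None):
    """Return one fieldMappings entry for an indexer definition.

    Hypothetical helper for illustration; mapping_function is the name of an
    optional built-in mapping function such as "base64Encode".
    """
    mapping = {"sourceFieldName": source_field, "targetFieldName": target_field}
    if mapping_function is not None:
        mapping["mappingFunction"] = {"name": mapping_function}
    return mapping


if __name__ == "__main__":
    # Hypothetical field names: rename a source column to a differently named index field.
    field_mappings = [
        build_field_mapping("metadata_storage_path", "id", "base64Encode"),
        build_field_mapping("doc_title", "title"),
    ]
    print(field_mappings)
```

The resulting list would sit under a `fieldMappings` key alongside `parameters` in the indexer definition.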