articles/search/search-howto-create-indexers.md
+9 -7 (9 additions & 7 deletions)
@@ -18,17 +18,17 @@ A search indexer connects to an external data source, retrieves and processes da
18
18
19
19
+ Text-based indexing, extracting strings and metadata for full text search scenarios.
20
20
21
-
+ AI-enriched indexing, applying integrated machine learning and AI models to analyze content that isn't otherwise searchable, such as images and large undifferentiated text.
21
+
+ [AI-enriched indexing](cognitive-search-concept-intro.md), applying integrated machine learning and AI models to analyze content that isn't otherwise searchable, such as images and large undifferentiated text.
22
22
23
23
Using indexers significantly reduces the quantity and complexity of the code you need to write. This article focuses on the basics of creating an indexer. Depending on the data source and your workflow, more configuration might be necessary.
24
24
25
25
## Indexer definitions
26
26
27
-
When you create an indexer, the definition will adhere to one of two patterns: text-based indexing or AI enrichment with skills.
27
+
When you create an indexer, the definition will adhere to one of two patterns: text-based indexing or AI enrichment with skills. The only difference is that an indexer that invokes AI enrichment has more definitions.
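To make the "more definitions" point concrete, here is a minimal sketch that builds both kinds of indexer definition as Python dictionaries, ready to serialize into a request body. The property names (`dataSourceName`, `targetIndexName`, `fieldMappings`, `skillsetName`, `outputFieldMappings`) are the ones the Create Indexer REST API is understood to use; the object names (`hotels-ds`, `hotels-idx`, `hotels-skillset`) and the mapped field path are hypothetical placeholders, not values from this article.

```python
import json


def build_text_indexer(name, data_source_name, target_index_name):
    """Return a minimal definition for text-based indexing."""
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
        "fieldMappings": [],  # optional; empty means fields are matched by name
    }


def build_enriched_indexer(name, data_source_name, target_index_name, skillset_name):
    """Return the same definition plus the properties that AI enrichment adds."""
    definition = build_text_indexer(name, data_source_name, target_index_name)
    definition["skillsetName"] = skillset_name
    definition["outputFieldMappings"] = [
        # Hypothetical enrichment output mapped to a hypothetical index field.
        {"sourceFieldName": "/document/content/keyphrases", "targetFieldName": "keyphrases"}
    ]
    return definition


text_def = build_text_indexer("hotels-indexer", "hotels-ds", "hotels-idx")
enriched_def = build_enriched_indexer(
    "hotels-indexer", "hotels-ds", "hotels-idx", "hotels-skillset"
)

# The enriched definition is a superset of the text-based one.
extra_keys = sorted(set(enriched_def) - set(text_def))
print(extra_keys)
print(json.dumps(enriched_def, indent=2))
```

Building the enriched definition on top of the text-based one mirrors the sentence above: the core properties stay the same, and enrichment only adds to them.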
28
28
29
29
### Indexer definition for full text search
30
30
31
-
Full text search is the primary use case for indexers, and for this workflow, an indexer composition will be similar to the following example.
31
+
Full text search is the primary use case for indexers, and for this workflow, an indexer will look like this example.
32
32
33
33
```json
34
34
{
@@ -86,12 +86,14 @@ Indexers also drive [AI enrichment](cognitive-search-concept-intro.md). All of t
86
86
}
87
87
```
88
88
89
-
AI enrichment is out of scope for this article. For more information, start with [Skillsets in Azure Cognitive Search](cognitive-search-working-with-skillsets.md), [Create a skillset](cognitive-search-defining-skillset.md), [Map enrichment output fields](cognitive-search-output-field-mapping.md), and [Enable caching for AI enrichment](search-howto-incremental-index.md).
89
+
AI enrichment is out of scope for this article. For more information, start with [AI enrichment](cognitive-search-concept-intro.md), [Skillsets in Azure Cognitive Search](cognitive-search-working-with-skillsets.md), [Create a skillset](cognitive-search-defining-skillset.md), [Map enrichment output fields](cognitive-search-output-field-mapping.md), and [Enable caching for AI enrichment](search-howto-incremental-index.md).
90
90
91
91
## Prerequisites
92
92
93
93
+ Identify a [supported data source](search-indexer-overview.md#supported-data-sources) that contains the content you want to ingest.
94
94
95
+
+ [Create an indexer data source](#prepare-a-data-source) that sets up a connection to external data.
96
+
95
97
+ [Create a search index](search-how-to-create-search-index.md) that can accept incoming data.
96
98
97
99
+ Be under the [maximum limits](search-limits-quotas-capacity.md#indexer-limits) for your service tier. The Free tier allows three objects of each type and 1-3 minutes of indexer processing or 3-10 if there's a skillset.
@@ -103,7 +105,7 @@ Indexers work with data sets. When you run an indexer, it connects to your data
103
105
| Source data | Tasks |
104
106
|-------------|-------|
105
107
| JSON documents | Make sure the structure or shape of incoming data corresponds to the schema of your search index. Most search indexes are fairly flat, where the fields collection consists of fields at the same level. However, hierarchical or nested structures are possible through [complex fields and collections](search-howto-complex-data-types.md). |
106
-
| Relational | You'll need to provide it as a flattened row set, where each row becomes a full or partial search document in the index. </p>To flatten relational data into a row set, you should create a SQL view, or build a query that returns parent and child records in the same row. For example, the built-in hotels sample dataset is an SQL database that has 50 records (one for each hotel), linked to room records in a related table. The query that flattens the collective data into a row set embeds all of the room information in JSON documents in each hotel record. The embedded room information is a generated by a query that uses a **FOR JSON AUTO** clause. </p> You can learn more about this technique in [define a query that returns embedded JSON](index-sql-relational-data.md#define-a-query-that-returns-embedded-json). This is just one example; you can find other approaches that will produce the same result. |
108
+
| Relational | You'll need to provide it as a flattened row set, where each row becomes a full or partial search document in the index. </p>To flatten relational data into a row set, you should create a SQL view, or build a query that returns parent and child records in the same row. For example, the built-in hotels sample dataset is an SQL database that has 50 records (one for each hotel), linked to room records in a related table. The query that flattens the collective data into a row set embeds all of the room information in JSON documents in each hotel record. The embedded room information is a generated by a query that uses a **FOR JSON AUTO** clause. </p> You can learn more about this technique in [define a query that returns embedded JSON](index-sql-relational-data.md#define-a-query-that-returns-embedded-json). This is just one example; you can find other approaches that will produce the same result. |
107
109
| Files | An indexer generally creates one search document for each file, where the search document consists of fields for content and metadata. Depending on the file type, the indexer can sometimes [parse one file into multiple search documents](search-howto-index-one-to-many-blobs.md). For example, in a CSV file, each row can become a standalone search document. |
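The flattening described in the relational row can be pictured with a small in-memory analogue. This sketch is not the SQL query used by the hotels sample; it only imitates the shape of the result in Python, where each parent row carries its child rows as an embedded JSON string. The table contents and column names (`HotelId`, `HotelName`, `Type`, `BaseRate`, `Rooms`) are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical parent and child tables.
hotels = [
    {"HotelId": "1", "HotelName": "Sample Hotel"},
    {"HotelId": "2", "HotelName": "Other Hotel"},
]
rooms = [
    {"HotelId": "1", "Type": "Suite", "BaseRate": 250.0},
    {"HotelId": "1", "Type": "Standard", "BaseRate": 120.0},
    {"HotelId": "2", "Type": "Standard", "BaseRate": 99.0},
]


def flatten(parents, children, key):
    """Return one row per parent, with its children embedded as a JSON string."""
    children_by_parent = defaultdict(list)
    for child in children:
        # Drop the join key from the embedded copy; the parent row already has it.
        embedded = {k: v for k, v in child.items() if k != key}
        children_by_parent[child[key]].append(embedded)
    return [
        {**parent, "Rooms": json.dumps(children_by_parent[parent[key]])}
        for parent in parents
    ]


rows = flatten(hotels, rooms, key="HotelId")
for row in rows:
    print(row)
```

Each element of `rows` corresponds to one search document: one hotel, with all of its room information carried in a single column.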
108
110
109
111
Remember that you'll only need to pull in searchable and filterable data:
@@ -117,11 +119,11 @@ Given that indexers don't fix data problems, other forms of data cleansing or ma
117
119
118
120
## Prepare a data source
119
121
120
-
Indexers require a data source that specifies the type, location, and connection information.
122
+
Indexers require a data source that specifies the type, container, and connection.
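As a rough illustration of those three parts, the sketch below assembles a data source definition as a Python dictionary. The property names (`type`, `credentials.connectionString`, `container.name`, `container.query`) follow the Create Data Source REST API as commonly documented; the data source name, the `azuresql` type choice, the view name, and the connection string placeholder are assumptions made for the example.

```python
import json


def build_data_source(name, source_type, connection_string, container_name, query=None):
    """Return a data source definition with its type, container, and connection."""
    container = {"name": container_name}
    if query is not None:
        # Some source types accept an optional query that scopes the container.
        container["query"] = query
    return {
        "name": name,
        "type": source_type,
        "credentials": {"connectionString": connection_string},
        "container": container,
    }


data_source = build_data_source(
    name="hotels-ds",                              # hypothetical name
    source_type="azuresql",                        # one of the supported types
    connection_string="<your-connection-string>",  # placeholder, not a real value
    container_name="HotelsView",                   # hypothetical table or view
)
print(json.dumps(data_source, indent=2))
```

The resulting JSON could be sent as the body of a Create Data Source request by whichever REST client you use.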
121
123
122
124
1. Make sure you're using a [supported data source type](search-indexer-overview.md#supported-data-sources).
123
125
124
-
1. [Create a data source](/rest/api/searchservice/create-data-source). The following list is a few of the more frequently used data sources:
126
+
1. [Create a data source](/rest/api/searchservice/create-data-source) definition. The following list is a few of the more frequently used data sources:
articles/search/search-howto-index-cosmosdb-gremlin.md
+3 -1 (3 additions & 1 deletion)
@@ -33,7 +33,9 @@ Although Cosmos DB indexing is easiest with the [Import data wizard](search-impo
33
33
34
34
+ An [automatic indexing policy](../cosmos-db/index-policy.md) on the Cosmos DB collection, set to [Consistent](../cosmos-db/index-policy.md#indexing-mode). This is the default configuration. Lazy indexing isn't recommended and may result in missing data.
35
35
36
-
Unfamiliar with indexers? Start with [**Create an indexer**](search-howto-create-indexers.md) for more background.
36
+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Cosmos DB Account Reader Role** permissions.
37
+
38
+
Unfamiliar with indexers? See [**Create an indexer**](search-howto-create-indexers.md) before you get started.
articles/search/search-howto-index-cosmosdb-mongodb.md
+3 -1 (3 additions & 1 deletion)
@@ -29,7 +29,9 @@ Although Cosmos DB indexing is easiest with the [Import data wizard](search-impo
29
29
30
30
+ An [automatic indexing policy](../cosmos-db/index-policy.md) on the Cosmos DB collection, set to [Consistent](../cosmos-db/index-policy.md#indexing-mode). This is the default configuration. Lazy indexing isn't recommended and may result in missing data.
31
31
32
-
Unfamiliar with indexers? Start with [**Create an indexer**](search-howto-create-indexers.md) for more background.
32
+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Cosmos DB Account Reader Role** permissions.
33
+
34
+
Unfamiliar with indexers? See [**Create an indexer**](search-howto-create-indexers.md) before you get started.
articles/search/search-howto-index-cosmosdb.md
+3 -1 (3 additions & 1 deletion)
@@ -24,7 +24,9 @@ Although Cosmos DB indexing is easiest with the [Import data wizard](search-impo
24
24
25
25
+ An [automatic indexing policy](../cosmos-db/index-policy.md) on the Cosmos DB collection, set to [Consistent](../cosmos-db/index-policy.md#indexing-mode). This is the default configuration. Lazy indexing isn't recommended and may result in missing data.
26
26
27
-
Unfamiliar with indexers? Start with [**Create an indexer**](search-howto-create-indexers.md) for more background.
27
+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Cosmos DB Account Reader Role** permissions.
28
+
29
+
Unfamiliar with indexers? See [**Create an indexer**](search-howto-create-indexers.md) before you get started.
articles/search/search-howto-index-mysql.md
+5 -1 (5 additions & 1 deletion)
@@ -30,7 +30,11 @@ This article supplements [**Create an indexer**](search-howto-create-indexers.md
30
30
31
31
+ A table or view that provides the content. A primary key is required. If you're using a view, it must have a [high water mark column](#DataChangeDetectionPolicy).
32
32
33
-
+ A REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md) to send REST calls that create the data source, index, and indexer. You can also use the [Azure SDK for .NET](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype.mysql).
33
+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Reader** permissions on MySQL.
34
+
35
+
+ A REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md) to send REST calls that create the data source, index, and indexer.
36
+
37
+
You can also use the [Azure SDK for .NET](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype.mysql). You can't use the portal for indexer creation, but you can manage indexers and data sources once they're created.
articles/search/search-howto-managed-identities-cosmos-db.md
+1 -1 (1 addition & 1 deletion)
@@ -39,7 +39,7 @@ Create the data source and provide either a system-assigned managed identity or
39
39
40
40
### System-assigned managed identity
41
41
42
-
The [REST API](/rest/api/searchservice/create-data-source), Azure portal, and the [.NET SDK](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype) support using a system-assigned managed identity.
42
+
The [REST API](/rest/api/searchservice/create-data-source), Azure portal, and the [.NET SDK](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourceconnection) support using a system-assigned managed identity.
43
43
44
44
When you're connecting with a system-assigned managed identity, the only change to the data source definition is the format of the "credentials" property. You'll provide the database name and a ResourceId that has no account key or password. The ResourceId must include the subscription ID of Cosmos DB, the resource group, and the Cosmos DB account name.
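The credentials format described above can be sketched as a small helper that assembles the key-free connection string. The `Microsoft.DocumentDB/databaseAccounts` segment is the Azure resource provider path for Cosmos DB accounts; the exact ordering and punctuation of the connection string (`Database=...;ResourceId=...;`) is an assumption to verify against the data source reference, and all IDs and names below are placeholders.

```python
def cosmos_resource_id(subscription_id, resource_group, account_name):
    """Build the resource ID from the three parts the paragraph above lists."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DocumentDB/databaseAccounts/{account_name}"
    )


def managed_identity_credentials(database, subscription_id, resource_group, account_name):
    """Return a "credentials" object that contains no account key or password."""
    resource_id = cosmos_resource_id(subscription_id, resource_group, account_name)
    # Assumed layout: database name first, then the ResourceId.
    return {"connectionString": f"Database={database};ResourceId={resource_id};"}


credentials = managed_identity_credentials(
    database="hotels-db",                                     # placeholder
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
    resource_group="my-rg",                                   # placeholder
    account_name="my-cosmos",                                 # placeholder
)
print(credentials["connectionString"])
```

Because the string carries only identifiers, access depends entirely on the role assigned to the search service's managed identity.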
articles/search/search-howto-managed-identities-data-sources.md
+3 -3 (3 additions & 3 deletions)
@@ -102,12 +102,12 @@ See [Create a search service with a system assigned managed identity (Azure CLI)
102
102
103
103
## Create a user managed identity (preview)
104
104
105
-
If you don't already have a user-assigned managed identity, you'll need to create one. A user-assigned managed identity is a resource on Azure.
105
+
A user-assigned managed identity is a resource on Azure. It's useful if you need more granularity in role assignments.
106
106
107
-
A user-assigned managed identity is useful if you need more precision in role assignments. You can create separate identifies for different applications and scenarios that are related to indexer-based indexing.
107
+
Currently in Azure Cognitive Search, user managed identities are supported only for indexer data connections. You can create separate identities for different applications and scenarios that are related to indexer-based indexing.
108
108
109
109
> [!IMPORTANT]
110
-
>This feature is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [Management REST API 2021-04-01-Preview](/rest/api/searchmanagement/2021-04-01-preview/services/create-or-update#searchcreateorupdateservicewithidentity) provides this feature.
110
+
>This feature is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).