
Commit e40ac70

Merge pull request #188712 from HeidiSteen/heidist-fresh2
prereq consistency pass
2 parents a305eb2 + 9abd6ee commit e40ac70

7 files changed, +27 -15 lines changed

articles/search/search-howto-create-indexers.md

Lines changed: 9 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -18,17 +18,17 @@ A search indexer connects to an external data source, retrieves and processes da
1818

1919
+ Text-based indexing, extracting strings and metadata for full text search scenarios.
2020

21-
+ AI-enriched indexing, applying integrated machine learning and AI models to analyze content that isn't otherwise searchable, such as images and large undifferentiated text.
21+
+ [AI-enriched indexing](cognitive-search-concept-intro.md), applying integrated machine learning and AI models to analyze content that isn't otherwise searchable, such as images and large undifferentiated text.
2222

2323
Using indexers significantly reduces the quantity and complexity of the code you need to write. This article focuses on the basics of creating an indexer. Depending on the data source and your workflow, more configuration might be necessary.
2424

2525
## Indexer definitions
2626

27-
When you create an indexer, the definition will adhere to one of two patterns: text-based indexing or AI enrichment with skills.
27+
When you create an indexer, the definition will adhere to one of two patterns: text-based indexing or AI enrichment with skills. The only difference is that an indexer that invokes AI enrichment has more definitions.
2828

2929
### Indexer definition for full text search
3030

31-
Full text search is the primary use case for indexers, and for this workflow, an indexer composition will be similar to the following example.
31+
Full text search is the primary use case for indexers, and for this workflow, an indexer will look like this example.
3232

3333
```json
3434
{
@@ -86,12 +86,14 @@ Indexers also drive [AI enrichment](cognitive-search-concept-intro.md). All of t
8686
}
8787
```
8888

89-
AI enrichment is out of scope for this article. For more information, start with [Skillsets in Azure Cognitive Search](cognitive-search-working-with-skillsets.md), [Create a skillset](cognitive-search-defining-skillset.md), [Map enrichment output fields](cognitive-search-output-field-mapping.md), and [Enable caching for AI enrichment](search-howto-incremental-index.md).
89+
AI enrichment is out of scope for this article. For more information, start with [AI enrichment](cognitive-search-concept-intro.md), [Skillsets in Azure Cognitive Search](cognitive-search-working-with-skillsets.md), [Create a skillset](cognitive-search-defining-skillset.md), [Map enrichment output fields](cognitive-search-output-field-mapping.md), and [Enable caching for AI enrichment](search-howto-incremental-index.md).
9090

9191
## Prerequisites
9292

9393
+ Identify a [supported data source](search-indexer-overview.md#supported-data-sources) that contains the content you want to ingest.
9494

95+
+ [Create an indexer data source](#prepare-a-data-source) that sets up a connection to external data.
96+
9597
+ [Create a search index](search-how-to-create-search-index.md) that can accept incoming data.
9698

9799
+ Be under the [maximum limits](search-limits-quotas-capacity.md#indexer-limits) for your service tier. The Free tier allows three objects of each type and 1-3 minutes of indexer processing or 3-10 if there's a skillset.
@@ -103,7 +105,7 @@ Indexers work with data sets. When you run an indexer, it connects to your data
103105
| Source data | Tasks |
104106
|-------------|-------|
105107
| JSON documents | Make sure the structure or shape of incoming data corresponds to the schema of your search index. Most search indexes are fairly flat, where the fields collection consists of fields at the same level. However, hierarchical or nested structures are possible through [complex fields and collections](search-howto-complex-data-types.md). |
106-
| Relational | You'll need to provide it as a flattened row set, where each row becomes a full or partial search document in the index. </p>To flatten relational data into a row set, you should create a SQL view, or build a query that returns parent and child records in the same row. For example, the built-in hotels sample dataset is an SQL database that has 50 records (one for each hotel), linked to room records in a related table. The query that flattens the collective data into a row set embeds all of the room information in JSON documents in each hotel record. The embedded room information is generated by a query that uses a **FOR JSON AUTO** clause. </p> You can learn more about this technique in [define a query that returns embedded JSON](index-sql-relational-data.md#define-a-query-that-returns-embedded-json). This is just one example; you can find other approaches that will produce the same result. |
108+
| Relational | You'll need to provide it as a flattened row set, where each row becomes a full or partial search document in the index. </p> To flatten relational data into a row set, you should create a SQL view, or build a query that returns parent and child records in the same row. For example, the built-in hotels sample dataset is an SQL database that has 50 records (one for each hotel), linked to room records in a related table. The query that flattens the collective data into a row set embeds all of the room information in JSON documents in each hotel record. The embedded room information is generated by a query that uses a **FOR JSON AUTO** clause. </p> You can learn more about this technique in [define a query that returns embedded JSON](index-sql-relational-data.md#define-a-query-that-returns-embedded-json). This is just one example; you can find other approaches that will produce the same result. |
107109
| Files | An indexer generally creates one search document for each file, where the search document consists of fields for content and metadata. Depending on the file type, the indexer can sometimes [parse one file into multiple search documents](search-howto-index-one-to-many-blobs.md). For example, in a CSV file, each row can become a standalone search document. |
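
The relational row in the table above describes flattening parent and child records into one row per parent, with the child records embedded as JSON. As a rough illustration of that shape only (not the SQL view or **FOR JSON AUTO** query the article refers to), the following Python sketch groups hypothetical room rows under their hotel. The table contents, column names, and the `flatten` helper are all assumptions made for the example.

```python
import json
from collections import defaultdict

# Hypothetical sample rows standing in for a Hotels table and a Rooms table.
hotels = [
    {"HotelId": 1, "HotelName": "Hotel A"},
    {"HotelId": 2, "HotelName": "Hotel B"},
]
rooms = [
    {"HotelId": 1, "Type": "Suite", "BaseRate": 250.0},
    {"HotelId": 1, "Type": "Standard", "BaseRate": 120.0},
    {"HotelId": 2, "Type": "Standard", "BaseRate": 99.0},
]


def flatten(hotels, rooms):
    """Return one row per hotel, with its rooms embedded as a JSON string."""
    rooms_by_hotel = defaultdict(list)
    for room in rooms:
        # Drop the foreign key; it is redundant once the room is nested.
        child = {k: v for k, v in room.items() if k != "HotelId"}
        rooms_by_hotel[room["HotelId"]].append(child)
    return [
        {**hotel, "Rooms": json.dumps(rooms_by_hotel.get(hotel["HotelId"], []))}
        for hotel in hotels
    ]


flattened = flatten(hotels, rooms)
for row in flattened:
    print(row["HotelName"], row["Rooms"])
```

Each element of `flattened` corresponds to one candidate search document, which is the property the table row asks for. In a real database the same grouping would normally be done in the query itself.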
108110

109111
Remember that you'll only need to pull in searchable and filterable data:
@@ -117,11 +119,11 @@ Given that indexers don't fix data problems, other forms of data cleansing or ma
117119

118120
## Prepare a data source
119121

120-
Indexers require a data source that specifies the type, location, and connection information.
122+
Indexers require a data source that specifies the type, container, and connection.
121123

122124
1. Make sure you're using a [supported data source type](search-indexer-overview.md#supported-data-sources).
123125

124-
1. [Create a data source](/rest/api/searchservice/create-data-source). The following list is a few of the more frequently used data sources:
126+
1. [Create a data source](/rest/api/searchservice/create-data-source) definition. The following list is a few of the more frequently used data sources:
125127

126128
+ [Azure Blob Storage](search-howto-indexing-azure-blob-storage.md)
127129
+ [Azure Cosmos DB](search-howto-index-cosmosdb.md)
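
As a minimal sketch of what "type, container, and connection" can look like in a data source definition, the following Python snippet assembles a request body. The property names reflect the Create Data Source REST API as commonly documented, but treat them as an assumption and check the linked reference. The `build_data_source` helper and all values are hypothetical placeholders.

```python
import json


def build_data_source(name, source_type, connection_string, container_name):
    """Assemble a data source definition with type, container, and connection."""
    return {
        "name": name,
        "type": source_type,  # for example "azureblob"; confirm the value for your source
        "credentials": {"connectionString": connection_string},
        "container": {"name": container_name},
    }


definition = build_data_source(
    name="my-blob-datasource",                             # hypothetical name
    source_type="azureblob",
    connection_string="<your-storage-connection-string>",  # placeholder, not a real secret
    container_name="my-container",                         # hypothetical container
)
print(json.dumps(definition, indent=2))
```

The printed JSON is the kind of body you would send with a REST client when creating the data source. Nothing here calls the service.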

articles/search/search-howto-index-cosmosdb-gremlin.md

Lines changed: 3 additions & 1 deletion
@@ -33,7 +33,9 @@ Although Cosmos DB indexing is easiest with the [Import data wizard](search-impo
3333

3434
+ An [automatic indexing policy](../cosmos-db/index-policy.md) on the Cosmos DB collection, set to [Consistent](../cosmos-db/index-policy.md#indexing-mode). This is the default configuration. Lazy indexing isn't recommended and may result in missing data.
3535

36-
Unfamiliar with indexers? Start with [**Create an indexer**](search-howto-create-indexers.md) for more background.
36+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Cosmos DB Account Reader Role** permissions.
37+
38+
Unfamiliar with indexers? See [**Create an indexer**](search-howto-create-indexers.md) before you get started.
3739

3840
## Define the data source
3941

articles/search/search-howto-index-cosmosdb-mongodb.md

Lines changed: 3 additions & 1 deletion
@@ -29,7 +29,9 @@ Although Cosmos DB indexing is easiest with the [Import data wizard](search-impo
2929

3030
+ An [automatic indexing policy](../cosmos-db/index-policy.md) on the Cosmos DB collection, set to [Consistent](../cosmos-db/index-policy.md#indexing-mode). This is the default configuration. Lazy indexing isn't recommended and may result in missing data.
3131

32-
Unfamiliar with indexers? Start with [**Create an indexer**](search-howto-create-indexers.md) for more background.
32+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Cosmos DB Account Reader Role** permissions.
33+
34+
Unfamiliar with indexers? See [**Create an indexer**](search-howto-create-indexers.md) before you get started.
3335

3436
## Define the data source
3537

articles/search/search-howto-index-cosmosdb.md

Lines changed: 3 additions & 1 deletion
@@ -24,7 +24,9 @@ Although Cosmos DB indexing is easiest with the [Import data wizard](search-impo
2424

2525
+ An [automatic indexing policy](../cosmos-db/index-policy.md) on the Cosmos DB collection, set to [Consistent](../cosmos-db/index-policy.md#indexing-mode). This is the default configuration. Lazy indexing isn't recommended and may result in missing data.
2626

27-
Unfamiliar with indexers? Start with [**Create an indexer**](search-howto-create-indexers.md) for more background.
27+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Cosmos DB Account Reader Role** permissions.
28+
29+
Unfamiliar with indexers? See [**Create an indexer**](search-howto-create-indexers.md) before you get started.
2830

2931
## Define the data source
3032

articles/search/search-howto-index-mysql.md

Lines changed: 5 additions & 1 deletion
@@ -30,7 +30,11 @@ This article supplements [**Create an indexer**](search-howto-create-indexers.md
3030

3131
+ A table or view that provides the content. A primary key is required. If you're using a view, it must have a [high water mark column](#DataChangeDetectionPolicy).
3232

33-
+ A REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md) to send REST calls that create the data source, index, and indexer. You can also use the [Azure SDK for .NET](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype.mysql).
33+
+ Read permissions. A "full access" connection string includes a key that grants access to the content, but if you're using Azure roles, make sure the [search service managed identity](search-howto-managed-identities-data-sources.md) has **Reader** permissions on MySQL.
34+
35+
+ A REST client, such as [Postman](search-get-started-rest.md) or [Visual Studio Code with the extension for Azure Cognitive Search](search-get-started-vs-code.md) to send REST calls that create the data source, index, and indexer.
36+
37+
You can also use the [Azure SDK for .NET](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype.mysql). You can't use the portal for indexer creation, but you can manage indexers and data sources once they're created.
3438

3539
## Preview limitations
3640

articles/search/search-howto-managed-identities-cosmos-db.md

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ Create the data source and provide either a system-assigned managed identity or
3939

4040
### System-assigned managed identity
4141

42-
The [REST API](/rest/api/searchservice/create-data-source), Azure portal, and the [.NET SDK](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourcetype) support using a system-assigned managed identity.
42+
The [REST API](/rest/api/searchservice/create-data-source), Azure portal, and the [.NET SDK](/dotnet/api/azure.search.documents.indexes.models.searchindexerdatasourceconnection) support using a system-assigned managed identity.
4343

4444
When you're connecting with a system-assigned managed identity, the only change to the data source definition is the format of the "credentials" property. You'll provide the database name and a ResourceId that has no account key or password. The ResourceId must include the subscription ID of Cosmos DB, the resource group, and the Cosmos DB account name.
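
The paragraph above says the ResourceId carries the subscription ID, resource group, and Cosmos DB account name, alongside the database name and with no key or password. A small Python sketch of composing such a value follows. The exact string layout (`ResourceId=...;Database=...;`) and the `Microsoft.DocumentDB/databaseAccounts` provider path are assumptions based on standard Azure resource ID conventions, so confirm them against the article before relying on them. The helper name and sample values are hypothetical.

```python
def cosmos_resource_id_connection(subscription_id, resource_group, account_name, database):
    """Build a key-free connection value from a Cosmos DB resource ID and database name."""
    resource_id = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DocumentDB/databaseAccounts/{account_name}"
    )
    # No account key or password is included; the managed identity authenticates instead.
    return f"ResourceId={resource_id};Database={database};"


credentials = {
    "connectionString": cosmos_resource_id_connection(
        subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
        resource_group="my-resource-group",                      # hypothetical
        account_name="my-cosmos-account",                        # hypothetical
        database="my-database",                                  # hypothetical
    )
}
print(credentials["connectionString"])
```

The resulting `credentials` object would take the place of a key-based connection string in the data source definition.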
4545

articles/search/search-howto-managed-identities-data-sources.md

Lines changed: 3 additions & 3 deletions
@@ -102,12 +102,12 @@ See [Create a search service with a system assigned managed identity (Azure CLI)
102102
103103
## Create a user managed identity (preview)
104104
105-
If you don't already have a user-assigned managed identity, you'll need to create one. A user-assigned managed identity is a resource on Azure.
105+
A user-assigned managed identity is a resource on Azure. It's useful if you need more granularity in role assignments.
106106
107-
A user-assigned managed identity is useful if you need more precision in role assignments. You can create separate identifies for different applications and scenarios that are related to indexer-based indexing.
107+
Currently in Azure Cognitive Search, user managed identities are supported only for indexer data connections. You can create separate identities for different applications and scenarios that are related to indexer-based indexing.
108108
109109
> [!IMPORTANT]
110-
>This feature is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/). The [Management REST API 2021-04-01-Preview](/rest/api/searchmanagement/2021-04-01-preview/services/create-or-update#searchcreateorupdateservicewithidentity) provides this feature.
110+
>This feature is in public preview under [supplemental terms of use](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
111111
112112
### [**Azure portal**](#tab/portal-user)
113113
