Commit 6718948

authored
Merge pull request #174262 from HeidiSteen/heidist-fresh
[azure search] cross-link to ADLS Gen2 sample
2 parents 30a44e5 + 010f3ae commit 6718948

3 files changed

+55
-32
lines changed

articles/search/samples-dotnet.md

Lines changed: 2 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -8,7 +8,7 @@ author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
1010
ms.topic: conceptual
11-
ms.date: 01/27/2021
11+
ms.date: 10/01/2021
1212
---
1313

1414
# .NET (C#) code samples for Azure Cognitive Search
@@ -62,6 +62,7 @@ The following samples are also published by the Cognitive Search team, but are n
6262

6363
| Samples | Description |
6464
|---------|-------------|
65+
| [Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md) | Source code demonstrating indexer connections and indexing of Azure Data Lake Gen2 files and folders that are secured through Azure AD and role-based access controls. |
6566
| [azure-search-power-skills](https://github.com/Azure-Samples/azure-search-power-skills) | Source code for consumable custom skills that you can incorporate in your own solutions. |
6667
| [Knowledge Mining Solution Accelerator](/samples/azure-samples/azure-search-knowledge-mining/azure-search-knowledge-mining/) | Includes templates, support files, and analytical reports to help you prototype an end-to-end knowledge mining solution. |
6768
| [Covid-19 Search App repository](https://github.com/liamca/covid19search) | Source code repository for the Cognitive Search based [Covid-19 Search App](https://covid19search.azurewebsites.net/) |

articles/search/search-howto-index-azure-data-lake-storage.md

Lines changed: 19 additions & 12 deletions
@@ -7,7 +7,7 @@ author: markheff
77
ms.author: maheff
88
ms.service: cognitive-search
99
ms.topic: conceptual
10-
ms.date: 05/17/2021
10+
ms.date: 10/01/2021
1111
---
1212

1313
# Index data from Azure Data Lake Storage Gen2
@@ -16,13 +16,19 @@ This article shows you how to configure an Azure Data Lake Storage Gen2 indexer
1616

1717
Azure Data Lake Storage Gen2 is available through Azure Storage. When setting up an Azure storage account, you have the option to enable [hierarchical namespace](../storage/blobs/data-lake-storage-namespace.md). This allows the collection of content in an account to be organized into a hierarchy of directories and nested subdirectories. By enabling hierarchical namespace, you enable [Azure Data Lake Storage Gen2](../storage/blobs/data-lake-storage-introduction.md).
1818

19+
Examples in this article use the portal and REST APIs. For examples in C#, see [Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md) on GitHub.
20+
1921
## Supported access tiers
2022

2123
Data Lake Storage Gen2 [access tiers](../storage/blobs/access-tiers-overview.md) include hot, cool, and archive. Only hot and cool can be accessed by indexers.
2224

2325
## Access control
2426

25-
Data Lake Storage Gen2 implements an [access control model](../storage/blobs/data-lake-storage-access-control.md) that supports both Azure role-based access control (Azure RBAC) and POSIX-like access control lists (ACLs). When indexing content from Data Lake Storage Gen2, Azure Cognitive Search will not extract the Azure RBAC and ACL information from the content. As a result, this information will not be included in your Azure Cognitive Search index.
27+
Data Lake Storage Gen2 implements an [access control model](../storage/blobs/data-lake-storage-access-control.md) that supports both Azure role-based access control (Azure RBAC) and POSIX-like access control lists (ACLs). Access control lists are partially supported in Azure Cognitive Search scenarios:
28+
29+
+ Support for access control is enabled on indexer access to content in Data Lake Storage Gen2. For a search service that has a system or user-assigned managed identity, you can define role assignments that determine indexer access to specific files and folders in Azure Storage.
30+
31+
+ Support for document-level permissions on an index is not available. If your access controls vary the level of access on a per-user basis, those permissions cannot be carried forward into a search index on your search service. All users have the same level of access to all searchable and retrievable content in the index.
2632

2733
If maintaining access control on each document in the index is important, it is up to the application developer to implement [security trimming](./search-security-trimming-for-azure-search.md).
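
As a rough illustration of security trimming, the sketch below builds an OData filter string that restricts results to documents tagged with at least one of the caller's group IDs. It assumes the index has a filterable collection field, here given the hypothetical name `group_ids`, that the application populates itself; neither the field nor the helper comes from this commit.

```python
def build_security_filter(group_ids, field="group_ids"):
    """Return an OData filter matching documents whose `field` collection
    contains at least one of the given group IDs.

    `field` is a hypothetical, application-defined index field.
    """
    if not group_ids:
        raise ValueError("at least one group ID is required")
    for gid in group_ids:
        # search.in splits on spaces and commas by default, and a single
        # quote would end the string literal, so reject those characters.
        if any(ch in gid for ch in (",", " ", "'")):
            raise ValueError(f"unsupported character in group ID: {gid!r}")
    joined = ", ".join(group_ids)
    return f"{field}/any(g:search.in(g, '{joined}'))"


if __name__ == "__main__":
    print(build_security_filter(["group_id1", "group_id2"]))
```

The resulting string would be passed as the filter of a query request, so trimming happens at query time rather than at indexing time.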
2834

@@ -34,11 +40,11 @@ The Azure Cognitive Search blob indexer can extract text from the following docu
3440

3541
[!INCLUDE [search-blob-data-sources](../../includes/search-blob-data-sources.md)]
3642

37-
## Getting started with the Azure portal
43+
## Indexing through the Azure portal
3844

3945
The Azure portal supports importing data from Azure Data Lake Storage Gen2. To import data from Data Lake Storage Gen2, navigate to your Azure Cognitive Search service page in the Azure portal, select **Import data**, select **Azure Data Lake Storage Gen2**, then continue to follow the Import data flow to create your data source, skillset, index, and indexer.
4046

41-
## Getting started with the REST API
47+
## Indexing with the REST API
4248

4349
The Data Lake Storage Gen2 indexer is supported by the REST API. Follow the instructions below to set up a data source, index, and indexer.
4450

@@ -93,7 +99,7 @@ The SAS should have the list and read permissions on the container. For more inf
9399
94100
### Step 2 - Create an index
95101

96-
The index specifies the fields in a document, attributes, and other constructs that shape the search experience. All indexers require that you specify a search index definition as the destination. The following example creates a simple index using the [Create Index (REST API)](/rest/api/searchservice/create-index).
102+
The index specifies the fields in a document, attributes, and other constructs that shape the search experience. All indexers require that you specify a search index definition as the destination. The following example uses the [Create Index (REST API)](/rest/api/searchservice/create-index).
97103

98104
```http
99105
POST https://[service name].search.windows.net/indexes?api-version=2020-06-30
@@ -117,7 +123,7 @@ You could also add fields for any blob metadata that you want in the index. The
117123

118124
### Step 3 - Configure and run the indexer
119125

120-
Once the index and data source have been created, you're ready to create the indexer:
126+
Once the index and data source have been created, you're ready to [create the indexer](/rest/api/searchservice/create-indexer):
121127

122128
```http
123129
POST https://[service name].search.windows.net/indexers?api-version=2020-06-30
@@ -134,11 +140,7 @@ Once the index and data source have been created, you're ready to create the ind
134140
}
135141
```
136142

137-
This indexer runs every two hours (schedule interval is set to "PT2H"). To run an indexer every 30 minutes, set the interval to "PT30M". The shortest supported interval is 5 minutes. The schedule is optional - if omitted, an indexer runs only once when it's created. However, you can run an indexer on-demand at any time.
138-
139-
For more details on the Create Indexer API, check out [Create Indexer](/rest/api/searchservice/create-indexer).
140-
141-
For more information about defining indexer schedules, see [How to schedule indexers for Azure Cognitive Search](search-howto-schedule-indexers.md).
143+
This indexer runs immediately, and then [on a schedule](search-howto-schedule-indexers.md) every two hours (schedule interval is set to "PT2H"). To run an indexer every 30 minutes, set the interval to "PT30M". The shortest supported interval is 5 minutes. The schedule is optional - if omitted, an indexer runs only once when it's created. However, you can run an indexer on-demand at any time.
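
The scheduling rules above can be sketched in Python as a small helper that assembles an indexer definition and rejects intervals shorter than five minutes. This is a minimal sketch: the function names are hypothetical, the parser handles only the simple `PT#H#M` forms quoted in the article, and the request body is limited to the name, data source, target index, and schedule properties.

```python
import re

MIN_INTERVAL_MINUTES = 5  # shortest interval the article says is supported

# Accepts only simple durations such as "PT2H", "PT30M", or "PT1H30M".
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?$")


def interval_minutes(interval):
    """Convert a simple ISO 8601 duration to minutes."""
    match = _DURATION.match(interval)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported interval format: {interval!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


def indexer_definition(name, data_source, target_index, interval=None):
    """Build a request body for creating an indexer.

    With no interval, the schedule is omitted and the indexer runs once
    on creation (it can still be run on demand).
    """
    body = {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": target_index,
    }
    if interval is not None:
        if interval_minutes(interval) < MIN_INTERVAL_MINUTES:
            raise ValueError("interval must be at least 5 minutes")
        body["schedule"] = {"interval": interval}
    return body


if __name__ == "__main__":
    print(indexer_definition("adlsgen2-indexer", "adlsgen2-ds", "my-index", "PT2H"))
```

Validating the interval before sending the request surfaces a bad schedule locally instead of as a service error.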
142144

143145
<a name="DocumentKeys"></a>
144146

@@ -265,7 +267,11 @@ It's important to point out that you don't need to define fields for all of the
265267

266268
## How to control which blobs are indexed
267269

268-
You can control which blobs are indexed, and which are skipped, by the blob's file type or by setting properties on the blob themselves, causing the indexer to skip over them.
270+
You can control which blobs are indexed, and which are skipped, by setting role assignments, by the blob's file type, or by setting properties on the blobs themselves, causing the indexer to skip over them.
271+
272+
### Use access controls and role assignments
273+
274+
Indexers that run under a system or user-assigned managed identity can have membership in either a Reader or Storage Blob Data Reader role that grants read permissions on specific files and folders.
269275

270276
### Include specific file extensions
271277

@@ -381,6 +387,7 @@ You can also set [blob configuration properties](/rest/api/searchservice/create-
381387

382388
## See also
383389

390+
+ [C# Sample: Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md)
384391
+ [Indexers in Azure Cognitive Search](search-indexer-overview.md)
385392
+ [Create an indexer](search-howto-create-indexers.md)
386393
+ [AI enrichment over blobs overview](search-blob-ai-integration.md)

articles/search/search-howto-managed-identities-storage.md

Lines changed: 34 additions & 19 deletions
@@ -7,7 +7,7 @@ author: markheff
77
ms.author: maheff
88
ms.service: cognitive-search
99
ms.topic: conceptual
10-
ms.date: 07/02/2021
10+
ms.date: 10/01/2021
1111
---
1212

1313
# Set up a connection to an Azure Storage account using a managed identity
@@ -16,15 +16,18 @@ This page describes how to set up an indexer connection to an Azure storage acco
1616

1717
You can use a system-assigned managed identity or a user-assigned managed identity (preview).
1818

19-
Before learning more about this feature, it is recommended that you have an understanding of what an indexer is and how to set up an indexer for your data source. More information can be found at the following links:
19+
This article assumes familiarity with indexer concepts and configuration. If you're new to indexers, start with these links:
20+
2021
* [Indexer overview](search-indexer-overview.md)
2122
* [Azure Blob indexer](search-howto-indexing-azure-blob-storage.md)
2223
* [Azure Data Lake Storage Gen2 indexer](search-howto-index-azure-data-lake-storage.md)
2324
* [Azure Table indexer](search-howto-indexing-azure-tables.md)
2425

2526
## 1 - Set up a managed identity
2627

27-
Set up the [managed identity](../active-directory/managed-identities-azure-resources/overview.md) using one of the following options.
28+
Set up the [managed identity](../active-directory/managed-identities-azure-resources/overview.md) for an Azure Cognitive Search service using one of the following options.
29+
30+
The search service must be Basic tier or above.
2831

2932
### Option 1 - Turn on system-assigned managed identity
3033

@@ -35,26 +38,32 @@ When a system-assigned managed identity is enabled, Azure creates an identity fo
3538
After selecting **Save** you will see an Object ID that has been assigned to your search service.
3639

3740
![Object ID](./media/search-managed-identities/system-assigned-identity-object-id.png "Object ID")
38-
41+
3942
### Option 2 - Assign a user-assigned managed identity to the search service (preview)
4043

4144
If you don't already have a user-assigned managed identity created, you'll need to create one. A user-assigned managed identity is a resource on Azure.
4245

4346
1. Sign into the [Azure portal](https://portal.azure.com/).
47+
4448
1. Select **+ Create a resource**.
49+
4550
1. In the "Search services and marketplace" search bar, search for "User Assigned Managed Identity" and then select **Create**.
51+
4652
1. Give the identity a descriptive name.
4753

4854
Next, assign the user-assigned managed identity to the search service. This can be done using the [2021-04-01-preview management API](/rest/api/searchmanagement/2021-04-01-preview/services/create-or-update).
4955

5056
The identity property takes a type and one or more fully-qualified user-assigned identities:
5157

52-
* **type** is the type of identity. Valid values are "SystemAssigned", "UserAssigned", or "SystemAssigned, UserAssigned" if you want to use both. A value of "None" will clear any previously assigned identities from the search service.
53-
* **userAssignedIdentities** includes the details of the user assigned managed identity.
54-
* User-assigned managed identity format:
55-
* /subscriptions/**subscription ID**/resourcegroups/**resource group name**/providers/Microsoft.ManagedIdentity/userAssignedIdentities/**name of managed identity**
58+
* **type** is the type of identity. Valid values are "SystemAssigned", "UserAssigned", or "SystemAssigned, UserAssigned" for both. A value of "None" will clear any previously assigned identities from the search service.
59+
60+
* **userAssignedIdentities** includes the details of the user assigned managed identity. The format is:
61+
62+
```bash
63+
/subscriptions/<your-subscription-ID>/resourcegroups/<your-resource-group-name>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<your-managed-identity-name>
64+
```
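
The identity property described above can be assembled programmatically. The following Python sketch builds the resource ID in the format shown and a matching `identity` object; the helper names are hypothetical, and the shape of `userAssignedIdentities` (resource IDs as keys with empty objects as values) is an assumption to verify against the management API reference.

```python
def user_assigned_identity_id(subscription_id, resource_group, identity_name):
    """Return the fully qualified resource ID of a user-assigned identity."""
    parts = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "identity_name": identity_name,
    }
    for label, value in parts.items():
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/providers/Microsoft.ManagedIdentity"
        f"/userAssignedIdentities/{identity_name}"
    )


def identity_property(identity_ids=(), system_assigned=False):
    """Build the `identity` object for a search service definition.

    With no identities requested, the type is "None", which clears any
    previously assigned identities.
    """
    kinds = []
    if system_assigned:
        kinds.append("SystemAssigned")
    if identity_ids:
        kinds.append("UserAssigned")
    if not kinds:
        return {"type": "None"}
    prop = {"type": ", ".join(kinds)}
    if identity_ids:
        # Assumed shape: each resource ID maps to an empty object.
        prop["userAssignedIdentities"] = {rid: {} for rid in identity_ids}
    return prop


if __name__ == "__main__":
    rid = user_assigned_identity_id("0000-sub", "my-rg", "my-identity")
    print(identity_property([rid], system_assigned=True))
```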
5665

57-
Example of how to assign a user-assigned managed identity to a search service:
66+
Example of a user-assigned managed identity assignment:
5867

5968
```http
6069
PUT https://management.azure.com/subscriptions/[subscription ID]/resourceGroups/[resource group name]/providers/Microsoft.Search/searchServices/[search service name]?api-version=2021-04-01-preview
@@ -81,20 +90,25 @@ Content-Type: application/json
8190

8291
## 2 - Add a role assignment
8392

84-
In this step you will either give your Azure Cognitive Search service or user-assigned managed identity permission to read data from your storage account.
93+
In this step, you will either give your Azure Cognitive Search service or user-assigned managed identity permission to read data from your storage account.
8594

8695
1. In the Azure portal, navigate to the Storage account that contains the data that you would like to index.
96+
8797
2. Select **Access control (IAM)**
98+
8899
3. Select **Add** then **Add role assignment**
89100

90101
![Add role assignment](./media/search-managed-identities/add-role-assignment-storage.png "Add role assignment")
91102

92103
4. Select the appropriate role(s) based on the storage account type that you would like to index:
93-
1. Azure Blob Storage requires that you add your search service to the **Storage Blob Data Reader** role.
94-
1. Azure Data Lake Storage Gen2 requires that you add your search service to the **Storage Blob Data Reader** role.
95-
1. Azure Table Storage requires that you add your search service to the **Reader and Data Access** role.
96-
5. Leave **Assign access to** as **Azure AD user, group or service principal**
97-
6. If you're using a system-assigned managed identity, search for your search service, then select it. If you're using a user-assigned managed identity, search for the name of the user-assigned managed identity, then select it. Select **Save**.
104+
105+
* Azure Blob Storage requires that you add your search service to the **Storage Blob Data Reader** role.
106+
* Azure Data Lake Storage Gen2 requires that you add your search service to the **Storage Blob Data Reader** role.
107+
* Azure Table Storage requires that you add your search service to the **Reader and Data Access** role.
108+
109+
5. Leave **Assign access to** as **Azure AD user, group or service principal**
110+
111+
6. If you're using a system-assigned managed identity, search for your search service, then select it. If you're using a user-assigned managed identity, search for the name of the user-assigned managed identity, then select it. Select **Save**.
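
The portal steps above can also be scripted. The sketch below maps each storage type to the role named in step 4 and builds, without running it, an argument list for the Azure CLI `az role assignment create` command. The CLI route is not part of this article, and the function names and storage-type keys are hypothetical; treat it as one possible alternative to the portal.

```python
# Roles required per storage type, as listed in step 4.
ROLE_BY_STORAGE_TYPE = {
    "blob": "Storage Blob Data Reader",
    "adlsgen2": "Storage Blob Data Reader",
    "table": "Reader and Data Access",
}


def storage_account_scope(subscription_id, resource_group, account_name):
    """Return the resource ID of a storage account, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_command(storage_type, principal_object_id, scope):
    """Build (but do not execute) an Azure CLI role assignment command.

    `principal_object_id` is the Object ID of the search service's
    system-assigned identity, or of the user-assigned managed identity.
    """
    try:
        role = ROLE_BY_STORAGE_TYPE[storage_type]
    except KeyError:
        raise ValueError(f"unknown storage type: {storage_type!r}") from None
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_object_id,
        "--role", role,
        "--scope", scope,
    ]


if __name__ == "__main__":
    scope = storage_account_scope("0000-sub", "my-rg", "mystorageaccount")
    print(" ".join(role_assignment_command("adlsgen2", "1111-object-id", scope)))
```

Returning the arguments as a list lets the caller review the command before passing it to something like `subprocess.run`.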
98112

99113
Example for Azure Blob Storage and Azure Data Lake Storage Gen2 using a system-assigned managed identity:
100114

@@ -104,6 +118,8 @@ In this step you will either give your Azure Cognitive Search service or user-as
104118

105119
![Add reader and data access role assignment](./media/search-managed-identities/add-role-assignment-reader-and-data-access.png "Add reader and data access role assignment")
106120

121+
For code examples in C#, see [Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md) on GitHub.
122+
107123
## 3 - Create the data source
108124

109125
Create the data source and provide either a system-assigned managed identity or a user-assigned managed identity (preview). Note that you are no longer using the Management REST API in the below steps.
@@ -237,14 +253,13 @@ For more details on the Create Indexer API, check out [Create Indexer](/rest/api
237253

238254
For more information about defining indexer schedules see [How to schedule indexers for Azure Cognitive Search](search-howto-schedule-indexers.md).
239255

240-
## Accessing secure data in storage accounts
256+
## Accessing network secured data in storage accounts
241257

242258
Azure storage accounts can be further secured using firewalls and virtual networks. If you want to index content from a blob storage account or Data Lake Gen2 storage account that is secured using a firewall or virtual network, follow the instructions for [Accessing data in storage accounts securely via trusted service exception](search-indexer-howto-access-trusted-service-exception.md).
243259

244260
## See also
245261

246-
Learn more about Azure Storage indexers:
247-
248262
* [Azure Blob indexer](search-howto-indexing-azure-blob-storage.md)
249263
* [Azure Data Lake Storage Gen2 indexer](search-howto-index-azure-data-lake-storage.md)
250-
* [Azure Table indexer](search-howto-indexing-azure-tables.md)
264+
* [Azure Table indexer](search-howto-indexing-azure-tables.md)
265+
* [C# Example: Index Data Lake Gen2 using Azure AD (GitHub)](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md)
