articles/search/samples-dotnet.md
Lines changed: 2 additions & 1 deletion
Original file line number
Diff line number
Diff line change
@@ -8,7 +8,7 @@ author: HeidiSteen
8
8
ms.author: heidist
9
9
ms.service: cognitive-search
10
10
ms.topic: conceptual
11
-
ms.date: 01/27/2021
11
+
ms.date: 10/01/2021
12
12
---
13
13
14
14
# .NET (C#) code samples for Azure Cognitive Search
@@ -62,6 +62,7 @@ The following samples are also published by the Cognitive Search team, but are n
62
62
63
63
| Samples | Description |
64
64
|---------|-------------|
65
+
|[Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md)| Source code demonstrating indexer connections and indexing of Azure Data Lake Gen2 files and folders that are secured through Azure AD and role-based access controls. |
65
66
|[azure-search-power-skills](https://github.com/Azure-Samples/azure-search-power-skills)| Source code for consumable custom skills that you can incorporate in your own solutions. |
66
67
|[Knowledge Mining Solution Accelerator](/samples/azure-samples/azure-search-knowledge-mining/azure-search-knowledge-mining/)| Includes templates, support files, and analytical reports to help you prototype an end-to-end knowledge mining solution. |
67
68
|[Covid-19 Search App repository](https://github.com/liamca/covid19search)| Source code repository for the Cognitive Search based [Covid-19 Search App](https://covid19search.azurewebsites.net/)|
articles/search/search-howto-index-azure-data-lake-storage.md
Lines changed: 19 additions & 12 deletions
@@ -7,7 +7,7 @@ author: markheff
7
7
ms.author: maheff
8
8
ms.service: cognitive-search
9
9
ms.topic: conceptual
10
-
ms.date: 05/17/2021
10
+
ms.date: 10/01/2021
11
11
---
12
12
13
13
# Index data from Azure Data Lake Storage Gen2
@@ -16,13 +16,19 @@ This article shows you how to configure an Azure Data Lake Storage Gen2 indexer
16
16
17
17
Azure Data Lake Storage Gen2 is available through Azure Storage. When setting up an Azure storage account, you have the option to enable [hierarchical namespace](../storage/blobs/data-lake-storage-namespace.md). This allows the collection of content in an account to be organized into a hierarchy of directories and nested subdirectories. By enabling hierarchical namespace, you enable [Azure Data Lake Storage Gen2](../storage/blobs/data-lake-storage-introduction.md).
18
18
19
+
Examples in this article use the portal and REST APIs. For examples in C#, see [Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md) on GitHub.
20
+
19
21
## Supported access tiers
20
22
21
23
Data Lake Storage Gen2 [access tiers](../storage/blobs/access-tiers-overview.md) include hot, cool, and archive. Only hot and cool can be accessed by indexers.
22
24
23
25
## Access control
24
26
25
-
Data Lake Storage Gen2 implements an [access control model](../storage/blobs/data-lake-storage-access-control.md) that supports both Azure role-based access control (Azure RBAC) and POSIX-like access control lists (ACLs). When indexing content from Data Lake Storage Gen2, Azure Cognitive Search will not extract the Azure RBAC and ACL information from the content. As a result, this information will not be included in your Azure Cognitive Search index.
27
+
Data Lake Storage Gen2 implements an [access control model](../storage/blobs/data-lake-storage-access-control.md) that supports both Azure role-based access control (Azure RBAC) and POSIX-like access control lists (ACLs). Access control lists are partially supported in Azure Cognitive Search scenarios:
28
+
29
+
+ Support for access control is enabled on indexer access to content in Data Lake Storage Gen2. For a search service that has a system or user-assigned managed identity, you can define role assignments that determine indexer access to specific files and folders in Azure Storage.
30
+
31
+
+ Support for document-level permissions on an index is not available. If your access controls vary the level of access on a per-user basis, those permissions cannot be carried forward into a search index on your search service. All users have the same level of access to all searchable and retrievable content in the index.
26
32
27
33
If maintaining access control on each document in the index is important, it is up to the application developer to implement [security trimming](./search-security-trimming-for-azure-search.md).
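As a rough illustration of security trimming, the sketch below builds an OData filter that limits results to documents tagged with at least one of the caller's group IDs. It assumes the index has a filterable `Collection(Edm.String)` field, here given the hypothetical name `group_ids`, that your application populates at indexing time. The helper name `build_security_filter` is illustrative and not part of any SDK.

```python
def build_security_filter(group_ids, field="group_ids"):
    """Build an OData $filter that matches documents visible to any given group.

    Assumption: `field` is a filterable Collection(Edm.String) field in the index.
    """
    if not group_ids:
        # Refuse to build a filter that would silently match everything or nothing.
        raise ValueError("at least one group ID is required")
    for gid in group_ids:
        if "," in gid:
            raise ValueError("group IDs must not contain the ',' delimiter")
    # OData string literals escape a single quote by doubling it.
    joined = ",".join(gid.replace("'", "''") for gid in group_ids)
    return f"{field}/any(g: search.in(g, '{joined}', ','))"


print(build_security_filter(["group_a", "group_b"]))
```

The application would pass the resulting string as the `$filter` parameter of each query, after resolving the signed-in user's group memberships itself.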
28
34
@@ -34,11 +40,11 @@ The Azure Cognitive Search blob indexer can extract text from the following docu
The Azure portal supports importing data from Azure Data Lake Storage Gen2. To import data from Data Lake Storage Gen2, navigate to your Azure Cognitive Search service page in the Azure portal, select **Import data**, select **Azure Data Lake Storage Gen2**, then continue to follow the Import data flow to create your data source, skillset, index, and indexer.
40
46
41
-
## Getting started with the REST API
47
+
## Indexing with the REST API
42
48
43
49
The Data Lake Storage Gen2 indexer is supported by the REST API. Follow the instructions below to set up a data source, index, and indexer.
44
50
@@ -93,7 +99,7 @@ The SAS should have the list and read permissions on the container. For more inf
93
99
94
100
### Step 2 - Create an index
95
101
96
-
The index specifies the fields in a document, attributes, and other constructs that shape the search experience. All indexers require that you specify a search index definition as the destination. The following example creates a simple index using the [Create Index (REST API)](/rest/api/searchservice/create-index).
102
+
The index specifies the fields in a document, attributes, and other constructs that shape the search experience. All indexers require that you specify a search index definition as the destination. The following example uses the [Create Index (REST API)](/rest/api/searchservice/create-index).
97
103
98
104
```http
99
105
POST https://[service name].search.windows.net/indexes?api-version=2020-06-30
@@ -117,7 +123,7 @@ You could also add fields for any blob metadata that you want in the index. The
117
123
118
124
### Step 3 - Configure and run the indexer
119
125
120
-
Once the index and data source have been created, you're ready to create the indexer:
126
+
Once the index and data source have been created, you're ready to [create the indexer](/rest/api/searchservice/create-indexer):
121
127
122
128
```http
123
129
POST https://[service name].search.windows.net/indexers?api-version=2020-06-30
@@ -134,11 +140,7 @@ Once the index and data source have been created, you're ready to create the ind
134
140
}
135
141
```
136
142
137
-
This indexer runs every two hours (schedule interval is set to "PT2H"). To run an indexer every 30 minutes, set the interval to "PT30M". The shortest supported interval is 5 minutes. The schedule is optional - if omitted, an indexer runs only once when it's created. However, you can run an indexer on-demand at any time.
138
-
139
-
For more details on the Create Indexer API, check out [Create Indexer](/rest/api/searchservice/create-indexer).
140
-
141
-
For more information about defining indexer schedules, see [How to schedule indexers for Azure Cognitive Search](search-howto-schedule-indexers.md).
143
+
This indexer runs immediately, and then [on a schedule](search-howto-schedule-indexers.md) every two hours (schedule interval is set to "PT2H"). To run an indexer every 30 minutes, set the interval to "PT30M". The shortest supported interval is 5 minutes. The schedule is optional - if omitted, an indexer runs only once when it's created. However, you can run an indexer on-demand at any time.
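The schedule rules above can be sketched in a few lines of Python. This is a minimal, offline illustration that only builds the request body and does not call the service. The helper names are hypothetical, and the duration parser deliberately handles only the simple `PT#H#M` forms used in this article rather than the full ISO 8601 grammar.

```python
import json
import re

MIN_INTERVAL_SECONDS = 5 * 60  # shortest supported interval, per the article

# Only hour/minute durations such as "PT2H", "PT30M", or "PT1H30M" are handled here.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?$")


def interval_seconds(interval):
    """Convert a simple ISO 8601 duration string to seconds."""
    match = _DURATION.match(interval)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported interval: {interval!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60


def indexer_definition(name, data_source, index, interval=None):
    """Build a Create Indexer request body; the schedule is optional."""
    body = {"name": name, "dataSourceName": data_source, "targetIndexName": index}
    if interval is not None:
        if interval_seconds(interval) < MIN_INTERVAL_SECONDS:
            raise ValueError("interval must be at least 5 minutes")
        body["schedule"] = {"interval": interval}
    return body


# Placeholder resource names, for illustration only.
print(json.dumps(indexer_definition("adls-indexer", "adls-datasource", "my-index", "PT2H"), indent=2))
```

Omitting `interval` leaves out the `schedule` property entirely, which corresponds to an indexer that runs once on creation and afterward only on demand.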
142
144
143
145
<a name="DocumentKeys"></a>
144
146
@@ -265,7 +267,11 @@ It's important to point out that you don't need to define fields for all of the
265
267
266
268
## How to control which blobs are indexed
267
269
268
-
You can control which blobs are indexed, and which are skipped, by the blob's file type or by setting properties on the blob themselves, causing the indexer to skip over them.
270
+
You can control which blobs are indexed, and which are skipped, through role assignments, by the blob's file type, or by setting properties on the blobs themselves that cause the indexer to skip them.
271
+
272
+
### Use access controls and role assignments
273
+
274
+
Indexers that run under a system or user-assigned managed identity can have membership in either a Reader or Storage Blob Data Reader role that grants read permissions on specific files and folders.
269
275
270
276
### Include specific file extensions
271
277
@@ -381,6 +387,7 @@ You can also set [blob configuration properties](/rest/api/searchservice/create-
381
387
382
388
## See also
383
389
390
+
+ [C# Sample: Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md)
384
391
+ [Indexers in Azure Cognitive Search](search-indexer-overview.md)
385
392
+ [Create an indexer](search-howto-create-indexers.md)
386
393
+ [AI enrichment over blobs overview](search-blob-ai-integration.md)
articles/search/search-howto-managed-identities-storage.md
Lines changed: 34 additions & 19 deletions
@@ -7,7 +7,7 @@ author: markheff
7
7
ms.author: maheff
8
8
ms.service: cognitive-search
9
9
ms.topic: conceptual
10
-
ms.date: 07/02/2021
10
+
ms.date: 10/01/2021
11
11
---
12
12
13
13
# Set up a connection to an Azure Storage account using a managed identity
@@ -16,15 +16,18 @@ This page describes how to set up an indexer connection to an Azure storage acco
16
16
17
17
You can use a system-assigned managed identity or a user-assigned managed identity (preview).
18
18
19
-
Before learning more about this feature, it is recommended that you have an understanding of what an indexer is and how to set up an indexer for your data source. More information can be found at the following links:
19
+
This article assumes familiarity with indexer concepts and configuration. If you're new to indexers, start with these links:
Set up the [managed identity](../active-directory/managed-identities-azure-resources/overview.md) using one of the following options.
28
+
Set up the [managed identity](../active-directory/managed-identities-azure-resources/overview.md) for an Azure Cognitive Search service using one of the following options.
29
+
30
+
The search service must be Basic tier or above.
28
31
29
32
### Option 1 - Turn on system-assigned managed identity
30
33
@@ -35,26 +38,32 @@ When a system-assigned managed identity is enabled, Azure creates an identity fo
35
38
After selecting **Save** you will see an Object ID that has been assigned to your search service.
### Option 2 - Assign a user-assigned managed identity to the search service (preview)
40
43
41
44
If you don't already have a user-assigned managed identity created, you'll need to create one. A user-assigned managed identity is a resource on Azure.
42
45
43
46
1. Sign in to the [Azure portal](https://portal.azure.com/).
47
+
44
48
1. Select **+ Create a resource**.
49
+
45
50
1. In the "Search services and marketplace" search bar, search for "User Assigned Managed Identity" and then select **Create**.
51
+
46
52
1. Give the identity a descriptive name.
47
53
48
54
Next, assign the user-assigned managed identity to the search service. This can be done using the [2021-04-01-preview management API](/rest/api/searchmanagement/2021-04-01-preview/services/create-or-update).
49
55
50
56
The identity property takes a type and one or more fully-qualified user-assigned identities:
51
57
52
-
* **type** is the type of identity. Valid values are "SystemAssigned", "UserAssigned", or "SystemAssigned, UserAssigned" if you want to use both. A value of "None" will clear any previously assigned identities from the search service.
53
-
* **userAssignedIdentities** includes the details of the user assigned managed identity.
54
-
* User-assigned managed identity format:
55
-
* /subscriptions/**subscription ID**/resourcegroups/**resource group name**/providers/Microsoft.ManagedIdentity/userAssignedIdentities/**name of managed identity**
58
+
* **type** is the type of identity. Valid values are "SystemAssigned", "UserAssigned", or "SystemAssigned, UserAssigned" for both. A value of "None" will clear any previously assigned identities from the search service.
59
+
60
+
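* **userAssignedIdentities** includes the details of the user-assigned managed identity. The format is /subscriptions/**subscription ID**/resourcegroups/**resource group name**/providers/Microsoft.ManagedIdentity/userAssignedIdentities/**name of managed identity**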
Example of how to assign a user-assigned managed identity to a search service:
66
+
Example of a user-assigned managed identity assignment:
58
67
59
68
```http
60
69
PUT https://management.azure.com/subscriptions/[subscription ID]/resourceGroups/[resource group name]/providers/Microsoft.Search/searchServices/[search service name]?api-version=2021-04-01-preview
In this step you will either give your Azure Cognitive Search service or user-assigned managed identity permission to read data from your storage account.
93
+
In this step, you give either your Azure Cognitive Search service or your user-assigned managed identity permission to read data from your storage account.
85
94
86
95
1. In the Azure portal, navigate to the Storage account that contains the data that you would like to index.
96
+
87
97
2. Select **Access control (IAM)**
98
+
88
99
3. Select **Add** then **Add role assignment**
89
100
90
101

91
102
92
103
4. Select the appropriate role(s) based on the storage account type that you would like to index:
93
-
1. Azure Blob Storage requires that you add your search service to the **Storage Blob Data Reader** role.
94
-
1. Azure Data Lake Storage Gen2 requires that you add your search service to the **Storage Blob Data Reader** role.
95
-
1. Azure Table Storage requires that you add your search service to the **Reader and Data Access** role.
96
-
5. Leave **Assign access to** as **Azure AD user, group or service principal**
97
-
6. If you're using a system-assigned managed identity, search for your search service, then select it. If you're using a user-assigned managed identity, search for the name of the user-assigned managed identity, then select it. Select **Save**.
104
+
105
+
* Azure Blob Storage requires that you add your search service to the **Storage Blob Data Reader** role.
106
+
* Azure Data Lake Storage Gen2 requires that you add your search service to the **Storage Blob Data Reader** role.
107
+
* Azure Table Storage requires that you add your search service to the **Reader and Data Access** role.
108
+
109
+
5. Leave **Assign access to** as **Azure AD user, group or service principal**
110
+
111
+
6. If you're using a system-assigned managed identity, search for your search service, then select it. If you're using a user-assigned managed identity, search for the name of the user-assigned managed identity, then select it. Select **Save**.
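The role requirements in step 4 can be captured as a small lookup, which is handy when scripting role assignments for several storage accounts. This is a sketch only. The keys are assumed to be the data source `type` strings used in data source definitions (`azureblob`, `adlsgen2`, `azuretable`), and `required_role` is a hypothetical helper rather than an SDK function.

```python
# Role that the search service (or user-assigned identity) needs, by storage type.
# Assumption: keys mirror the data source "type" values used in data source definitions.
REQUIRED_ROLE = {
    "azureblob": "Storage Blob Data Reader",
    "adlsgen2": "Storage Blob Data Reader",
    "azuretable": "Reader and Data Access",
}


def required_role(data_source_type):
    """Return the Azure role named in this article for the given storage type."""
    try:
        return REQUIRED_ROLE[data_source_type.lower()]
    except KeyError:
        raise ValueError(f"no role documented for {data_source_type!r}") from None


print(required_role("adlsgen2"))
```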
98
112
99
113
Example for Azure Blob Storage and Azure Data Lake Storage Gen2 using a system-assigned managed identity:
100
114
@@ -104,6 +118,8 @@ In this step you will either give your Azure Cognitive Search service or user-as
104
118
105
119

106
120
121
+
For code examples in C#, see [Index Data Lake Gen2 using Azure AD](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md) on GitHub.
122
+
107
123
## 3 - Create the data source
108
124
109
125
Create the data source and provide either a system-assigned managed identity or a user-assigned managed identity (preview). Note that the steps below no longer use the Management REST API.
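As a hedged sketch of what a data source definition might look like with a system-assigned managed identity, the Python below builds the request body offline. It assumes the credentials take the form of a `ResourceId=` connection string that points at the storage account and carries no account key. The helper names and all placeholder values are hypothetical, so check the property names against the Create Data Source reference before relying on them.

```python
import json


def storage_resource_id(subscription_id, resource_group, account_name):
    """Build the Azure resource ID of a storage account."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def data_source_definition(name, source_type, container, resource_id):
    """Build a data source body that carries a resource ID instead of an account key.

    Assumption: the search service resolves "ResourceId=...;" through its
    system-assigned managed identity.
    """
    return {
        "name": name,
        "type": source_type,
        "credentials": {"connectionString": f"ResourceId={resource_id}/;"},
        "container": {"name": container},
    }


# Placeholder values, for illustration only.
resource_id = storage_resource_id("[subscription ID]", "[resource group name]", "[storage account name]")
print(json.dumps(data_source_definition("adls-datasource", "adlsgen2", "my-container", resource_id), indent=2))
```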
@@ -237,14 +253,13 @@ For more details on the Create Indexer API, check out [Create Indexer](/rest/api
237
253
238
254
For more information about defining indexer schedules see [How to schedule indexers for Azure Cognitive Search](search-howto-schedule-indexers.md).
239
255
240
-
## Accessing secure data in storage accounts
256
+
## Accessing network secured data in storage accounts
241
257
242
258
Azure storage accounts can be further secured using firewalls and virtual networks. If you want to index content from a blob storage account or Data Lake Gen2 storage account that is secured using a firewall or virtual network, follow the instructions for [Accessing data in storage accounts securely via trusted service exception](search-indexer-howto-access-trusted-service-exception.md).
* [C# Example: Index Data Lake Gen2 using Azure AD (GitHub)](https://github.com/Azure-Samples/azure-search-dotnet-samples/blob/master/data-lake-gen2-acl-indexing/README.md)