articles/hdinsight/hdinsight-hadoop-use-data-lake-storage-gen2.md
8 additions & 8 deletions
@@ -7,12 +7,12 @@ ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: conceptual
 ms.custom: hdinsightactive
-ms.date: 02/20/2020
+ms.date: 04/24/2020
 ---

 # Use Azure Data Lake Storage Gen2 with Azure HDInsight clusters

-Azure Data Lake Storage Gen2 is a cloud storage service dedicated to big data analytics, built on Azure Blob storage. Data Lake Storage Gen2 combines the capabilities of Azure Blob storage and Azure Data Lake Storage Gen1. The resulting service offers features from Azure Data Lake Storage Gen1, such as file system semantics, directory-level and file-level security, and scalability, along with the low-cost, tiered storage, high availability, and disaster-recovery capabilities from Azure Blob storage.
+Azure Data Lake Storage Gen2 is a cloud storage service dedicated to big data analytics, built on Azure Blob storage. Data Lake Storage Gen2 combines the capabilities of Azure Blob storage and Azure Data Lake Storage Gen1. The resulting service offers features from Azure Data Lake Storage Gen1. These features include: file system semantics, directory-level and file-level security, and adaptability. Along with the low-cost, tiered storage, high availability, and disaster-recovery capabilities from Azure Blob storage.

 ## Data Lake Storage Gen2 availability

@@ -52,7 +52,7 @@ Create an Azure Data Lake Storage Gen2 storage account.
 1. Click **Create**.
 1. On the **Create storage account** screen:
 1. Select the correct subscription and resource group.
-1. Enter a name for your Data Lake Storage Gen2 account. For more information on storage account naming conventions, see [Naming conventions for Azure resources](/azure/azure-resource-manager/management/resource-name-rules#microsoftstorage).
+1. Enter a name for your Data Lake Storage Gen2 account.
 1. Click on the **Advanced** tab.
 1. Click **Enabled** next to **Hierarchical namespace** under **Data Lake Storage Gen2**.
 1. Click **Review + create**.
@@ -102,7 +102,7 @@ You can [download a sample template file](https://github.com/Azure-Samples/hdins
 |`<MANAGEDIDENTITYNAME>`| The name of the managed identity that will be given permissions on your Azure Data Lake Storage Gen2 account. |
 |`<STORAGEACCOUNTNAME>`| The new Azure Data Lake Storage Gen2 account that will be created. |
 |`<CLUSTERNAME>`| The name of your HDInsight cluster. |
-|`<PASSWORD>`| Your chosen password for signing in to the cluster using SSH as well as the Ambari dashboard. |
+|`<PASSWORD>`| Your chosen password for signing in to the cluster using SSH and the Ambari dashboard. |

 The code snippet below does the following initial steps:
-Next, sign in to the portal. Add the new user-assigned managed identity to the **Storage Blob Data Contributor** role on the storage account, as described in step 3 under [Using the Azure portal](hdinsight-hadoop-use-data-lake-storage-gen2.md).
+Next, sign in to the portal. Add the new user-assigned managed identity to the **Storage Blob Data Contributor** role on the storage account. This step is described in step 3 under [Using the Azure portal](hdinsight-hadoop-use-data-lake-storage-gen2.md).

 After you've assigned the role for the user-assigned managed identity, deploy the template by using the following code snippet.

@@ -160,15 +160,15 @@ For more information about file permissions with ACLs, see [Access control lists

 ### How do I control access to my data in Data Lake Storage Gen2?

-Your HDInsight cluster's ability to access files in Data Lake Storage Gen2 is controlled through managed identities. A managed identity is an identity registered in Azure Active Directory (Azure AD) whose credentials are managed by Azure. With managed identities, you don't need to register service principals in Azure AD or maintain credentials such as certificates.
+Your HDInsight cluster's ability to access files in Data Lake Storage Gen2 is controlled through managed identities. A managed identity is an identity registered in Azure Active Directory (Azure AD) whose credentials are managed by Azure. With managed identities, you don't need to register service principals in Azure AD. Or maintain credentials such as certificates.

-Azure services have two types of managed identities: system-assigned and user-assigned. HDInsight uses user-assigned managed identities to access Data Lake Storage Gen2. A user-assigned managed identity is created as a standalone Azure resource. Through a create process, Azure creates an identity in the Azure AD tenant that's trusted by the subscription in use. After the identity is created, the identity can be assigned to one or more Azure service instances.
+Azure services have two types of managed identities: system-assigned and user-assigned. HDInsight uses user-assigned managed identities to access Data Lake Storage Gen2. A `user-assigned managed identity` is created as a standalone Azure resource. Through a create process, Azure creates an identity in the Azure AD tenant that's trusted by the subscription in use. After the identity is created, the identity can be assigned to one or more Azure service instances.

 The lifecycle of a user-assigned identity is managed separately from the lifecycle of the Azure service instances to which it's assigned. For more information about managed identities, see [How do the managed identities for Azure resources work?](../active-directory/managed-identities-azure-resources/overview.md#how-does-the-managed-identities-for-azure-resources-work).

 ### How do I set permissions for Azure AD users to query data in Data Lake Storage Gen2 by using Hive or other services?

-To set permissions for users to query data, use Azure AD security groups as the assigned principal in ACLs. Don't directly assign file-access permissions to individual users or service principals. When you use Azure AD security groups to control the flow of permissions, you can add and remove users or service principals without reapplying ACLs to an entire directory structure. You only have to add or remove the users from the appropriate Azure AD security group. ACLs aren't inherited, so reapplying ACLs requires updating the ACL on every file and subdirectory.
+To set permissions for users to query data, use Azure AD security groups as the assigned principal in ACLs. Don't directly assign file-access permissions to individual users or service principals. With Azure AD security groups to control the flow of permissions, you can add and remove users or service principals without reapplying ACLs to an entire directory structure. You only have to add or remove the users from the appropriate Azure AD security group. ACLs aren't inherited, so reapplying ACLs requires updating the ACL on every file and subdirectory.
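
Editorial note, not part of the commit: the last changed paragraph recommends naming a security group, rather than individual users, as the principal in ACLs. The sketch below shows what such an entry could look like in the POSIX-style `scope:type:id:permissions` form. The helper name `group_acl_entries` and the placeholder object ID are hypothetical and do not come from the article. The optional `default:` entry is an assumption about how you might want items created later under a directory to pick up the same entry; it does not change existing files or subdirectories.

```python
import re

# Permission triplet such as "r-x" or "rwx"; each position is a letter or "-".
_PERMS = re.compile(r"^[r-][w-][x-]$")


def group_acl_entries(group_object_id: str, perms: str = "r-x",
                      include_default: bool = True) -> str:
    """Build a comma-separated ACL spec that names a security group.

    group_object_id is the group's object ID (hypothetical placeholder here).
    With include_default=True, a matching "default:" entry is appended.
    """
    if not group_object_id or ":" in group_object_id or "," in group_object_id:
        raise ValueError("group_object_id must be a non-empty identifier")
    if not _PERMS.match(perms):
        raise ValueError("perms must look like 'r-x'")
    entries = [f"group:{group_object_id}:{perms}"]
    if include_default:
        entries.append(f"default:group:{group_object_id}:{perms}")
    return ",".join(entries)


if __name__ == "__main__":
    # Placeholder object ID, for illustration only.
    print(group_acl_entries("00000000-0000-0000-0000-000000000000"))
```

Because the entry names the group, membership changes in Azure AD leave this string, and the ACLs that contain it, untouched.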