Commit 7d3f3b9

Merge pull request #112687 from dagiro/freshness_c44
freshness_c44
2 parents a77ee81 + f35b506 commit 7d3f3b9

1 file changed: +8 -8 lines changed

articles/hdinsight/hdinsight-hadoop-use-data-lake-storage-gen2.md

Lines changed: 8 additions & 8 deletions
@@ -7,12 +7,12 @@ ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: conceptual
 ms.custom: hdinsightactive
-ms.date: 02/20/2020
+ms.date: 04/24/2020
 ---

 # Use Azure Data Lake Storage Gen2 with Azure HDInsight clusters

-Azure Data Lake Storage Gen2 is a cloud storage service dedicated to big data analytics, built on Azure Blob storage. Data Lake Storage Gen2 combines the capabilities of Azure Blob storage and Azure Data Lake Storage Gen1. The resulting service offers features from Azure Data Lake Storage Gen1, such as file system semantics, directory-level and file-level security, and scalability, along with the low-cost, tiered storage, high availability, and disaster-recovery capabilities from Azure Blob storage.
+Azure Data Lake Storage Gen2 is a cloud storage service dedicated to big data analytics, built on Azure Blob storage. Data Lake Storage Gen2 combines the capabilities of Azure Blob storage and Azure Data Lake Storage Gen1. The resulting service offers features from Azure Data Lake Storage Gen1. These features include: file system semantics, directory-level and file-level security, and adaptability. Along with the low-cost, tiered storage, high availability, and disaster-recovery capabilities from Azure Blob storage.

 ## Data Lake Storage Gen2 availability

@@ -52,7 +52,7 @@ Create an Azure Data Lake Storage Gen2 storage account.
 1. Click **Create**.
 1. On the **Create storage account** screen:
 1. Select the correct subscription and resource group.
-1. Enter a name for your Data Lake Storage Gen2 account. For more information on storage account naming conventions, see [Naming conventions for Azure resources](/azure/azure-resource-manager/management/resource-name-rules#microsoftstorage).
+1. Enter a name for your Data Lake Storage Gen2 account.
 1. Click on the **Advanced** tab.
 1. Click **Enabled** next to **Hierarchical namespace** under **Data Lake Storage Gen2**.
 1. Click **Review + create**.
@@ -102,7 +102,7 @@ You can [download a sample template file](https://github.com/Azure-Samples/hdins
 | `<MANAGEDIDENTITYNAME>` | The name of the managed identity that will be given permissions on your Azure Data Lake Storage Gen2 account. |
 | `<STORAGEACCOUNTNAME>` | The new Azure Data Lake Storage Gen2 account that will be created. |
 | `<CLUSTERNAME>` | The name of your HDInsight cluster. |
-| `<PASSWORD>` | Your chosen password for signing in to the cluster using SSH as well as the Ambari dashboard. |
+| `<PASSWORD>` | Your chosen password for signing in to the cluster using SSH and the Ambari dashboard. |

 The code snippet below does the following initial steps:

@@ -131,7 +131,7 @@ az storage account create --name <STORAGEACCOUNTNAME> \
 --kind StorageV2 --hierarchical-namespace true
 ```

-Next, sign in to the portal. Add the new user-assigned managed identity to the **Storage Blob Data Contributor** role on the storage account, as described in step 3 under [Using the Azure portal](hdinsight-hadoop-use-data-lake-storage-gen2.md).
+Next, sign in to the portal. Add the new user-assigned managed identity to the **Storage Blob Data Contributor** role on the storage account. This step is described in step 3 under [Using the Azure portal](hdinsight-hadoop-use-data-lake-storage-gen2.md).

 After you've assigned the role for the user-assigned managed identity, deploy the template by using the following code snippet.

@@ -160,15 +160,15 @@ For more information about file permissions with ACLs, see [Access control lists

 ### How do I control access to my data in Data Lake Storage Gen2?

-Your HDInsight cluster's ability to access files in Data Lake Storage Gen2 is controlled through managed identities. A managed identity is an identity registered in Azure Active Directory (Azure AD) whose credentials are managed by Azure. With managed identities, you don't need to register service principals in Azure AD or maintain credentials such as certificates.
+Your HDInsight cluster's ability to access files in Data Lake Storage Gen2 is controlled through managed identities. A managed identity is an identity registered in Azure Active Directory (Azure AD) whose credentials are managed by Azure. With managed identities, you don't need to register service principals in Azure AD. Or maintain credentials such as certificates.

-Azure services have two types of managed identities: system-assigned and user-assigned. HDInsight uses user-assigned managed identities to access Data Lake Storage Gen2. A user-assigned managed identity is created as a standalone Azure resource. Through a create process, Azure creates an identity in the Azure AD tenant that's trusted by the subscription in use. After the identity is created, the identity can be assigned to one or more Azure service instances.
+Azure services have two types of managed identities: system-assigned and user-assigned. HDInsight uses user-assigned managed identities to access Data Lake Storage Gen2. A `user-assigned managed identity` is created as a standalone Azure resource. Through a create process, Azure creates an identity in the Azure AD tenant that's trusted by the subscription in use. After the identity is created, the identity can be assigned to one or more Azure service instances.

 The lifecycle of a user-assigned identity is managed separately from the lifecycle of the Azure service instances to which it's assigned. For more information about managed identities, see [How do the managed identities for Azure resources work?](../active-directory/managed-identities-azure-resources/overview.md#how-does-the-managed-identities-for-azure-resources-work).

 ### How do I set permissions for Azure AD users to query data in Data Lake Storage Gen2 by using Hive or other services?

-To set permissions for users to query data, use Azure AD security groups as the assigned principal in ACLs. Don't directly assign file-access permissions to individual users or service principals. When you use Azure AD security groups to control the flow of permissions, you can add and remove users or service principals without reapplying ACLs to an entire directory structure. You only have to add or remove the users from the appropriate Azure AD security group. ACLs aren't inherited, so reapplying ACLs requires updating the ACL on every file and subdirectory.
+To set permissions for users to query data, use Azure AD security groups as the assigned principal in ACLs. Don't directly assign file-access permissions to individual users or service principals. With Azure AD security groups to control the flow of permissions, you can add and remove users or service principals without reapplying ACLs to an entire directory structure. You only have to add or remove the users from the appropriate Azure AD security group. ACLs aren't inherited, so reapplying ACLs requires updating the ACL on every file and subdirectory.

 ## Access files from the cluster

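The last hunk's advice to put Azure AD security groups, not individual users, in ACLs can be illustrated with a toy model. The sketch below is not an Azure API: `Node`, `grant_recursively`, `can_read`, and the sample names (`sales`, `hive-readers`, `alice`, `bob`, `carol`) are all hypothetical. It follows the article's statement that ACLs aren't inherited, so every file and directory carries its own ACL, and it ignores permission bits entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A file or directory that carries its own ACL (no inheritance in this model)."""
    name: str
    acl: set[str] = field(default_factory=set)        # principals allowed to read
    children: list[Node] = field(default_factory=list)


def walk(node: Node):
    """Yield the node and every descendant."""
    yield node
    for child in node.children:
        yield from walk(child)


def grant_recursively(root: Node, principal: str) -> int:
    """Add a principal to the ACL of every node; return the number of ACL updates."""
    updates = 0
    for node in walk(root):
        node.acl.add(principal)
        updates += 1
    return updates


def can_read(node: Node, user: str, groups: dict[str, set[str]]) -> bool:
    """True if the user is listed directly or is a member of a listed group."""
    return user in node.acl or any(user in groups.get(p, set()) for p in node.acl)


# Hypothetical tree: one directory holding three files.
root = Node("sales", children=[Node(f"part-{i}.csv") for i in range(3)])
groups = {"hive-readers": {"alice"}}

# One-time cost: the group is written to every node's ACL.
initial_updates = grant_recursively(root, "hive-readers")

# Onboarding bob is a group-membership change; no ACL is touched.
groups["hive-readers"].add("bob")
bob_ok = all(can_read(n, "bob", groups) for n in walk(root))

# Granting carol directly instead costs one ACL update per node.
carol_updates = grant_recursively(root, "carol")

print(initial_updates, bob_ok, carol_updates)  # prints: 4 True 4
```

In this model the per-user grant scales with the size of the directory tree, while the group-membership change costs nothing at the storage layer, which is the trade-off the paragraph describes.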