articles/hdinsight/domain-joined/domain-joined-authentication-issues.md
Lines changed: 8 additions & 8 deletions
@@ -3,16 +3,16 @@ title: Authentication issues in Azure HDInsight
 description: Authentication issues in Azure HDInsight
 ms.service: hdinsight
 ms.topic: troubleshooting
-ms.date: 05/09/2024
+ms.date: 07/09/2024
 ---
 
 # Authentication issues in Azure HDInsight
 
 This article describes troubleshooting steps and possible resolutions for issues when interacting with Azure HDInsight clusters.
 
-On secure clusters backed by Azure Data Lake (Gen1 or Gen2), when domain users sign in to the cluster services through HDI Gateway (like signing in to the Apache Ambari portal), HDI Gateway tries to obtain an OAuth token from Microsoft Entra first, and then get a Kerberos ticket from Microsoft Entra Domain Services. Authentication can fail in either of these stages. This article is aimed at debugging some of those issues.
+On secure clusters backed by Azure Data Lake Gen2, when domain users sign in to the cluster services through HDI Gateway (like signing in to the Apache Ambari portal), HDI Gateway tries to obtain an OAuth token from Microsoft Entra first, and then get a Kerberos ticket from Microsoft Entra Domain Services. Authentication can fail in either of these stages. This article is aimed at debugging some of those issues.
 
-When the authentication fails, you gets prompted for credentials. If you cancel this dialog, the error message is printed. Here are some of the common error messages:
+When the authentication fails, you get prompted for credentials. If you cancel this dialog, the error message is printed. Here are some of the common error messages:
 
 ## invalid_grant or unauthorized_client, 50126
 
@@ -118,7 +118,7 @@ Sign in denied.
 
 ### Cause
 
-To get to this stage, your OAuth authentication isn't an issue, but Kerberos authentication is. If this cluster is backed by ADLS, OAuth signin has succeeded before Kerberos auth is attempted. On WASB clusters, OAuth signin isn't attempted. There could be many reasons for Kerberos failure - like password hashes are out of sync, user account locked out in Microsoft Entra Domain Services, and so on. Password hashes sync only when the user changes password. When you create the Microsoft Entra Domain Services instance, it will start syncing passwords that are changed after the creation. It can't retroactively sync passwords that were set before its inception.
+To get to this stage, your OAuth authentication isn't an issue, but Kerberos authentication is. If this cluster backed by ADLS, OAuth sign-in succeeded before Kerberos auth is attempted. On WASB clusters, OAuth sign-in isn't attempted. There could be many reasons for Kerberos failure - like password hashes are out of sync, user account locked out in Microsoft Entra Domain Services, and so on. Password hashes sync only when the user changes password. When you create the Microsoft Entra Domain Services instance, it will start syncing passwords that are changed after the creation. It can't retroactively sync passwords that were set before its inception.
 
 ### Resolution
 
@@ -128,7 +128,7 @@ Try to SSH into a You need to try to authenticate (kinit) using the same user cr
 
 ---
 
-## kinit fails
+## Kinit fails
 
 ### Issue
 
@@ -154,7 +154,7 @@ Ways to find `sAMAccountName`:
-This error occurs intermittently when users try to access the ADLS Gen2 using ACLs and the Kerberos token has expired.
+This error occurs intermittently when users try to access the ADLS Gen2 using ACLs and the Kerberos token expired.
 
 ### Resolution
 
 * For Azure Data Lake Storage Gen1, clean browser cache and log into Ambari again.
 
-* For Azure Data Lake Storage Gen2, Run `/usr/lib/hdinsight-common/scripts/RegisterKerbTicketAndOAuth.sh <upn>`for the user the user is trying to login as
+* For Azure Data Lake Storage Gen2, Run `/usr/lib/hdinsight-common/scripts/RegisterKerbTicketAndOAuth.sh <upn>` user is trying to log in as
articles/hdinsight/domain-joined/hdinsight-security-overview.md
Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ title: Overview of enterprise security in Azure HDInsight
 description: Learn the various methods to ensure enterprise security in Azure HDInsight.
 ms.service: hdinsight
 ms.topic: overview
-ms.date: 06/15/2024
+ms.date: 07/23/2024
 #Customer intent: As a user of Azure HDInsight, I want to learn the means that Azure HDInsight offers to ensure security for the enterprise.
 ---
 
@@ -67,7 +67,7 @@ The following table provides links to resources for each type of security soluti
 
 | Security area | Solutions available | Responsible party |
 |---|---|---|
-| Data Access Security | Configure [access control lists ACLs](../../storage/blobs/data-lake-storage-access-control.md) for Azure Data Lake Storage Gen1 and Gen2 | Customer |
+| Data Access Security | Configure [access control lists ACLs](../../storage/blobs/data-lake-storage-access-control.md) for Azure Data Lake Storage Gen2 | Customer |
 || Enable the ["Secure transfer required"](../../storage/common/storage-require-secure-transfer.md) property on storage accounts. | Customer |
 || Configure [Azure virtual network service endpoints](../../virtual-network/virtual-network-service-endpoints-overview.md) for Azure Cosmos DB and [Azure SQL DB](/azure/azure-sql/database/vnet-service-endpoint-rule-overview)| Customer |
articles/hdinsight/hadoop/apache-hadoop-introduction.md
Lines changed: 2 additions & 2 deletions
@@ -4,15 +4,15 @@ description: An introduction to HDInsight, and the Apache Hadoop technology stac
 ms.service: hdinsight
 ms.topic: overview
 ms.custom: hdinsightactive, mvc
-ms.date: 05/09/2024
+ms.date: 07/23/2024
 #Customer intent: As a data analyst, I want understand what is Hadoop and how it is offered in Azure HDInsight so that I can decide on using HDInsight instead of on premises clusters.
 ---
 
 # What is Apache Hadoop in Azure HDInsight?
 
 [Apache Hadoop](https://hadoop.apache.org/) was the original open-source framework for distributed processing and analysis of big data sets on clusters. The Hadoop ecosystem includes related software and utilities, including Apache Hive, Apache HBase, Spark, Kafka, and many others.
 
-Azure HDInsight is a fully managed, full-spectrum, open-source analytics service in the cloud for enterprises. The Apache Hadoop cluster type in Azure HDInsight allows you to use the [Apache Hadoop Distributed File System (HDFS)](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html), [Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html) resource management, and a simple [MapReduce](https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) programming model to process and analyze batch data in parallel. Hadoop clusters in HDInsight are compatible with [Azure Blob storage](../../storage/common/storage-introduction.md), [Azure Data Lake Storage Gen1](../../data-lake-store/data-lake-store-overview.md), or [Azure Data Lake Storage Gen2](../../storage/blobs/data-lake-storage-introduction.md).
+Azure HDInsight is a fully managed, full-spectrum, open-source analytics service in the cloud for enterprises. The Apache Hadoop cluster type in Azure HDInsight allows you to use the [Apache Hadoop Distributed File System (HDFS)](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html), [Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html) resource management, and a simple [MapReduce](https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) programming model to process and analyze batch data in parallel. Hadoop clusters in HDInsight are compatible with [Azure Data Lake Storage Gen2](../../storage/blobs/data-lake-storage-introduction.md).
 
 To see available Hadoop technology stack components on HDInsight, see [Components and versions available with HDInsight](../hdinsight-component-versioning.md). To read more about Hadoop in HDInsight, see the [Azure features page for HDInsight](https://azure.microsoft.com/services/hdinsight/).
articles/hdinsight/hadoop/apache-hadoop-linux-create-cluster-get-started-portal.md
Lines changed: 5 additions & 5 deletions
@@ -41,9 +41,9 @@ In this section, you create a Hadoop cluster in HDInsight using the Azure portal
 |Region | From the drop-down list, select a region where the cluster is created. Choose a location closer to you for better performance. |
 |Cluster type| Select **Select cluster type**. Then select **Hadoop** as the cluster type.|
 |Version|From the drop-down list, select a **version**. Use the default version if you don't know what to choose.|
-|Cluster login username and password | The default login name is **admin**. The password must be at least 10 characters in length and must contain at least one digit, one uppercase, and one lower case letter, one non-alphanumeric character (except characters ```' ` "```). Make sure you **do not provide** common passwords such as "Pass@word1".|
+|Cluster sign in username and password | The default sign in name is **admin**. The password must be at least 10 characters in length and must contain at least one digit, one uppercase, and one lower case letter, one nonalphanumeric character (except characters ```' ` "```). Make sure you **do not provide** common passwords such as "Pass@word1".|
 |Secure Shell (SSH) username | The default username is `sshuser`. You can provide another name for the SSH username. |
-|Use cluster login password for SSH| Select this check box to use the same password for SSH user as the one you provided for the cluster login user.|
+|Use cluster sign in password for SSH| Select this check box to use the same password for SSH user as the one you provided for the cluster sign in user.|
 
 :::image type="content" source="./media/apache-hadoop-linux-create-cluster-get-started-portal/azure-portal-cluster-basics.png" alt-text="HDInsight Linux get started provide cluster basic values." border="true":::
 
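As a rough illustration of the cluster password rule quoted in the table above, the following Python sketch checks a candidate password before it is entered in the portal. The function name `meets_password_rule` and the `EXCLUDED` set are hypothetical and not part of any HDInsight tooling. The sketch assumes that the three excluded characters simply do not count toward the nonalphanumeric requirement, and that "nonalphanumeric" means ASCII punctuation. It does not check for common passwords.

```python
import string

# Assumption: these three characters do not count toward the
# nonalphanumeric requirement (the table lists them as exceptions).
EXCLUDED = set("'`\"")


def meets_password_rule(password: str) -> bool:
    """Return True if the password satisfies the complexity rule in the table.

    Hypothetical helper for illustration only. It checks length and
    character classes, not whether the password is a commonly used one.
    """
    has_digit = any(c.isdigit() for c in password)
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    # Assumption: "nonalphanumeric" is read as ASCII punctuation.
    has_symbol = any(
        c in string.punctuation and c not in EXCLUDED for c in password
    )
    return (
        len(password) >= 10
        and has_digit
        and has_upper
        and has_lower
        and has_symbol
    )
```

Under these assumptions, `meets_password_rule("Pass@word1")` returns `True`: that password satisfies the complexity rule, so rejecting common passwords needs a separate check.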
@@ -60,7 +60,7 @@ In this section, you create a Hadoop cluster in HDInsight using the Azure portal
 
 :::image type="content" source="./media/apache-hadoop-linux-create-cluster-get-started-portal/azure-portal-cluster-storage.png" alt-text="HDInsight Linux get started provide cluster storage values." border="true":::
 
-Each cluster has an [Azure Storage account](../hdinsight-hadoop-use-blob-storage.md), an [Azure Data Lake Gen1](../hdinsight-hadoop-use-data-lake-storage-gen1.md), or an [`Azure Data Lake Storage Gen2`](../hdinsight-hadoop-use-data-lake-storage-gen2.md) dependency. It's referred as the default storage account. HDInsight cluster and its default storage account must be colocated in the same Azure region. Deleting clusters doesn't delete the storage account.
+Each cluster has an [Azure Storage account](../hdinsight-hadoop-use-blob-storage.md), or an [`Azure Data Lake Storage Gen2`](../hdinsight-hadoop-use-data-lake-storage-gen2.md) dependency. It's referred as the default storage account. HDInsight cluster and its default storage account must be colocated in the same Azure region. Deleting clusters doesn't delete the storage account.
 
 Select the **Review + create** tab.
 
@@ -115,7 +115,7 @@ In this section, you create a Hadoop cluster in HDInsight using the Azure portal
 
 :::image type="content" source="./media/apache-hadoop-linux-create-cluster-get-started-portal/hdinsight-linux-hive-view-save-results.png" alt-text="Save result of Apache Hive query." border="true":::
 
-After you've completed a Hive job, you can [export the results to Azure SQL Database or SQL Server database](apache-hadoop-use-sqoop-mac-linux.md), you can also [visualize the results using Excel](apache-hadoop-connect-excel-power-query.md). For more information about using Hive in HDInsight, see [Use Apache Hive and HiveQL with Apache Hadoop in HDInsight to analyze a sample Apache log4j file](hdinsight-use-hive.md).
+After you've completed a Hive job, you can [export the results to Azure SQL Database or SQL Server database](apache-hadoop-use-sqoop-mac-linux.md), you can also [visualize the results using Excel](apache-hadoop-connect-excel-power-query.md). For more information about using Hive in HDInsight, see [Use Apache Hive and HiveQL with Apache Hadoop in HDInsight to analyze a sample Apache Log4j file](hdinsight-use-hive.md).
 
 ## Clean up resources
 
@@ -130,7 +130,7 @@ After you complete the quickstart, you may want to delete the cluster. With HDIn
-2. If you want to delete the cluster as well as the default storage account, select the resource group name (highlighted in the previous screenshot) to open the resource group page.
+2. If you want to delete the cluster and the default storage account, select the resource group name (highlighted in the previous screenshot) to open the resource group page.
 
 3. Select **Delete resource group** to delete the resource group, which contains the cluster and the default storage account. Note deleting the resource group deletes the storage account. If you want to keep the storage account, choose to delete the cluster only.
articles/hdinsight/hadoop/apache-hadoop-linux-tutorial-get-started.md
Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ Two Azure resources are defined in the template:
 
 ## Review deployed resources
 
-Once the cluster is created, you'll receive a **Deployment succeeded** notification with a **Go to resource** link. Your Resource group page will list your new HDInsight cluster and the default storage associated with the cluster. Each cluster has an [Azure Blob Storage](../hdinsight-hadoop-use-blob-storage.md) account, an [Azure Data Lake Storage Gen1](../hdinsight-hadoop-use-data-lake-storage-gen1.md), or an [`Azure Data Lake Storage Gen2`](../hdinsight-hadoop-use-data-lake-storage-gen2.md) dependency. It's referred as the default storage account. The HDInsight cluster and its default storage account must be colocated in the same Azure region. Deleting clusters doesn't delete the storage account.
+Once the cluster is created, you'll receive a **Deployment succeeded** notification with a **Go to resource** link. Your Resource group page will list your new HDInsight cluster and the default storage associated with the cluster. Each cluster has an [Azure Blob Storage](../hdinsight-hadoop-use-blob-storage.md) account, or an [`Azure Data Lake Storage Gen2`](../hdinsight-hadoop-use-data-lake-storage-gen2.md) dependency. It's referred as the default storage account. The HDInsight cluster and its default storage account must be colocated in the same Azure region. Deleting clusters doesn't delete the storage account.
 
 > [!NOTE]
 > For other cluster creation methods and understanding the properties used in this quickstart, see [Create HDInsight clusters](../hdinsight-hadoop-provision-linux-clusters.md).
articles/hdinsight/hadoop/apache-hadoop-on-premises-migration-best-practices-storage.md
Lines changed: 1 addition & 10 deletions
@@ -4,7 +4,7 @@ description: Learn storage best practices for migrating on-premises Hadoop clust
 ms.service: hdinsight
 ms.topic: how-to
 ms.custom: hdinsightactive
-ms.date: 05/22/2024
+ms.date: 07/24/2024
 ---
 
 # Migrate on-premises Apache Hadoop clusters to Azure HDInsight
@@ -71,15 +71,6 @@ For more information, see the following articles:
 -[Monitor, diagnose, and troubleshoot Microsoft Azure Storage](../../storage/common/storage-monitoring-diagnosing-troubleshooting.md)
 -[Monitor a storage account in the Azure portal](../../storage/common/manage-storage-analytics-logs.md)
 
-### Azure Data Lake Storage Gen1
-
-Azure Data Lake Storage Gen1 implements HDFS and POSIX style access control model. It provides first class integration with Microsoft Entra ID for fine grained access control. There are no limits to the size of data that it can store, or its ability to run massively parallel analytics.
-
-For more information, see the following articles:
-
--[Create HDInsight clusters with Data Lake Storage Gen1 using the Azure portal](../../data-lake-store/data-lake-store-hdinsight-hadoop-use-portal.md)
--[Use Data Lake Storage Gen1 with Azure HDInsight clusters](../hdinsight-hadoop-use-data-lake-storage-gen1.md)
-
 ### Azure Data Lake Storage Gen2
 
 Azure Data Lake Storage Gen2 is the latest storage offering. It unifies the core capabilities from the first generation of Azure Data Lake Storage Gen1 with a Hadoop compatible file system endpoint directly integrated into Azure Blob Storage. This enhancement combines the scale and cost benefits of object storage with the reliability and performance typically associated only with on-premises file systems.