articles/hdinsight/domain-joined/apache-domain-joined-configure-using-azure-adds.md
Lines changed: 22 additions & 19 deletions
@@ -1,5 +1,5 @@
1
1
---
2
-
title: Configure a HDInsight cluster with Enterprise Security Package by using Azure ADDS
2
+
title: Configure a HDInsight cluster with Enterprise Security Package by using Azure AD-DS
3
3
description: Learn how to set up and configure a HDInsight Enterprise Security Package cluster by using Azure Active Directory Domain Services
4
4
services: hdinsight
5
5
ms.service: hdinsight
@@ -13,16 +13,16 @@ ms.date: 09/24/2018
13
13
14
14
Enterprise Security Package (ESP) clusters provide multi-user access on Azure HDInsight clusters. HDInsight clusters with ESP are connected to a domain so that domain users can use their domain credentials to authenticate with the clusters and run big data jobs.
15
15
16
-
In this article, you learn how to configure a HDInsight cluster with ESP by using Azure Active Directory Domain Services (Azure ADDS).
16
+
In this article, you learn how to configure a HDInsight cluster with ESP by using Azure Active Directory Domain Services (Azure AD-DS).
17
17
18
-
## Enable Azure ADDS
18
+
## Enable Azure AD-DS
19
19
20
-
Enabling Azure ADDS is a prerequisite before you can create a HDInsight cluster with ESP. For more information, see [Enable Azure Active Directory Domain Services using the Azure portal](../../active-directory-domain-services/active-directory-ds-getting-started.md).
20
+
Enabling Azure AD-DS is a prerequisite before you can create a HDInsight cluster with ESP. For more information, see [Enable Azure Active Directory Domain Services using the Azure portal](../../active-directory-domain-services/active-directory-ds-getting-started.md).
21
21
22
22
> [!NOTE]
23
-
> Only tenant administrators have the privileges to create an Azure ADDS instance. If you use Azure Data Lake Storage Gen1 as the default storage for HDInsight, make sure that the default Azure AD tenant for Data Lake Storage Gen1 is same as the domain for the HDInsight cluster. Because Hadoop relies on Kerberos and basic authentication, multi-factor authentication needs to be disabled for users who will access the cluster.
23
+
> Only tenant administrators have the privileges to create an Azure AD-DS instance. If you use Azure Data Lake Storage Gen1 as the default storage for HDInsight, make sure that the default Azure AD tenant for Data Lake Storage Gen1 is the same as the domain for the HDInsight cluster. Because Hadoop relies on Kerberos and basic authentication, multi-factor authentication needs to be disabled for users who will access the cluster.
24
24
25
-
After you provision the Azure ADDS instance, create a service account in Azure Active Directory (Azure AD) with the right permissions. If this service account already exists, reset its password and wait until it syncs to Azure ADDS. This reset will result in the creation of the Kerberos password hash, and it might take up to 30 minutes to sync to Azure ADDS.
25
+
After you provision the Azure AD-DS instance, create a service account in Azure Active Directory (Azure AD) with the right permissions. If this service account already exists, reset its password and wait until it syncs to Azure AD-DS. This reset will result in the creation of the Kerberos password hash, and it might take up to 30 minutes to sync to Azure AD-DS.
26
26
27
27
The service account needs the following privileges:
28
28
@@ -32,36 +32,39 @@ The service account needs the following privileges:
32
32
> [!NOTE]
33
33
> Because Apache Zeppelin uses the domain name to authenticate the administrative service account, the service account *must* have the same domain name as its UPN suffix for Apache Zeppelin to function properly.
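The UPN suffix requirement in the note above can be checked before provisioning. The following is a minimal sketch, not part of the HDInsight tooling: the function name `upn_suffix_matches_domain` and the account name `hdiadmin` are hypothetical, and the case-insensitive comparison is an assumption based on DNS names being case-insensitive.

```python
def upn_suffix_matches_domain(upn: str, domain: str) -> bool:
    """Return True when the UPN suffix equals the managed domain name.

    Hypothetical helper for illustration only. It compares the part of the
    UPN after the last '@' with the domain name, ignoring case.
    """
    local_part, separator, suffix = upn.strip().rpartition("@")
    if not separator or not local_part or not suffix:
        raise ValueError(f"Not a valid UPN: {upn!r}")
    return suffix.casefold() == domain.strip().casefold()


if __name__ == "__main__":
    # 'hdiadmin' is a made-up service account name.
    print(upn_suffix_matches_domain("hdiadmin@contoso.onmicrosoft.com",
                                    "contoso.onmicrosoft.com"))
```

A service account whose UPN ends in a different suffix than the domain would fail this check, which is the situation the note warns about for Apache Zeppelin.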
34
34
35
-
To learn more about OUs and how to manage them, see [Create an OU on an Azure ADDS managed domain](../../active-directory-domain-services/active-directory-ds-admin-guide-create-ou.md).
35
+
To learn more about OUs and how to manage them, see [Create an OU on an Azure AD-DS managed domain](../../active-directory-domain-services/active-directory-ds-admin-guide-create-ou.md).
36
36
37
-
Secure LDAP is for an Azure ADDS managed domain. For more information, see [Configure secure LDAP for an Azure ADDS managed domain](../../active-directory-domain-services/active-directory-ds-admin-guide-configure-secure-ldap.md).
37
+
Secure LDAP is for an Azure AD-DS managed domain. For more information, see [Configure secure LDAP for an Azure AD-DS managed domain](../../active-directory-domain-services/active-directory-ds-admin-guide-configure-secure-ldap.md).
38
38
39
39
## Create a HDInsight cluster with ESP
40
40
41
-
The next step is to create the HDInsight cluster by using Azure ADDS and the service account that you created in the previous section.
41
+
The next step is to create the HDInsight cluster with ESP enabled using Azure AD-DS and the service account that you created in the previous section.
42
42
43
-
It's easier to place both the Azure ADDS instance and the HDInsight cluster in the same Azure virtual network. If you choose to put them in different virtual networks, you must peer those virtual networks so that HDInsight VMs have a line of sight to the domain controller for joining the VMs. For more information, see [Virtual network peering](../../virtual-network/virtual-network-peering-overview.md).
43
+
It's easier to place both the Azure AD-DS instance and the HDInsight cluster in the same Azure virtual network. If you choose to put them in different virtual networks, you must peer those virtual networks so that HDInsight VMs have a line of sight to the domain controller for joining the VMs. For more information, see [Virtual network peering](../../virtual-network/virtual-network-peering-overview.md).
44
44
45
-
When you create a HDInsight clusterwith ESP, you must supply the following parameters:
45
+
When you create an HDInsight cluster, you have the option to enable Enterprise Security Package to connect your cluster with Azure AD-DS. ESP is only available in HDI 3.6+ for Spark, Interactive, Hadoop, and HBase cluster types.
46
46
47
-
- **Domain name**: The domain name that's associated with Azure AD DS. An example is contoso.onmicrosoft.com.
47
+

48
48
49
-
- **Domain user name**: The service account in the Azure ADDS DC managed domain that you created in the previous section. An example is [email protected]. This domain user will be the administrator of this HDInsight cluster.
49
+
Once you enable ESP, common misconfigurations related to Azure AD-DS will be automatically detected and validated.
50
50
51
-
- **Domain password**: The password of the service account.
- **Organizational unit**: The distinguished name of the OU that you want to use with the HDInsight cluster. An example is OU=HDInsightOU,DC=contoso,DC=onmicrosoft,DC=com. If this OU does not exist, the HDInsight cluster tries to create the OU by using the privileges that the service account has. For example, if the service account is in the Azure AD DS Administrators group, it has the right permissions to create an OU. Otherwise, you need to create the OU first and give the service account full control over that OU. For more information, see [Create an OU on an Azure AD DS managed domain](../../active-directory-domain-services/active-directory-ds-admin-guide-create-ou.md).
53
+
Early detection saves time by allowing you to fix errors before creating the cluster.
54
54
55
-
> [!IMPORTANT]
56
-
> Include all of the DCs, separated by commas, after the OU (for example, OU=HDInsightOU,DC=contoso,DC=onmicrosoft,DC=com).
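The distinguished-name format described in the removed parameter list (one `DC=` component per domain label, after the `OU=` component) can be derived from the domain name. This is a rough sketch under stated assumptions: `ou_distinguished_name` is a hypothetical helper, and it does not escape special characters that a real distinguished name may require.

```python
def ou_distinguished_name(ou_name: str, domain: str) -> str:
    """Build an OU distinguished name from an OU name and a DNS domain.

    Illustrative only: assumes the OU sits directly under the domain root
    and that neither value contains characters that need DN escaping.
    """
    labels = [label for label in domain.strip().split(".") if label]
    if not ou_name.strip() or not labels:
        raise ValueError("Both an OU name and a domain name are required")
    components = [f"OU={ou_name.strip()}"] + [f"DC={label}" for label in labels]
    return ",".join(components)


if __name__ == "__main__":
    print(ou_distinguished_name("HDInsightOU", "contoso.onmicrosoft.com"))
```

For the domain contoso.onmicrosoft.com and the OU HDInsightOU, this produces the same string as the example in the text.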
When you create a HDInsight cluster with ESP, you must supply the following parameters:
58
+
59
+
- **Cluster admin user**: Choose an admin for your cluster from your list of Active Directory users.
60
+
61
+
- **Cluster access groups**: The security groups whose users you want to sync to the cluster. For example, HiveUsers. To specify multiple user groups, separate them with semicolons (;). The groups must exist in the directory before provisioning. For more information, see [Create a group and add members in Azure Active Directory](../../active-directory/fundamentals/active-directory-groups-create-azure-portal.md). If a group does not exist, an error occurs: "Group HiveUsers not found in the Active Directory."
57
62
58
63
- **LDAPS URL**: An example is ldaps://contoso.onmicrosoft.com:636.
59
64
60
65
> [!IMPORTANT]
61
66
> Enter the complete URL, including "ldaps://" and the port number (:636).
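The parameter rules above (a complete LDAPS URL with scheme and port 636, and a semicolon-separated list of groups that must already exist) lend themselves to a quick pre-flight check. The sketch below uses only the Python standard library; the helper names are hypothetical, and the set of existing groups is assumed to be supplied by the caller rather than queried from the directory.

```python
from urllib.parse import urlsplit


def validate_ldaps_url(url: str) -> str:
    """Check that the URL has the 'ldaps' scheme, a host, and port 636."""
    parts = urlsplit(url.strip())
    if parts.scheme != "ldaps":
        raise ValueError("URL must start with ldaps://")
    if not parts.hostname:
        raise ValueError("URL must include a host name")
    if parts.port != 636:
        raise ValueError("URL must include the port number :636")
    return f"ldaps://{parts.hostname}:{parts.port}"


def parse_access_groups(value: str) -> list[str]:
    """Split a semicolon-separated group list, dropping empty entries."""
    return [name.strip() for name in value.split(";") if name.strip()]


def missing_groups(requested: list[str], existing: set[str]) -> list[str]:
    """Return the requested groups that are absent from the known set."""
    return [name for name in requested if name not in existing]


if __name__ == "__main__":
    print(validate_ldaps_url("ldaps://contoso.onmicrosoft.com:636"))
    groups = parse_access_groups("HiveUsers; SparkUsers")
    # The 'existing' set here is made up for the example.
    print(missing_groups(groups, {"HiveUsers"}))
```

Any group reported as missing by such a check would correspond to the "not found in the Active Directory" error mentioned above, so it is worth creating it before provisioning.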
62
67
63
-
- **Access user group**: The security groups whose users you want to sync to the cluster. For example, HiveUsers. If you want to specify multiple user groups, separate them by semicolon ‘;’. The group(s) must exist in the directory prior to provisioning. For more information, see [Create a group and add members in Azure Active Directory](../../active-directory/fundamentals/active-directory-groups-create-azure-portal.md). If the group does not exist, an error occurs: "Group HiveUsers not found in the Active Directory."
64
-
65
68
The following screenshot shows the configurations in the Azure portal:
66
69
67
70
.
articles/hdinsight/hdinsight-authorize-users-to-ambari.md
Lines changed: 6 additions & 6 deletions
@@ -1,6 +1,6 @@
1
1
---
2
2
title: Authorize users for Ambari Views - Azure HDInsight
3
-
description: 'How to manage Ambari user and group permissions for domain-joined HDInsight clusters.'
3
+
description: 'How to manage Ambari user and group permissions for HDInsight clusters with ESP enabled.'
4
4
services: hdinsight
5
5
author: maxluk
6
6
ms.reviewer: jasonh
@@ -13,14 +13,14 @@ ms.author: maxluk
13
13
---
14
14
# Authorize users for Ambari Views
15
15
16
-
[Domain-joined HDInsight clusters](./domain-joined/apache-domain-joined-introduction.md) provide enterprise-grade capabilities, including Azure Active Directory-based authentication. You can [synchronize new users](hdinsight-sync-aad-users-to-cluster.md) added to Azure AD groups that have been provided access to the cluster, allowing those specific users to perform certain actions. Working with users, groups, and permissions in Ambari is supported for both domain-joined HDInsight cluster and standard HDInsight cluster.
16
+
[Enterprise Security Package (ESP) enabled HDInsight clusters](./domain-joined/apache-domain-joined-introduction.md) provide enterprise-grade capabilities, including Azure Active Directory-based authentication. You can [synchronize new users](hdinsight-sync-aad-users-to-cluster.md) added to Azure AD groups that have been provided access to the cluster, allowing those specific users to perform certain actions. Working with users, groups, and permissions in Ambari is supported for both ESP HDInsight clusters and standard HDInsight clusters.
17
17
18
18
Active Directory users can log on to the cluster nodes using their domain credentials. They can also use their domain credentials to authenticate cluster interactions with other approved endpoints like Hue, Ambari Views, ODBC, JDBC, PowerShell, and REST APIs.
19
19
20
20
> [!WARNING]
21
21
> Do not change the password of the Ambari watchdog (hdinsightwatchdog) on your Linux-based HDInsight cluster. Changing the password breaks the ability to use script actions or perform scaling operations with your cluster.
22
22
23
-
If you have not already done so, follow [these instructions](./domain-joined/apache-domain-joined-configure.md) to provision a new domain-joined cluster.
23
+
If you have not already done so, follow [these instructions](./domain-joined/apache-domain-joined-configure.md) to provision a new ESP cluster.
24
24
25
25
## Access the Ambari management page
26
26
@@ -113,7 +113,7 @@ The List view provides quick editing capabilities in two categories: Users and G
113
113
114
114

115
115
116
-
* The Groups category of the List view displays all groups, and the role assigned to each group. In our example, the list of groups is synchronized from the Azure AD groups specified in the **Access user group** property of the cluster's Domain settings. See [Create a Domain-joined HDInsight cluster](./domain-joined/apache-domain-joined-configure-using-azure-adds.md#create-a-domain-joined-hdinsight-cluster).
116
+
* The Groups category of the List view displays all groups, and the role assigned to each group. In our example, the list of groups is synchronized from the Azure AD groups specified in the **Access user group** property of the cluster's Domain settings. See [Create a HDInsight cluster with ESP enabled](./domain-joined/apache-domain-joined-configure-using-azure-adds.md#create-a-hdinsight-cluster-with-esp).
117
117
118
118

119
119
@@ -133,7 +133,7 @@ We have assigned our Azure AD domain user "hiveuser2" to the *Cluster User* role
133
133
134
134
## Next steps
135
135
136
-
* [Configure Hive policies in Domain-joined HDInsight](./domain-joined/apache-domain-joined-run-hive.md)
description: Compare HDInsight 3.6 to HDInsight 4.0 features, limitations, and upgrade recommendations.
4
+
ms.service: hdinsight
5
+
author: mamccrea
6
+
ms.author: mamccrea
7
+
ms.reviewer: mamccrea
8
+
ms.topic: overview
9
+
ms.date: 09/24/2018
10
+
---
11
+
12
+
# HDInsight 4.0 overview (Preview)
13
+
14
+
Azure HDInsight is one of the most popular services among enterprise customers for open-source Hadoop and Spark analytics on Azure. HDInsight (HDI) 4.0 is a cloud distribution of the Hadoop components from the [Hortonworks Data Platform (HDP) 3.0](https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/release-notes/content/relnotes.html). This article provides information about the most recent Azure HDInsight release and how to upgrade.
15
+
16
+
## What's new in HDI 4.0?
17
+
18
+
### Hive 3.0 and LLAP
19
+
20
+
Hive low-latency analytical processing (LLAP) uses persistent query servers and in-memory caching to deliver quick SQL query results on data in remote cloud storage. Hive LLAP leverages a set of persistent daemons that execute fragments of Hive queries. Query execution on LLAP is similar to Hive without LLAP, with worker tasks running inside LLAP daemons instead of containers.
21
+
22
+
Benefits of Hive LLAP include:
23
+
24
+
* Ability to perform deep SQL analytics, such as complex joins, subqueries, windowing functions, sorting, user-defined functions, and complex aggregations, without sacrificing performance and scalability.
25
+
26
+
* Interactive queries against data in the same storage where data is prepared, eliminating the need to move data from storage to another engine for analytical processing.
27
+
28
+
* Caching query results allows previously computed query results to be reused, which saves time and resources spent running the cluster tasks required for the query.
29
+
30
+
### Hive dynamic materialized views
31
+
32
+
Hive now supports dynamic materialized views, or pre-computation of relevant summaries, used to accelerate query processing in data warehouses. Materialized views can be stored natively in Hive, and can seamlessly use LLAP acceleration.
33
+
34
+
### Hive transactional tables
35
+
36
+
HDI 4.0 includes Apache Hive 3, which requires atomicity, consistency, isolation, and durability (ACID) compliance for transactional tables that reside in the Hive warehouse. ACID-compliant tables and table data are accessed and managed by Hive. Data in create, retrieve, update, and delete (CRUD) tables must be in Optimized Row Column (ORC) file format, but insert-only tables support all file formats.
37
+
38
+
* ACID v2 has performance improvements in both storage format and the execution engine.
39
+
40
+
* ACID is enabled by default to allow full support for data updates.
41
+
42
+
* Improved ACID capabilities allow you to update and delete at row level.
43
+
44
+
* No performance overhead.
45
+
46
+
* No bucketing required.
47
+
48
+
* Spark can read and write to Hive ACID tables via Hive Warehouse Connector.
49
+
50
+
Learn more about [Apache Hive 3](https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/hive-overview/content/hive_whats_new_in_this_release_hive.html).
51
+
52
+
### Apache Spark
53
+
54
+
Apache Spark gets updatable tables and ACID transactions with Hive Warehouse Connector. Hive Warehouse Connector allows you to register Hive transactional tables as external tables in Spark to access full transactional functionality. Previous versions only supported table partition manipulation. Hive Warehouse Connector also supports Streaming DataFrames for streaming reads and writes into transactional and streaming Hive tables from Spark.
55
+
56
+
Spark executors can connect directly to Hive LLAP daemons to retrieve and update data in a transactional manner, allowing Hive to keep control of the data.
57
+
58
+
Apache Spark on HDInsight 4.0 supports the following scenarios:
59
+
60
+
* Run machine learning model training over the same transactional table used for reporting.
61
+
* Use ACID transactions to safely add columns from Spark ML to a Hive table.
62
+
* Run a Spark streaming job on the change feed from a Hive streaming table.
63
+
* Create ORC files directly from a Spark Structured Streaming job.
64
+
65
+
Accessing Hive transactional tables directly from Spark could previously produce inconsistent results, duplicate data, or data corruption. In HDI 4.0, Spark tables and Hive tables are kept in separate Metastores, so this can no longer happen by accident. Use Hive Warehouse Connector to explicitly register Hive transactional tables as Spark external tables.
66
+
67
+
Learn more about [Apache Spark](https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/spark-overview/content/analyzing_data_with_apache_spark.html).
68
+
69
+
70
+
### Oozie
71
+
72
+
Apache Oozie 4.3.1 is included in HDI 4.0 with the following changes:
73
+
74
+
* Oozie no longer runs Hive actions. Hive CLI has been removed and replaced with Beeline.
75
+
76
+
* You can exclude unwanted dependencies from share lib by including an exclude pattern in your **job.properties** file.
77
+
78
+
Learn more about [Apache Oozie](https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/release-notes/content/patch_oozie.html).
79
+
80
+
## How to upgrade to HDI 4.0
81
+
82
+
As with any major release, it's important to thoroughly test your components before implementing the latest version in a production environment. HDI 4.0 is available for you to begin the upgrade process, but HDI 3.6 is the default option to prevent accidental mishaps.
83
+
84
+
There is no supported upgrade path from previous versions of HDI to HDI 4.0. Because Metastore and blob data formats have changed, HDI 4.0 is not compatible with previous versions. It is important that you keep your new HDI 4.0 environment separate from your current production environment. If you deploy HDI 4.0 to your current environment, your Metastore will be upgraded, and that upgrade cannot be reversed.
85
+
86
+
## Limitations
87
+
88
+
* HDI 4.0 does not support MapReduce. Use Tez instead. Learn more about [Apache Tez](https://tez.apache.org/).