articles/hdinsight/hdinsight-faq.yml
Lines changed: 22 additions & 22 deletions
@@ -18,14 +18,14 @@ sections:
  - name: Creating or deleting HDInsight clusters
  questions:
  - question: |
- How do I provision an HDInsight cluster?
+ How do I provision a HDInsight cluster?
  answer: |
  To review the HDInsight clusters types, and the provisioning methods, see [Set up clusters in HDInsight with Apache Hadoop, Apache Spark, Apache Kafka, and more](./hdinsight-hadoop-provision-linux-clusters.md).

  - question: |
  How do I delete an existing HDInsight cluster?
  answer: |
- To learn more about deleting a cluster when it's no longer in use, see [Delete an HDInsight cluster](hdinsight-delete-cluster.md).
+ To learn more about deleting a cluster when it's no longer in use, see [Delete a HDInsight cluster](hdinsight-delete-cluster.md).

  Try to leave at least 30 to 60 minutes between create and delete operations. Otherwise the operation may fail with the following error message:

@@ -39,9 +39,9 @@ sections:
  For more information, see [Capacity planning for HDInsight clusters](./hdinsight-capacity-planning.md).

  - question: |
- What are the various types of nodes in an HDInsight cluster?
+ What are the various types of nodes in a HDInsight cluster?
  answer: |
- See [Resource types in Azure HDInsight clusters](hdinsight-virtual-network-architecture.md#resource-types-in-azure-hdinsight-clusters).
+ See [Resource types in Azure HDInsight clusters](hdinsight-virtual-network-architecture.md#resource-types-in-azure-hdinsight-cluster).

  - question: |
  What are the best practices for creating large HDInsight clusters?
@@ -55,20 +55,20 @@ sections:
  - name: Individual Components
  questions:
  - question: |
- Can I install additional components on my cluster?
+ Can I install more components on my cluster?
  answer: |
- Yes. To install additional components or customize cluster configuration, use:
+ Yes. To install more components or customize cluster configuration, use:

  - Scripts during or after creation. Scripts are invoked via [script action](./hdinsight-hadoop-customize-cluster-linux.md). Script action is a configuration option you can use from the Azure portal, HDInsight Windows PowerShell cmdlets, or the HDInsight .NET SDK. This configuration option can be used from the Azure portal, HDInsight Windows PowerShell cmdlets, or the HDInsight .NET SDK.

  - [HDInsight Application Platform](https://azure.microsoft.com/services/hdinsight/partner-ecosystem/) to install applications.

- For a list of supported components see [What are the Apache Hadoop components and versions available with HDInsight?](./hdinsight-component-versioning.md)
+ For a list of supported components, see [What are the Apache Hadoop components and versions available with HDInsight?](./hdinsight-component-versioning.md)

  - question: |
- Can I upgrade the individual components that are pre-installed on the cluster?
+ Can I upgrade the individual components that are preinstalled on the cluster?
  answer: |
- If you upgrade built-in components or applications that are pre-installed on your cluster, the resulting configuration won't be supported by Microsoft. These system configurations have not been tested by Microsoft. Try to use a different version of the HDInsight cluster that may already have the upgraded version of the component pre-installed.
+ If you upgrade built-in components or applications that are preinstalled on your cluster, the resulting configuration won't be supported by Microsoft. These system configurations haven't been tested by Microsoft. Try to use a different version of the HDInsight cluster that may already have the upgraded version of the component preinstalled.

  For example, upgrading Hive as an individual component isn't supported. HDInsight is a managed service, and many services are integrated with Ambari server and tested. Upgrading a Hive on its own causes the indexed binaries of other components to change, and will cause component integration issues on your cluster.
- 3. In the User Settings window, select the new timezone from the Timezone drop down, and then click Save.
+ 3. In the User Settings window, select the new timezone from the Timezone drop down, and then select Save.

  :::image type="content" source="media/hdinsight-faq/ambari-user-settings.png" alt-text="Ambari User Settings.":::

@@ -110,7 +110,7 @@ sections:
  - question: |
  Does migrating a Hive metastore also migrate the default policies of the Ranger database?
  answer: |
- No, the policy definition is in the Ranger database, so migrating the Ranger database will migrate its policy.
+ No, the policy definition is in the Ranger database, so migrating the Ranger database migrates its policy.

  - question: |
  Can you migrate a Hive metastore from an Enterprise Security Package (ESP) cluster to a non-ESP cluster, and the other way around?
@@ -148,9 +148,9 @@ sections:
  - [HDInsight management IP addresses](./hdinsight-management-ip-addresses.md)

  - question: |
- Can I deploy an additional virtual machine within the same subnet as an HDInsight cluster?
+ Can I deploy more virtual machine within the same subnet as a HDInsight cluster?
  answer: |
- Yes, you can deploy an additional virtual machine within the same subnet as an HDInsight cluster. The following configurations are possible:
+ Yes, you can deploy more virtual machine within the same subnet as a HDInsight cluster. The following configurations are possible:

  - Edge nodes: You can add another edge node to the cluster, as described in [Use empty edge nodes on Apache Hadoop clusters in HDInsight](hdinsight-apps-use-edge-node.md).

@@ -175,7 +175,7 @@ sections:
  For information on malware protection, see [Microsoft Antimalware for Azure Cloud Services and Virtual Machines](../security/fundamentals/antimalware.md).

  - question: |
- How do I create a keytab for an HDInsight ESP cluster?
+ How do I create a keytab for a HDInsight ESP cluster?
  answer: |
  Create a Kerberos keytab for your domain username. You can later use this keytab to authenticate to remote domain-joined clusters without entering a password. The domain name is uppercase:

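Reviewer note: the keytab answer above says the domain name must be uppercase. The hunk cuts off before the command that follows, so it is not reproduced here. As a minimal sketch of only the uppercase requirement, the helper below builds a principal string from a user name and a domain. The function name `make_principal` and the values `alice` and `contoso.com` are placeholders for illustration and do not come from the article.

```shell
#!/bin/sh
# Build a Kerberos principal with an uppercase realm.
# make_principal is a hypothetical helper; "alice" and "contoso.com"
# are placeholder values, not real accounts.
make_principal() {
    user="$1"
    domain="$2"
    # Uppercase the domain so it can be used as the Kerberos realm.
    realm=$(printf '%s' "$domain" | tr '[:lower:]' '[:upper:]')
    printf '%s@%s\n' "$user" "$realm"
}

make_principal "alice" "contoso.com"
```

The resulting string (here `alice@CONTOSO.COM`) is what you would supply wherever the keytab tooling asks for the principal.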
@@ -196,7 +196,7 @@ sections:
  - question: |
  How do I determine the proper SALT value?
  answer: |
- 1. Use an interactive Kerberos login to determine the proper salt value for the keytab. Interactive Kerberos login will use the highest encryption by default. Tracing should be enabled to observe the salt. Below is a sample Kerberos login:
+ 1. Use an interactive Kerberos sign-in to determine the proper salt value for the keytab. Interactive Kerberos sign-in uses the highest encryption by default. Tracing should be enabled to observe the salt. Below is a sample Kerberos sign-in:

  ```shell

@@ -215,9 +215,9 @@ sections:
  ```

  - question: |
- Can I use an existing Microsoft Entra tenant to create an HDInsight cluster that has the ESP?
+ Can I use an existing Microsoft Entra tenant to create a HDInsight cluster that has the ESP?
  answer: |
- Enable Microsoft Entra Domain Services before you can create an HDInsight cluster with ESP. Open-source Hadoop relies on Kerberos for Authentication (as opposed to OAuth).
+ Enable Microsoft Entra Domain Services before you can create a HDInsight cluster with ESP. Open-source Hadoop relies on Kerberos for Authentication (as opposed to OAuth).

  To join VMs to a domain, you must have a domain controller. Microsoft Entra Domain Services is the managed domain controller, and is considered an extension of Microsoft Entra ID. Microsoft Entra Domain Services provides all the Kerberos requirements to build a secure Hadoop cluster in a managed way. HDInsight as a managed service integrates with Microsoft Entra Domain Services to provide security.

@@ -236,7 +236,7 @@ sections:
  No, DAS is not supported on ESP clusters.

  - question: |
- How can I pull login activity shown in Ranger?
+ How can I pull sign-in activity shown in Ranger?
  answer: |
  For auditing requirements, Microsoft recommends enabling Azure Monitor logs as described in [Use Azure Monitor logs to monitor HDInsight clusters](./hdinsight-hadoop-oms-log-analytics-tutorial.md).

@@ -301,7 +301,7 @@ sections:
  To audit blob storage accounts, configure monitoring using the procedure at [Monitor a storage account in the Azure portal](../storage/common/manage-storage-analytics-logs.md). An HDFS-audit log provides only auditing information for the local HDFS filesystem only (hdfs://mycluster). It doesn't include operations that are done on remote storage.

  - question: |
- How can I transfer files between a blob container and an HDInsight head node?
+ How can I transfer files between a blob container and a HDInsight head node?
  answer: |
  Run a script similar to the following shell script on your head node:

@@ -394,12 +394,12 @@ sections:
  ```

  > [!NOTE]
- > Curl prompts you for a password. You must enter a valid password for the cluster login username.
+ > Curl prompts you for a password. You must enter a valid password for the cluster sign-in username.

  - name: Billing
  questions:
  - question: |
- How much does it cost to deploy an HDInsight cluster?
+ How much does it cost to deploy a HDInsight cluster?
  answer: |
  For more information about pricing and FAQ related to billing, see the [Azure HDInsight Pricing](https://azure.microsoft.com/pricing/details/hdinsight/) page.

@@ -422,7 +422,7 @@ sections:
  - name: Hive
  questions:
  - question: |
- Why does the Hive version appear as 1.2.1000 instead of 2.1 in the Ambari UI even though I'm running an HDInsight 3.6 cluster?
+ Why does the Hive version appear as 1.2.1000 instead of 2.1 in the Ambari UI even though I'm running a HDInsight 3.6 cluster?
  answer: |
  Although only 1.2 appears in the Ambari UI, HDInsight 3.6 contains both Hive 1.2 and Hive 2.1.
articles/hdinsight/hdinsight-plan-virtual-network-deployment.md
Lines changed: 11 additions & 9 deletions
@@ -4,7 +4,7 @@ description: Learn how to plan an Azure Virtual Network deployment to connect HD
  ms.service: azure-hdinsight
  ms.topic: conceptual
  ms.custom: hdinsightactive
- ms.date: 09/06/2024
+ ms.date: 09/19/2024
  ---

  # Plan a virtual network for Azure HDInsight
@@ -26,7 +26,7 @@ The following are the questions that you must answer when planning to install HD

  * Do you need to install HDInsight into an existing virtual network? Or are you creating a new network?

- If you're using an existing virtual network, you may need to modify the network configuration before you can install HDInsight. For more information, see the [add HDInsight to an existing virtual network](#existingvnet) section.
+ If you're using an existing virtual network, you may need to modify the network configuration before you can install HDInsight. For more information, see the [added HDInsight to an existing virtual network](#existingvnet) section.

  * Do you want to connect the virtual network containing HDInsight to another virtual network or your on-premises network?

@@ -56,7 +56,7 @@ Use the steps in this section to discover how to add a new HDInsight to an exist

  As a managed service, HDInsight requires unrestricted access to several IP addresses in the Azure data center. To allow communication with these IP addresses, update any existing network security groups or user-defined routes.

- HDInsight hosts multiple services, which use a variety of ports. Don't block traffic to these ports. For a list of ports to allow through virtual appliance firewalls, see the Security section.
+ HDInsight hosts multiple services, which use various ports. Don't block traffic to these ports. For a list of ports to allow through virtual appliance firewalls, see the Security section.

  To find your existing security configuration, use the following Azure PowerShell or Azure CLI commands:

@@ -72,7 +72,7 @@ Use the steps in this section to discover how to add a new HDInsight to an exist
  az network nsg list --resource-group RESOURCEGROUP
  ```

- For more information, see the [Troubleshoot network security groups](../virtual-network/diagnose-network-traffic-filter-problem.md) document.
+ For more information, see [Troubleshoot network security groups](../virtual-network/diagnose-network-traffic-filter-problem.md) document.

  > [!IMPORTANT]
  > Network security group rules are applied in order based on rule priority. The first rule that matches the traffic pattern is applied, and no others are applied for that traffic. Order rules from most permissive to least permissive. For more information, see the [Filter network traffic with network security groups](../virtual-network/network-security-groups-overview.md) document.
@@ -89,9 +89,9 @@ Use the steps in this section to discover how to add a new HDInsight to an exist
  az network route-table list --resource-group RESOURCEGROUP
  ```

- For more information, see the [Troubleshoot routes](../virtual-network/diagnose-network-routing-problem.md) document.
+ For more information, see the [Diagnose a virtual machine routing problem](../virtual-network/diagnose-network-routing-problem.md) document.

- 3. Create an HDInsight cluster and select the Azure Virtual Network during configuration. Use the steps in the following documents to understand the cluster creation process:
+ 3. Create a HDInsight cluster and select the Azure Virtual Network during configuration. Use the steps in the following documents to understand the cluster creation process:

  * [Create HDInsight using the Azure portal](hdinsight-hadoop-create-linux-clusters-portal.md)
  * [Create HDInsight using Azure PowerShell](hdinsight-hadoop-create-linux-clusters-azure-powershell.md)
@@ -159,7 +159,7 @@ For more information, see the [Name Resolution for VMs and Role Instances](../vi

  ## Directly connect to Apache Hadoop services

- You can connect to the cluster at `https://CLUSTERNAME.azurehdinsight.net`. This address uses a public IP, which may not be reachable if you have used NSGs to restrict incoming traffic from the internet. Additionally, when you deploy the cluster in a VNet you can access it using the private endpoint `https://CLUSTERNAME-int.azurehdinsight.net`. This endpoint resolves to a private IP inside the VNet for cluster access.
+ You can connect to the cluster at `https://CLUSTERNAME.azurehdinsight.net`. This address uses a public IP, which may not be reachable if you have used NSGs to restrict incoming traffic from the internet. Additionally, when you deploy the cluster in a virtual network you can access it using the private endpoint `https://CLUSTERNAME-int.azurehdinsight.net`. This endpoint resolves to a private IP inside the virtual network for cluster access.

  To connect to Apache Ambari and other web pages through the virtual network, use the following steps:

@@ -194,9 +194,11 @@ To connect to Apache Ambari and other web pages through the virtual network, use

  ## Load balancing

- When you create an HDInsight cluster, a load balancer is created as well. The type of this load balanceris at the [basic SKU level](../load-balancer/skus.md), which has certain constraints. One of these constraints is that if you have two virtual networks in different regions, you cannot connect to basic load balancers. See [virtual networks FAQ: constraints on global vnet peering](../virtual-network/virtual-networks-faq.md#what-are-the-constraints-related-to-global-virtual-network-peering-and-load-balancers), for more information.
+ When you create a HDInsight cluster, several load balancers are created as well. Due to the [retirement of the basic load balancer](https://azure.microsoft.com/updates/azure-basic-load-balancer-will-be-retired-on-30-september-2025-upgrade-to-standard-load-balancer/), the type of load balancers is at the [standard SKU level](/azure/load-balancer/skus), which has certain constraints. Inbound flows to the standard load balancers are closed unless allowed by a network security group. You may need to bond a network security to your subnet and configure the network security rules.

- Another constraint is that the HDInsight load balancers should not be deleted or modified. **Any changes to the load balancer rules will get overwritten during certain maintenance events such as certificate renewals.** If the load balancers are modified and it affects the cluster functionality, you may need to recreate the cluster.
+ There are [several outbound connectivity methods](/azure/load-balancer/load-balancer-outbound-connections) enabled for the standard load balancer. It’s worth noting that the default outbound access will be retired soon. If a NAT gateway is adopted to provide outbound network access, the subnet is not capable with the basic load balancer. If you intend to bond a NAT gateway to a subnet, there should be no basic load balancer existed in this subnet. With the NAT gateway as the outbound access method, a newly created HDInsight cluster can't share the same subnet with previously created HDInsight clusters with basic load balancers.
+
+ Another constraint is that the HDInsight load balancers shouldn't be deleted or modified. **Any changes to the load balancer rules will get overwritten during certain maintenance events such as certificate renewals.** If the load balancers are modified and it affects the cluster functionality, you may need to recreate the cluster.
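
Reviewer note: the new load balancing text says that inbound flows to standard load balancers are closed unless a network security group allows them, and that a network security group may need to be attached to the subnet. A minimal dry-run sketch of that step follows. It only prints the Azure CLI commands rather than running them. The helper name `nsg_commands` and the values `RESOURCEGROUP`, `VNETNAME`, `SUBNETNAME`, and `NSGNAME` are placeholders assumed for illustration. The inbound rules HDInsight needs are not covered here and should be taken from the HDInsight management IP address documentation.

```shell
#!/bin/sh
# Dry run: print the Azure CLI commands that would create a network
# security group and associate it with the cluster subnet.
# nsg_commands is a hypothetical helper; every name passed to it
# below is a placeholder.
nsg_commands() {
    rg="$1"
    vnet="$2"
    subnet="$3"
    nsg="$4"
    # Command 1: create the network security group.
    printf 'az network nsg create --resource-group %s --name %s\n' "$rg" "$nsg"
    # Command 2: associate the group with the subnet.
    printf 'az network vnet subnet update --resource-group %s --vnet-name %s --name %s --network-security-group %s\n' \
        "$rg" "$vnet" "$subnet" "$nsg"
}

nsg_commands RESOURCEGROUP VNETNAME SUBNETNAME NSGNAME
```

Printing the commands first lets you review the resource names before running them against a subscription.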
articles/hdinsight/hdinsight-restrict-public-connectivity.md
Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@ description: Learn how to remove access to all outbound public IP addresses.
  ms.service: azure-hdinsight
  ms.custom: devx-track-azurepowershell
  ms.topic: conceptual
- ms.date: 01/04/2024
+ ms.date: 09/19/2024
  ---

  # Restrict public connectivity in Azure HDInsight
@@ -15,7 +15,7 @@ If you want public connectivity between your HDInsight cluster and dependent res

  The following diagram shows what a potential HDInsight virtual network architecture might look like when `resourceProviderConnection` is set to *outbound*:

- :::image type="content" source="media/hdinsight-private-link/outbound-resource-provider-connection-only.png" alt-text="Diagram of the HDInsight architecture using an outbound resource provider connection.":::
+ :::image type="content" source="./media/hdinsight-restrict-public-connectivity/outbound-resource-provider-connection-only.svg" alt-text="Diagram showing the HDInsight architecture using an outbound resource provider connection." border="true" lightbox="./media/hdinsight-restrict-public-connectivity/outbound-resource-provider-connection-only.svg":::

  > [!NOTE]
  > Restricting public connectivity is a prerequisite for enabling Private Link and shouldn't be considered the same capability.