description: Learn techniques for controlling inbound and outbound traffic to Azure HDInsight clusters.
4
+
author: hrasheed-msft
5
+
ms.author: hrasheed
6
+
ms.reviewer: jasonh
7
+
ms.service: hdinsight
8
+
ms.topic: conceptual
9
+
ms.date: 05/04/2020
10
+
---
11
+
12
+
# Control network traffic in Azure HDInsight
13
+
14
+
Network traffic in Azure virtual networks can be controlled using the following methods:
15
+
16
+
* **Network security groups** (NSG) allow you to filter inbound and outbound traffic to the network. For more information, see the [Filter network traffic with network security groups](../virtual-network/security-overview.md) document.
17
+
18
+
* **Network virtual appliances** (NVA) can be used with outbound traffic only. NVAs replicate the functionality of devices such as firewalls and routers. For more information, see the [Network Appliances](https://azure.microsoft.com/solutions/network-appliances) document.
19
+
20
+
As a managed service, HDInsight requires unrestricted access to the HDInsight health and management services for both incoming and outgoing traffic from the VNET. When using NSGs, you must ensure that these services can still communicate with the HDInsight cluster.
21
+
22
+

23
+
24
+
## HDInsight with network security groups
25
+
26
+
If you plan on using **network security groups** to control network traffic, perform the following actions before installing HDInsight:
27
+
28
+
1. Identify the Azure region that you plan to use for HDInsight.
29
+
30
+
2. Identify the service tags required by HDInsight for your region. For more information, see [Network security group (NSG) service tags for Azure HDInsight](hdinsight-service-tags.md).
31
+
32
+
3. Create or modify the network security groups for the subnet that you plan to install HDInsight into.
33
+
34
+
* __Network security groups__: Allow __inbound__ traffic on port __443__ from the IP addresses of the HDInsight health and management services. This ensures that the HDInsight management services can reach the cluster from outside the virtual network.
35
+
36
+
For more information on network security groups, see the [overview of network security groups](../virtual-network/security-overview.md).
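
The inbound rule from step 3 can be sketched as plain data. The following Python snippet is a minimal illustration and does not call any Azure SDK. The field names are modeled on the security rule properties used in ARM templates, while the function name `hdinsight_management_rule`, the rule name, and the default priority of `300` are hypothetical. It assumes the `HDInsight` service tag is the source; pass whichever tag you identified in step 2.

```python
import json


def hdinsight_management_rule(source_tag="HDInsight", priority=300):
    """Return a dict describing an inbound allow rule for port 443.

    Assumption: field names follow the NSG security rule properties used in
    ARM templates; the rule name and default priority are illustrative only.
    """
    # NSG rule priorities must fall in the range 100 to 4096.
    if not 100 <= priority <= 4096:
        raise ValueError("NSG rule priority must be between 100 and 4096")
    return {
        "name": "AllowHDInsightManagementInbound",  # hypothetical name
        "properties": {
            "direction": "Inbound",
            "access": "Allow",
            "protocol": "Tcp",
            "sourceAddressPrefix": source_tag,
            "sourcePortRange": "*",
            "destinationAddressPrefix": "VirtualNetwork",
            "destinationPortRange": "443",
            "priority": priority,
        },
    }


if __name__ == "__main__":
    # Print the rule as JSON so it can be compared against a template.
    print(json.dumps(hdinsight_management_rule(), indent=2))
```

Keeping the rule as data makes it easy to review before it is applied through whichever deployment tool you use.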
37
+
38
+
## Controlling outbound traffic from HDInsight clusters
39
+
40
+
For more information on controlling outbound traffic from HDInsight clusters, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
41
+
42
+
### Forced tunneling to on-premises
43
+
44
+
Forced tunneling is a user-defined routing configuration where all traffic from a subnet is forced to a specific network or location, such as your on-premises network or firewall. Forcing all data transfer back to on-premises is _not_ recommended because of the large volumes of data transferred and the potential performance impact.
45
+
46
+
Customers who want to set up forced tunneling should use [custom metastores](./hdinsight-use-external-metadata-stores.md) and set up the appropriate connectivity from the cluster subnet or on-premises network to these custom metastores.
47
+
48
+
To see an example of the UDR setup with Azure Firewall, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
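
To illustrate how a forced-tunneling default route can coexist with an exception for management traffic, the sketch below models route selection by longest prefix match, which is how Azure chooses between overlapping routes. It is a simplified illustration, not an Azure API: the function `pick_next_hop` and the route table are hypothetical, and `203.0.113.10` is a documentation-only address standing in for an HDInsight management IP.

```python
import ipaddress


def pick_next_hop(routes, destination):
    """Return the next hop type of the most specific route matching destination.

    routes: list of (address_prefix, next_hop_type) tuples.
    Returns None when no route matches.
    """
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None


# Hypothetical route table for a cluster subnet.
ROUTES = [
    ("0.0.0.0/0", "VirtualAppliance"),  # forced tunnel: send everything to a firewall
    ("203.0.113.10/32", "Internet"),    # exception for a management address
]

if __name__ == "__main__":
    print(pick_next_hop(ROUTES, "203.0.113.10"))
    print(pick_next_hop(ROUTES, "198.51.100.7"))
```

Because the `/32` route is more specific than `0.0.0.0/0`, traffic to the management address bypasses the tunnel while all other traffic follows the default route.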
49
+
50
+
## Required IP addresses
51
+
52
+
If you use network security groups or user-defined routes to control traffic, see [HDInsight management IP addresses](hdinsight-management-ip-addresses.md).
53
+
54
+
## Required ports
55
+
56
+
If you plan to use a **firewall** and access the cluster from outside on certain ports, you might need to allow traffic on the ports that your scenario requires. By default, no special allowlisting of ports is needed as long as the Azure management traffic described in the previous section can reach the cluster on port 443.
57
+
58
+
For a list of ports for specific services, see the [Ports used by Apache Hadoop services on HDInsight](hdinsight-hadoop-port-settings-for-services.md) document.
59
+
60
+
For more information on firewall rules for virtual appliances, see the [virtual appliance scenario](../virtual-network/virtual-network-scenario-udr-gw-nva.md) document.
61
+
62
+
## Next steps
63
+
64
+
* For code samples and examples of creating Azure Virtual Networks, see [Create virtual networks for Azure HDInsight clusters](hdinsight-create-virtual-network.md).
65
+
* For an end-to-end example of configuring HDInsight to connect to an on-premises network, see [Connect HDInsight to an on-premises network](./connect-on-premises-network.md).
66
+
* For more information on Azure virtual networks, see the [Azure Virtual Network overview](../virtual-network/virtual-networks-overview.md).
67
+
* For more information on network security groups, see [Network security groups](../virtual-network/security-overview.md).
68
+
* For more information on user-defined routes, see [User-defined routes and IP forwarding](../virtual-network/virtual-networks-udr-overview.md).
69
+
* For more information on virtual networks, see [Plan VNETs for HDInsight](./hdinsight-plan-virtual-network-deployment.md).
articles/hdinsight/hadoop/apache-hadoop-on-premises-migration-best-practices-infrastructure.md
+2 -2: 2 additions & 2 deletions
@@ -104,7 +104,7 @@ For more information, see the following articles:
104
104
105
105
## Customize HDInsight configs using Bootstrap
106
106
107
-
Changes to configs in the config files such as `core-site.xml`, `hive-site.xml` and `oozie-env.xml` can be made using Bootstrap. The following script is an example using the Powershell [AZ module](https://docs.microsoft.com/powershell/azure/new-azureps-module-az) cmdlet [New-AzHDInsightClusterConfig](https://docs.microsoft.com/powershell/module/az.hdinsight/new-azhdinsightcluster):
107
+
Changes to configs in the config files such as `core-site.xml`, `hive-site.xml` and `oozie-env.xml` can be made using Bootstrap. The following script is an example using the PowerShell [AZ module](https://docs.microsoft.com/powershell/azure/new-azureps-module-az) cmdlet [New-AzHDInsightClusterConfig](https://docs.microsoft.com/powershell/module/az.hdinsight/new-azhdinsightcluster):
108
108
109
109
```powershell
110
110
# hive-site.xml configuration
@@ -163,7 +163,7 @@ Using Azure Virtual Network with HDInsight enables the following scenarios:
163
163
- Directly accessing Hadoop services that aren't available publicly over the internet. For example, Kafka APIs or the HBase Java API.
164
164
165
165
HDInsight can either be added to a new or existing Azure Virtual Network. If HDInsight is being added to an existing Virtual Network, the existing network security groups and user-defined routes need to be updated to allow unrestricted access to [several IP addresses](../hdinsight-management-ip-addresses.md)
166
-
in the Azure data center. Also, make sure not to block traffic to the [ports](../hdinsight-plan-virtual-network-deployment.md#hdinsight-ports), which are being used by HDInsight services.
166
+
in the Azure data center. Also, make sure not to block traffic to the [ports](../control-network-traffic.md#required-ports), which are being used by HDInsight services.
167
167
168
168
> [!Note]
169
169
> HDInsight does not currently support forced tunneling. Forced tunneling is a subnet setting that forces outbound Internet traffic to a device for inspection and logging. Either remove forced tunneling before installing HDInsight into a subnet or create a new subnet for HDInsight. HDInsight also does not support restricting outbound network connectivity.
articles/hdinsight/hadoop/hdinsight-troubleshoot-invalidnetworksecuritygroupsecurityrules-cluster-creation-fails.md
+1 -1: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ Likely an issue with the inbound [network security group](../../virtual-network/
23
23
24
24
## Resolution
25
25
26
-
Go to the Azure portal and identify the NSG that is associated with the subnet where the cluster is being deployed. In the **Inbound security rules** section, make sure the rules allow inbound access to port 443 for the IP addresses mentioned [here](../hdinsight-plan-virtual-network-deployment.md#hdinsight-ip).
26
+
Go to the Azure portal and identify the NSG that is associated with the subnet where the cluster is being deployed. In the **Inbound security rules** section, make sure the rules allow inbound access to port 443 for the IP addresses mentioned [here](../control-network-traffic.md).
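
As a rough aid for the check described above, the following sketch evaluates a list of rules in priority order, which is how NSGs process rules: the lowest priority number is evaluated first and the first match wins. It is deliberately simplified and is not an Azure API. The function `allows_inbound_443`, the rule dictionaries, and their keys are hypothetical, and source address matching is omitted.

```python
def allows_inbound_443(rules):
    """Return True if the first matching inbound rule for port 443 is an Allow.

    rules: list of dicts with 'priority', 'direction', 'access', and 'port'
    keys ('*' matches any port). Source addresses are ignored in this sketch.
    """
    inbound = [r for r in rules if r["direction"] == "Inbound"]
    # Lower priority numbers are evaluated first; the first match decides.
    for rule in sorted(inbound, key=lambda r: r["priority"]):
        if rule["port"] in ("*", "443"):
            return rule["access"] == "Allow"
    # Assumption: with no matching rule, treat the traffic as denied,
    # mirroring the default deny-all inbound behavior.
    return False


# Hypothetical rule set: an allow for 443 ahead of a catch-all deny.
EXAMPLE_RULES = [
    {"priority": 300, "direction": "Inbound", "access": "Allow", "port": "443"},
    {"priority": 4096, "direction": "Inbound", "access": "Deny", "port": "*"},
]

if __name__ == "__main__":
    print(allows_inbound_443(EXAMPLE_RULES))
```

If a deny rule with a lower priority number also matches port 443, the function returns `False`, which corresponds to the misconfiguration this article describes.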
articles/hdinsight/hadoop/troubleshoot-invalidnetworkconfigurationerrorcode-cluster-creation-fails.md
+1 -1: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ Error description contains "Failed to connect to Azure Storage Account” or “
49
49
50
50
### Cause
51
51
52
-
Azure Storage and SQL don't have fixed IP Addresses, so we need to allow outbound connections to all IPs to allow accessing these services. The exact resolution steps depend on whether you have set up a Network Security Group (NSG) or User-Defined Rules (UDR). Refer to the section on [controlling network traffic with HDInsight with network security groups and user-defined routes](../hdinsight-plan-virtual-network-deployment.md#hdinsight-ip) for details on these configurations.
52
+
Azure Storage and SQL don't have fixed IP Addresses, so we need to allow outbound connections to all IPs to allow accessing these services. The exact resolution steps depend on whether you have set up a Network Security Group (NSG) or User-Defined Rules (UDR). Refer to the section on [controlling network traffic with HDInsight with network security groups and user-defined routes](../control-network-traffic.md) for details on these configurations.
articles/hdinsight/hdinsight-faq.md
+1 -1: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ Yes, you can deploy an additional virtual machine within the same subnet as an H
128
128
129
129
- Edge nodes: You can add another edge node to the cluster, as described in [Use empty edge nodes on Apache Hadoop clusters in HDInsight](hdinsight-apps-use-edge-node.md).
130
130
131
-
- Standalone nodes: You can add a standalone virtual machine to the same subnet and access the cluster from that virtual machine by using the private end point `https://<CLUSTERNAME>-int.azurehdinsight.net`. For more information, see [Controlling network traffic](hdinsight-plan-virtual-network-deployment.md#networktraffic).
131
+
- Standalone nodes: You can add a standalone virtual machine to the same subnet and access the cluster from that virtual machine by using the private end point `https://<CLUSTERNAME>-int.azurehdinsight.net`. For more information, see [Control network traffic](./control-network-traffic.md).
132
132
133
133
### Should I store data on the local disk of an edge node?
articles/hdinsight/hdinsight-management-ip-addresses.md
+1 -1: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ Allow traffic from the IP addresses listed for the Azure HDInsight health and ma
77
77
78
78
For information on the IP addresses to use for Azure Government, see the [Azure Government Intelligence + Analytics](https://docs.microsoft.com/azure/azure-government/documentation-government-services-intelligenceandanalytics) document.
79
79
80
-
For more information, see the [Controlling network traffic](hdinsight-plan-virtual-network-deployment.md#networktraffic) section.
80
+
For more information, see [Control network traffic](./control-network-traffic.md).
81
81
82
82
If you're using user-defined routes (UDRs), you should specify a route and allow outbound traffic from the virtual network to the above IPs with the next hop set to "Internet".
articles/hdinsight/hdinsight-plan-virtual-network-deployment.md
+3 -54: 3 additions & 54 deletions
@@ -7,7 +7,7 @@ ms.reviewer: jasonh
7
7
ms.service: hdinsight
8
8
ms.topic: conceptual
9
9
ms.custom: hdinsightactive,seoapr2020
10
-
ms.date: 04/21/2020
10
+
ms.date: 05/04/2020
11
11
---
12
12
13
13
# Plan a virtual network for Azure HDInsight
@@ -39,7 +39,7 @@ The following are the questions that you must answer when planning to install HD
39
39
40
40
* Do you want to restrict/redirect inbound or outbound traffic to HDInsight?
41
41
42
-
HDInsight must have unrestricted communication with specific IP addresses in the Azure data center. There are also several ports that must be allowed through firewalls for client communication. For more information, see the [controlling network traffic](#networktraffic) section.
42
+
HDInsight must have unrestricted communication with specific IP addresses in the Azure data center. There are also several ports that must be allowed through firewalls for client communication. For more information, see [Control network traffic](./control-network-traffic.md).
43
43
44
44
## <a id="existingvnet"></a>Add HDInsight to an existing virtual network
45
45
@@ -196,58 +196,6 @@ To connect to Apache Ambari and other web pages through the virtual network, use
196
196
197
197
2. To determine the node and port that a service is available on, see the [Ports used by Hadoop services on HDInsight](./hdinsight-hadoop-port-settings-for-services.md) document.
### Techniques for controlling inbound and outbound traffic to HDInsight clusters
202
-
203
-
Network traffic in an Azure Virtual Networks can be controlled using the following methods:
204
-
205
-
* **Network security groups** (NSG) allow you to filter inbound and outbound traffic to the network. For more information, see the [Filter network traffic with network security groups](../virtual-network/security-overview.md) document.
206
-
207
-
* **Network virtual appliances** (NVA) can be used with outbound traffic only. NVAs replicate the functionality of devices such as firewalls and routers. For more information, see the [Network Appliances](https://azure.microsoft.com/solutions/network-appliances) document.
208
-
209
-
As a managed service, HDInsight requires unrestricted access to the HDInsight health and management services both for incoming and outgoing traffic from the VNET. When using NSGs, you must ensure that these services can still communicate with HDInsight cluster.
210
-
211
-

212
-
213
-
### HDInsight with network security groups
214
-
215
-
If you plan on using **network security groups** to control network traffic, perform the following actions before installing HDInsight:
216
-
217
-
1. Identify the Azure region that you plan to use for HDInsight.
218
-
219
-
2. Identify the service tags required by HDInsight for your region. For more information, see [Network security group (NSG) service tags for Azure HDInsight](hdinsight-service-tags.md).
220
-
221
-
3. Create or modify the network security groups for the subnet that you plan to install HDInsight into.
222
-
223
-
* __Network security groups__: allow __inbound__ traffic on port __443__ from the IP addresses. This will ensure that HDInsight management services can reach the cluster from outside the virtual network.
224
-
225
-
For more information on network security groups, see the [overview of network security groups](../virtual-network/security-overview.md).
226
-
227
-
### Controlling outbound traffic from HDInsight clusters
228
-
229
-
For more information on controlling outbound traffic from HDInsight clusters, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
230
-
231
-
#### Forced tunneling to on-premises
232
-
233
-
Forced tunneling is a user-defined routing configuration where all traffic from a subnet is forced to a specific network or location, such as your on-premises network or Firewall. Forced tunneling of all data transfer back to on-premise is _not_ recommended due to large volumes of data transfer and potential performance impact.
234
-
235
-
Customers who are interested to setup forced tunneling, should use [custom metastores](./hdinsight-use-external-metadata-stores.md) and setup the approperiate connectivity from the cluster subnet or on-premise network to these custom metastores.
236
-
237
-
To see an example of the UDR setup with Azure Firewall, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
238
-
239
-
## <a id="hdinsight-ip"></a> Required IP addresses
240
-
241
-
If you use network security groups or user-defined routes to control traffic, see [HDInsight management IP addresses](hdinsight-management-ip-addresses.md).
242
-
243
-
## <a id="hdinsight-ports"></a> Required ports
244
-
245
-
If you plan on using a **firewall** and access the cluster from outside on certain ports, you might need to allow traffic on those ports needed for your scenario. By default, no special whitelisting of ports is needed as long as the Azure management traffic explained in the previous section is allowed to reach cluster on port 443.
246
-
247
-
For a list of ports for specific services, see the [Ports used by Apache Hadoop services on HDInsight](hdinsight-hadoop-port-settings-for-services.md) document.
248
-
249
-
For more information on firewall rules for virtual appliances, see the [virtual appliance scenario](../virtual-network/virtual-network-scenario-udr-gw-nva.md) document.
250
-
251
199
## Load balancing
252
200
253
201
When you create an HDInsight cluster, a load balancer is created as well. The type of this load balancer is at the [basic SKU level](../load-balancer/types.md#skus), which has certain constraints. One of these constraints is that if you have two virtual networks in different regions, you cannot connect to basic load balancers. See [virtual networks FAQ: constraints on global vnet peering](../virtual-network/virtual-networks-faq.md#what-are-the-constraints-related-to-global-vnet-peering-and-load-balancers), for more information.
@@ -259,3 +207,4 @@ When you create an HDInsight cluster, a load balancer is created as well. The ty
259
207
* For more information on Azure virtual networks, see the [Azure Virtual Network overview](../virtual-network/virtual-networks-overview.md).
260
208
* For more information on network security groups, see [Network security groups](../virtual-network/security-overview.md).
261
209
* For more information on user-defined routes, see [User-defined routes and IP forwarding](../virtual-network/virtual-networks-udr-overview.md).
210
+
* For more information on controlling traffic, see [Control network traffic](./control-network-traffic.md).