Commit dfc6772

authored
Merge pull request #113745 from dagiro/freshness_c65
freshness_c65
2 parents be0a69a + 881b0b2 commit dfc6772

10 files changed

+81
-61
lines changed

articles/hdinsight/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -907,6 +907,8 @@
907907
href: ./transport-layer-security.md
908908
- name: Plan VNETs for HDInsight
909909
href: ./hdinsight-plan-virtual-network-deployment.md
910+
- name: Control network traffic
911+
href: ./control-network-traffic.md
910912
- name: Required IP Addresses for NSGs and UDRs
911913
href: ./hdinsight-management-ip-addresses.md
912914
- name: Service tags for Azure firewall
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
1+
---
2+
title: Control network traffic in Azure HDInsight
3+
description: Learn techniques for controlling inbound and outbound traffic to Azure HDInsight clusters.
4+
author: hrasheed-msft
5+
ms.author: hrasheed
6+
ms.reviewer: jasonh
7+
ms.service: hdinsight
8+
ms.topic: conceptual
9+
ms.date: 05/04/2020
10+
---
11+
12+
# Control network traffic in Azure HDInsight
13+
14+
Network traffic in an Azure Virtual Networks can be controlled using the following methods:
15+
16+
* **Network security groups** (NSG) allow you to filter inbound and outbound traffic to the network. For more information, see the [Filter network traffic with network security groups](../virtual-network/security-overview.md) document.
17+
18+
* **Network virtual appliances** (NVA) can be used with outbound traffic only. NVAs replicate the functionality of devices such as firewalls and routers. For more information, see the [Network Appliances](https://azure.microsoft.com/solutions/network-appliances) document.
19+
20+
As a managed service, HDInsight requires unrestricted access to the HDInsight health and management services both for incoming and outgoing traffic from the VNET. When using NSGs, you must ensure that these services can still communicate with HDInsight cluster.
21+
22+
![Diagram of HDInsight entities created in Azure custom VNET](./media/control-network-traffic/hdinsight-vnet-diagram.png)
23+
24+
## HDInsight with network security groups
25+
26+
If you plan on using **network security groups** to control network traffic, perform the following actions before installing HDInsight:
27+
28+
1. Identify the Azure region that you plan to use for HDInsight.
29+
30+
2. Identify the service tags required by HDInsight for your region. For more information, see [Network security group (NSG) service tags for Azure HDInsight](hdinsight-service-tags.md).
31+
32+
3. Create or modify the network security groups for the subnet that you plan to install HDInsight into.
33+
34+
* __Network security groups__: allow __inbound__ traffic on port __443__ from the IP addresses. This will ensure that HDInsight management services can reach the cluster from outside the virtual network.
35+
36+
For more information on network security groups, see the [overview of network security groups](../virtual-network/security-overview.md).
37+
38+
## Controlling outbound traffic from HDInsight clusters
39+
40+
For more information on controlling outbound traffic from HDInsight clusters, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
41+
42+
### Forced tunneling to on-premises
43+
44+
Forced tunneling is a user-defined routing configuration where all traffic from a subnet is forced to a specific network or location, such as your on-premises network or Firewall. Forced tunneling of all data transfer back to on-premise is _not_ recommended due to large volumes of data transfer and potential performance impact.
45+
46+
Customers who are interested to setup forced tunneling, should use [custom metastores](./hdinsight-use-external-metadata-stores.md) and setup the appropriate connectivity from the cluster subnet or on-premise network to these custom metastores.
47+
48+
To see an example of the UDR setup with Azure Firewall, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
49+
50+
## Required IP addresses
51+
52+
If you use network security groups or user-defined routes to control traffic, see [HDInsight management IP addresses](hdinsight-management-ip-addresses.md).
53+
54+
## Required ports
55+
56+
If you plan on using a **firewall** and access the cluster from outside on certain ports, you might need to allow traffic on those ports needed for your scenario. By default, no special whitelisting of ports is needed as long as the Azure management traffic explained in the previous section is allowed to reach cluster on port 443.
57+
58+
For a list of ports for specific services, see the [Ports used by Apache Hadoop services on HDInsight](hdinsight-hadoop-port-settings-for-services.md) document.
59+
60+
For more information on firewall rules for virtual appliances, see the [virtual appliance scenario](../virtual-network/virtual-network-scenario-udr-gw-nva.md) document.
61+
62+
## Next steps
63+
64+
* For code samples and examples of creating Azure Virtual Networks, see [Create virtual networks for Azure HDInsight clusters](hdinsight-create-virtual-network.md).
65+
* For an end-to-end example of configuring HDInsight to connect to an on-premises network, see [Connect HDInsight to an on-premises network](./connect-on-premises-network.md).
66+
* For more information on Azure virtual networks, see the [Azure Virtual Network overview](../virtual-network/virtual-networks-overview.md).
67+
* For more information on network security groups, see [Network security groups](../virtual-network/security-overview.md).
68+
* For more information on user-defined routes, see [User-defined routes and IP forwarding](../virtual-network/virtual-networks-udr-overview.md).
69+
* For more information on virtual networks, see [Plan VNETs for HDInsight](./hdinsight-plan-virtual-network-deployment.md).

articles/hdinsight/domain-joined/hdinsight-security-overview.md

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ The following table provides links to resources for each type of security soluti
8383
| Operating system security | Create clusters with most recent secure base image | Customer |
8484
| | Ensure [OS Patching](../hdinsight-os-patching.md) on regular intervals | Customer |
8585
| Network security | Configure a [virtual network](../hdinsight-plan-virtual-network-deployment.md) |
86-
| | Configure [Inbound network security group (NSG) rules](../hdinsight-plan-virtual-network-deployment.md#networktraffic) | Customer |
86+
| | Configure [Inbound network security group (NSG) rules](../control-network-traffic.md) | Customer |
8787
| | Configure [Outbound traffic restriction](../hdinsight-restrict-outbound-traffic.md) with Firewall | Customer |
8888
| Virtualized infrastructure | N/A | HDInsight (Cloud provider) |
8989
| Physical infrastructure security | N/A | HDInsight (cloud provider) |

articles/hdinsight/hadoop/apache-hadoop-on-premises-migration-best-practices-infrastructure.md

Lines changed: 2 additions & 2 deletions
@@ -104,7 +104,7 @@ For more information, see the following articles:
104104

105105
## Customize HDInsight configs using Bootstrap
106106

107-
Changes to configs in the config files such as `core-site.xml`, `hive-site.xml` and `oozie-env.xml` can be made using Bootstrap. The following script is an example using the Powershell [AZ module](https://docs.microsoft.com/powershell/azure/new-azureps-module-az) cmdlet [New-AzHDInsightClusterConfig](https://docs.microsoft.com/powershell/module/az.hdinsight/new-azhdinsightcluster):
107+
Changes to configs in the config files such as `core-site.xml`, `hive-site.xml` and `oozie-env.xml` can be made using Bootstrap. The following script is an example using the PowerShell [AZ module](https://docs.microsoft.com/powershell/azure/new-azureps-module-az) cmdlet [New-AzHDInsightClusterConfig](https://docs.microsoft.com/powershell/module/az.hdinsight/new-azhdinsightcluster):
108108

109109
```powershell
110110
# hive-site.xml configuration
@@ -163,7 +163,7 @@ Using Azure Virtual Network with HDInsight enables the following scenarios:
163163
- Directly accessing Hadoop services that aren't available publicly over the internet. For example, Kafka APIs or the HBase Java API.
164164

165165
HDInsight can either be added to a new or existing Azure Virtual Network. If HDInsight is being added to an existing Virtual Network, the existing network security groups and user-defined routes need to be updated to allow unrestricted access to [several IP addresses](../hdinsight-management-ip-addresses.md)
166-
in the Azure data center. Also, make sure not to block traffic to the [ports](../hdinsight-plan-virtual-network-deployment.md#hdinsight-ports), which are being used by HDInsight services.
166+
in the Azure data center. Also, make sure not to block traffic to the [ports](../control-network-traffic.md#required-ports), which are being used by HDInsight services.
167167

168168
> [!Note]
169169
> HDInsight does not currently support forced tunneling. Forced tunneling is a subnet setting that forces outbound Internet traffic to a device for inspection and logging. Either remove forced tunneling before installing HDInsight into a subnet or create a new subnet for HDInsight. HDInsight also does not support restricting outbound network connectivity.

articles/hdinsight/hadoop/hdinsight-troubleshoot-invalidnetworksecuritygroupsecurityrules-cluster-creation-fails.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ Likely an issue with the inbound [network security group](../../virtual-network/
2323

2424
## Resolution
2525

26-
Go to the Azure portal and identify the NSG that is associated with the subnet where the cluster is being deployed. In the **Inbound security rules** section, make sure the rules allow inbound access to port 443 for the IP addresses mentioned [here](../hdinsight-plan-virtual-network-deployment.md#hdinsight-ip).
26+
Go to the Azure portal and identify the NSG that is associated with the subnet where the cluster is being deployed. In the **Inbound security rules** section, make sure the rules allow inbound access to port 443 for the IP addresses mentioned [here](../control-network-traffic.md).
2727

2828
## Next steps
2929

articles/hdinsight/hadoop/troubleshoot-invalidnetworkconfigurationerrorcode-cluster-creation-fails.md

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ Error description contains "Failed to connect to Azure Storage Account” or “
4949

5050
### Cause
5151

52-
Azure Storage and SQL don't have fixed IP Addresses, so we need to allow outbound connections to all IPs to allow accessing these services. The exact resolution steps depend on whether you have set up a Network Security Group (NSG) or User-Defined Rules (UDR). Refer to the section on [controlling network traffic with HDInsight with network security groups and user-defined routes](../hdinsight-plan-virtual-network-deployment.md#hdinsight-ip) for details on these configurations.
52+
Azure Storage and SQL don't have fixed IP Addresses, so we need to allow outbound connections to all IPs to allow accessing these services. The exact resolution steps depend on whether you have set up a Network Security Group (NSG) or User-Defined Rules (UDR). Refer to the section on [controlling network traffic with HDInsight with network security groups and user-defined routes](../control-network-traffic.md) for details on these configurations.
5353

5454
### Resolution
5555

articles/hdinsight/hdinsight-faq.md

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ Yes, you can deploy an additional virtual machine within the same subnet as an H
128128

129129
- Edge nodes: You can add another edge node to the cluster, as described in [Use empty edge nodes on Apache Hadoop clusters in HDInsight](hdinsight-apps-use-edge-node.md).
130130

131-
- Standalone nodes: You can add a standalone virtual machine to the same subnet and access the cluster from that virtual machine by using the private end point `https://<CLUSTERNAME>-int.azurehdinsight.net`. For more information, see [Controlling network traffic](hdinsight-plan-virtual-network-deployment.md#networktraffic).
131+
- Standalone nodes: You can add a standalone virtual machine to the same subnet and access the cluster from that virtual machine by using the private end point `https://<CLUSTERNAME>-int.azurehdinsight.net`. For more information, see [Control network traffic](./control-network-traffic.md).
132132

133133
### Should I store data on the local disk of an edge node?
134134

articles/hdinsight/hdinsight-management-ip-addresses.md

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ Allow traffic from the IP addresses listed for the Azure HDInsight health and ma
7777

7878
For information on the IP addresses to use for Azure Government, see the [Azure Government Intelligence + Analytics](https://docs.microsoft.com/azure/azure-government/documentation-government-services-intelligenceandanalytics) document.
7979

80-
For more information, see the [Controlling network traffic](hdinsight-plan-virtual-network-deployment.md#networktraffic) section.
80+
For more information, see [Control network traffic](./control-network-traffic.md).
8181

8282
If you're using user-defined routes (UDRs), you should specify a route and allow outbound traffic from the virtual network to the above IPs with the next hop set to "Internet".
8383

articles/hdinsight/hdinsight-plan-virtual-network-deployment.md

Lines changed: 3 additions & 54 deletions
@@ -7,7 +7,7 @@ ms.reviewer: jasonh
77
ms.service: hdinsight
88
ms.topic: conceptual
99
ms.custom: hdinsightactive,seoapr2020
10-
ms.date: 04/21/2020
10+
ms.date: 05/04/2020
1111
---
1212

1313
# Plan a virtual network for Azure HDInsight
@@ -39,7 +39,7 @@ The following are the questions that you must answer when planning to install HD
3939

4040
* Do you want to restrict/redirect inbound or outbound traffic to HDInsight?
4141

42-
HDInsight must have unrestricted communication with specific IP addresses in the Azure data center. There are also several ports that must be allowed through firewalls for client communication. For more information, see the [controlling network traffic](#networktraffic) section.
42+
HDInsight must have unrestricted communication with specific IP addresses in the Azure data center. There are also several ports that must be allowed through firewalls for client communication. For more information, see [Control network traffic](./control-network-traffic.md).
4343

4444
## <a id="existingvnet"></a>Add HDInsight to an existing virtual network
4545

@@ -196,58 +196,6 @@ To connect to Apache Ambari and other web pages through the virtual network, use
196196
197197
2. To determine the node and port that a service is available on, see the [Ports used by Hadoop services on HDInsight](./hdinsight-hadoop-port-settings-for-services.md) document.
198198
199-
## <a id="networktraffic"></a> Controlling network traffic
200-
201-
### Techniques for controlling inbound and outbound traffic to HDInsight clusters
202-
203-
Network traffic in an Azure Virtual Networks can be controlled using the following methods:
204-
205-
* **Network security groups** (NSG) allow you to filter inbound and outbound traffic to the network. For more information, see the [Filter network traffic with network security groups](../virtual-network/security-overview.md) document.
206-
207-
* **Network virtual appliances** (NVA) can be used with outbound traffic only. NVAs replicate the functionality of devices such as firewalls and routers. For more information, see the [Network Appliances](https://azure.microsoft.com/solutions/network-appliances) document.
208-
209-
As a managed service, HDInsight requires unrestricted access to the HDInsight health and management services both for incoming and outgoing traffic from the VNET. When using NSGs, you must ensure that these services can still communicate with HDInsight cluster.
210-
211-
![Diagram of HDInsight entities created in Azure custom VNET](./media/hdinsight-plan-virtual-network-deployment/hdinsight-vnet-diagram.png)
212-
213-
### HDInsight with network security groups
214-
215-
If you plan on using **network security groups** to control network traffic, perform the following actions before installing HDInsight:
216-
217-
1. Identify the Azure region that you plan to use for HDInsight.
218-
219-
2. Identify the service tags required by HDInsight for your region. For more information, see [Network security group (NSG) service tags for Azure HDInsight](hdinsight-service-tags.md).
220-
221-
3. Create or modify the network security groups for the subnet that you plan to install HDInsight into.
222-
223-
* __Network security groups__: allow __inbound__ traffic on port __443__ from the IP addresses. This will ensure that HDInsight management services can reach the cluster from outside the virtual network.
224-
225-
For more information on network security groups, see the [overview of network security groups](../virtual-network/security-overview.md).
226-
227-
### Controlling outbound traffic from HDInsight clusters
228-
229-
For more information on controlling outbound traffic from HDInsight clusters, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
230-
231-
#### Forced tunneling to on-premises
232-
233-
Forced tunneling is a user-defined routing configuration where all traffic from a subnet is forced to a specific network or location, such as your on-premises network or Firewall. Forced tunneling of all data transfer back to on-premise is _not_ recommended due to large volumes of data transfer and potential performance impact.
234-
235-
Customers who are interested to setup forced tunneling, should use [custom metastores](./hdinsight-use-external-metadata-stores.md) and setup the approperiate connectivity from the cluster subnet or on-premise network to these custom metastores.
236-
237-
To see an example of the UDR setup with Azure Firewall, see [Configure outbound network traffic restriction for Azure HDInsight clusters](hdinsight-restrict-outbound-traffic.md).
238-
239-
## <a id="hdinsight-ip"></a> Required IP addresses
240-
241-
If you use network security groups or user-defined routes to control traffic, see [HDInsight management IP addresses](hdinsight-management-ip-addresses.md).
242-
243-
## <a id="hdinsight-ports"></a> Required ports
244-
245-
If you plan on using a **firewall** and access the cluster from outside on certain ports, you might need to allow traffic on those ports needed for your scenario. By default, no special whitelisting of ports is needed as long as the Azure management traffic explained in the previous section is allowed to reach cluster on port 443.
246-
247-
For a list of ports for specific services, see the [Ports used by Apache Hadoop services on HDInsight](hdinsight-hadoop-port-settings-for-services.md) document.
248-
249-
For more information on firewall rules for virtual appliances, see the [virtual appliance scenario](../virtual-network/virtual-network-scenario-udr-gw-nva.md) document.
250-
251199
## Load balancing
252200
253201
When you create an HDInsight cluster, a load balancer is created as well. The type of this load balancer is at the [basic SKU level](../load-balancer/types.md#skus), which has certain constraints. One of these constraints is that if you have two virtual networks in different regions, you cannot connect to basic load balancers. See [virtual networks FAQ: constraints on global vnet peering](../virtual-network/virtual-networks-faq.md#what-are-the-constraints-related-to-global-vnet-peering-and-load-balancers), for more information.
@@ -259,3 +207,4 @@ When you create an HDInsight cluster, a load balancer is created as well. The ty
259207
* For more information on Azure virtual networks, see the [Azure Virtual Network overview](../virtual-network/virtual-networks-overview.md).
260208
* For more information on network security groups, see [Network security groups](../virtual-network/security-overview.md).
261209
* For more information on user-defined routes, see [User-defined routes and IP forwarding](../virtual-network/virtual-networks-udr-overview.md).
210+
* For more information on controlling traffic, see [Control network traffic](./control-network-traffic.md).
