articles/azure-resource-manager/management/relocation/relocation-hdinsight.md
+4 -16 (4 additions & 16 deletions)
@@ -31,19 +31,8 @@ Before starting the relocation process, ensure the following prerequisites are m
31
31
-**Prepare the target landing zone**: Ensure the target landing zone is ready and matches the assessed architecture.
32
32
-**Document network settings**: Record network configurations, including firewalls and isolation settings.
33
33
-**Identify metastore databases**: List all metastore databases configured in the source cluster.
34
-
-**Review installed applications**: Document installed HDInsight applications and action scripts.
35
-
-**Check availability zone support**: Verify that the target region supports availability zones. HDInsight clusters can currently be created using availability zones in the following regions:
-**Check availability zone support**: Verify that the target region supports availability zones. For more information, see [Region availability](../../../hdinsight/hdinsight-use-availability-zones.md#prerequisites-and-region-availability).
47
36
48
37
## Downtime Considerations
49
38
@@ -75,9 +64,8 @@ Relocate the source storage account to the target region. For detailed steps, se
75
64
Relocate jobs associated with the HDInsight cluster to the target region. Follow the appropriate guidance based on your HDInsight implementation:
76
65
77
66
-**Oozie pipeline/workflow**: Use the Hue import/export method. See [Migrate pipelines using Hue UI](https://gethue.com/exporting-and-importing-oozie-workflows/).
78
-
-**Storm topology**: Transfer Storm event hub spout checkpoint information. See [Transfer Storm event hub spout checkpoint information](../../../hdinsight/storm/apache-troubleshoot-storm#how-do-i-transfer-storm-event-hub-spout-checkpoint-information-from-one-topology-to-another).
79
-
-**HBase workload**: Use backup and replication. See [Backup and replication method](../../../hdinsight/hbase/apache-hbase-backup-replication).
80
-
-**Hive workload & Interactive Query**: Follow the steps in [Migrate Azure HDInsight Hive workloads](../../../hdinsight/interactive-query/apache-hive-migrate-workloads#steps-to-upgrade).
67
+
-**HBase workload**: Use backup and replication. See [Backup and replication method](../../../hdinsight/hbase/apache-hbase-backup-replication.md).
68
+
-**Hive workload & Interactive Query**: Follow the steps in [Migrate Azure HDInsight Hive workloads](../../../hdinsight/interactive-query/apache-hive-migrate-workloads.md#steps-to-upgrade).
81
69
-**Kafka workload**: Use Mirror Maker. See [Mirror Maker](../../../hdinsight/kafka/apache-kafka-mirroring).
articles/hdinsight/hdinsight-use-availability-zones.md
+61 -69 (61 additions & 69 deletions)
@@ -9,115 +9,107 @@ ms.date: 12/03/2024
9
9
10
10
# Create an HDInsight cluster that uses Availability Zones
11
11
12
-
An Azure HDInsight cluster consists of multiple nodes (head nodes, worker nodes, gateway nodes and zookeeper nodes). By default, in a region that supports Availability Zones, the user has no control over which cluster nodes are provisioned in which Availability Zone.
12
+
An Azure HDInsight cluster consists of multiple nodes (head nodes, worker nodes, gateway nodes and zookeeper nodes). By default, in a region that supports Availability Zones, the user has no control over which cluster nodes are provisioned in which Availability Zone.
13
13
14
-
With this new availability zone feature, the user can now specify which Availability Zone should host all the nodes of the HDInsight cluster. The cluster nodes are physically separated from another availability zone and are isolated from failures in other Availability Zones in the same region. This deployment model also provides inexpensive, low latency network connectivity within the cluster.
14
+
With this new availability zone feature, the user can now specify which Availability Zone should host all the nodes of the HDInsight cluster. The cluster nodes are physically separated from another availability zone and are isolated from failures in other Availability Zones in the same region. This deployment model also provides inexpensive, low latency network connectivity within the cluster.
15
15
16
16
Replicating this deployment model into multiple Availability Zones can provide a higher level of availability to protect against hardware failure.
17
17
18
-
This article shows you how to create an HDInsight cluster within an Availability Zone and how to use this feature to achieve higher availability.
18
+
This article shows you how to create an HDInsight cluster within an Availability Zone and how to use this feature to achieve higher availability.
19
19
20
20
## Before you begin
21
+
21
22
Availability Zone feature is only supported for clusters created after June 15. Availability zone settings can't be updated after the cluster is created. You also can't update an existing, non-availability zone cluster to use availability zones.
22
23
23
24
## Prerequisites and region availability
25
+
24
26
Prerequisites:
25
27
26
-
- Clusters must be created under a custom VNet.
27
-
- You need to bring your own SQL DB for Ambari DB and external metastore (like Hive metastore) so that you can config these DBs in the same Availability Zone.
28
+
- Clusters must be created under a custom VNet.
29
+
- You need to bring your own SQL DB for Ambari DB and external metastore (like Hive metastore) so that you can config these DBs in the same Availability Zone.
28
30
29
31
HDInsight clusters can currently be created using availability zones in the following regions:
| South Africa North | Brazil South | Australia East | France Central |
36
+
|| Canada Central | Central India | Germany West Central |
37
+
|| Central US | East Asia | Italy North |
38
+
|| East US | Japan East | North Europe |
39
+
|| East US 2 | Korea Central | Norway East |
40
+
|| Mexico Central | New Zealand North | Poland Central |
41
+
|| South Central US | Qatar Central | Spain Central |
42
+
|| US Gov Virginia | Southeast Asia | Sweden Central |
43
+
|| West US 2 | UAE North | Switzerland North |
44
+
|| West US 3 | Israel Central | UK South |
45
+
|||| West Europe |
46
+
48
+
64
49
## Overview of availability zones for HDInsight clusters
65
50
66
51
Availability zones are unique physical locations within a region. Each zone is made up of one or more datacenters equipped with independent power, cooling, and networking. In Azure, a region contains one or more Availability Zones. This physical separation of availability zones within a region protects applications and data from datacenter failures. For more information, see [What are availability zones in Azure](../reliability/availability-zones-overview.md).
67
52
68
-
Azure HDInsight clusters can be configured to deploy within one Availability Zone. All the nodes in this HDInsight cluster including the two head nodes, three zookeeper nodes, two gateway nodes and the worker nodes will be placed in the specified Availability Zone. For example, there are three Availability zones in East US. A HDInsight cluster in East US can be created with all the nodes in Availability zone 1.
53
+
Azure HDInsight clusters can be configured to deploy within one Availability Zone. All the nodes in this HDInsight cluster including the two head nodes, three zookeeper nodes, two gateway nodes and the worker nodes will be placed in the specified Availability Zone. For example, there are three Availability zones in East US. A HDInsight cluster in East US can be created with all the nodes in Availability zone 1.
69
54
70
-
Using Availability zones with HDInsight cluster in this manner can provide both performance and cost benefits:
55
+
Using Availability zones with HDInsight cluster in this manner can provide both performance and cost benefits:
71
56
72
-
- Better performance due to low latency network connectivity
73
-
- Lower cost: data transfer within the same Availability zone is free. Across Availability zone data transfer will incur additional networking cost.
57
+
- Better performance due to low latency network connectivity
58
+
- Lower cost: data transfer within the same Availability zone is free. Across Availability zone data transfer will incur additional networking cost.
74
59
75
-
If your application requires high availability across multiple Availability zones, you can create one primary HDInsight cluster in one Availability zone and create a secondary HDInsight cluster in a different Availability zone with minimum size to save cost. With this design, if one of the other Availability zones goes down, this HDInsight cluster won’t be impacted. If this Availability zone goes down, customers need to switch the secondary clusters in a different Availability zone to the primary, route the workload to this new primary cluster and quickly scale up the cluster size to pick up the data processing.
60
+
If your application requires high availability across multiple Availability zones, you can create one primary HDInsight cluster in one Availability zone and create a secondary HDInsight cluster in a different Availability zone with minimum size to save cost. With this design, if one of the other Availability zones goes down, this HDInsight cluster won’t be impacted. If this Availability zone goes down, customers need to switch the secondary clusters in a different Availability zone to the primary, route the workload to this new primary cluster and quickly scale up the cluster size to pick up the data processing.
76
61
77
62
## Create an HDInsight cluster using availability zone
78
-
You can use Azure Resource Manager (ARM) template to launch an HDInsight cluster into a specified Availability zone.
79
63
80
-
In the resources section, you need to add a section of ‘zones’ and provide which Availability zone you want this cluster to be deployed into.
64
+
You can use Azure Resource Manager (ARM) template to launch an HDInsight cluster into a specified Availability zone.
65
+
66
+
In the resources section, you need to add a section of ‘zones’ and provide which Availability zone you want this cluster to be deployed into.
81
67
82
68
```json
83
-
"resources": [
84
-
{
85
-
"type": "Microsoft.HDInsight/clusters",
86
-
"apiVersion": "2021-06-01",
87
-
"name": "[parameters('cluster name')]",
88
-
"location": "East US 2",
89
-
"zones": [
90
-
"1"
91
-
],
69
+
"resources": [
70
+
{
71
+
"type": "Microsoft.HDInsight/clusters",
72
+
"apiVersion": "2021-06-01",
73
+
"name": "[parameters('cluster name')]",
74
+
"location": "East US 2",
75
+
"zones": [
76
+
"1"
77
+
]
78
+
}
79
+
]
92
80
```
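The fragment above can also be generated programmatically, which helps when the same template is deployed to different zones. The following is a minimal sketch, not part of the article: the helper name `build_cluster_resource` is hypothetical, and the check that a zone is a numeric string is an assumption based on the `"1"` value shown in the template. A deployable cluster resource also needs a `properties` section, which the excerpt omits and this sketch does not attempt to fill in.

```python
import json


def build_cluster_resource(name: str, location: str, zone: str) -> dict:
    """Return the ARM resource fragment shown above, pinned to one zone.

    Hypothetical helper for illustration only.
    """
    # Assumption: zones are identified by numeric strings such as "1".
    if not zone.isdigit():
        raise ValueError(f"zone must be a numeric string, got {zone!r}")
    return {
        "type": "Microsoft.HDInsight/clusters",
        "apiVersion": "2021-06-01",
        "name": name,
        "location": location,
        # All cluster nodes are placed in this single zone.
        "zones": [zone],
    }


fragment = {
    "resources": [
        build_cluster_resource("[parameters('cluster name')]", "East US 2", "1")
    ]
}
template_text = json.dumps(fragment, indent=2)
print(template_text)
```

Serializing with `json.dumps` avoids hand-editing mistakes such as a trailing comma after the `zones` array.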
93
-
81
+
94
82
## Verify nodes within one Availability Zone across zones
83
+
95
84
When the HDInsight cluster is ready, you can check the location to see which availability zone they're deployed in.
96
85
97
86
:::image type="content" source="./media/hdinsight-use-availability-zones/cluster-availability-zone-info.png" alt-text="Screenshot shows the availability zone info in cluster overview." border="true":::
98
87
99
-
**Get API response**:
88
+
**Get API response**:
100
89
101
90
```json
102
-
[
103
-
{
104
-
"location": "East US 2",
105
-
"zones": [
106
-
"1"
107
-
],
91
+
[
92
+
{
93
+
"location": "East US 2",
94
+
"zones": [
95
+
"1"
96
+
]
97
+
}
98
+
]
108
99
```
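If you check the response in a script rather than by eye, a short comparison is enough. The sketch below assumes the abbreviated response shape shown above (a list of objects with `location` and `zones`); a real response carries more fields, and the function name `find_zone_mismatches` is hypothetical.

```python
import json

# Abbreviated sample matching the response shown above.
SAMPLE_RESPONSE = """
[
  {"location": "East US 2", "zones": ["1"]}
]
"""


def find_zone_mismatches(response_text: str, expected_zone: str) -> list:
    """Return the entries whose zones differ from the single expected zone."""
    entries = json.loads(response_text)
    return [e for e in entries if e.get("zones") != [expected_zone]]


mismatches = find_zone_mismatches(SAMPLE_RESPONSE, "1")
if mismatches:
    print("Entries outside the expected zone:", mismatches)
else:
    print("All entries are in the expected zone.")
```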
109
100
110
101
## Scale up the cluster
111
102
112
-
You can scale up an HDInsight cluster with more worker nodes. The newly added worker nodes will be placed in the same Availability zone of this cluster.
103
+
You can scale up an HDInsight cluster with more worker nodes. The newly added worker nodes will be placed in the same Availability zone of this cluster.
113
104
114
105
## Best practices
115
106
116
-
- Regularly back up the configurations in Ambari DB.
117
-
- Implement logic to easily route workload to secondary cluster.
107
+
- Regularly back up the configurations in Ambari DB.
108
+
- Implement logic to easily route workload to secondary cluster.
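One way to picture the routing logic from the second bullet is a small selector that prefers the primary cluster and falls back to the secondary. This is a sketch under stated assumptions, not an HDInsight API: the `Cluster` type, its `healthy` flag, and `pick_active` are all hypothetical, and how you determine health (for example, a probe against the cluster endpoint) is left to your environment.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical record describing one cluster in the pair."""
    name: str
    zone: str
    healthy: bool  # Assumption: set by your own health probe.


def pick_active(primary: Cluster, secondary: Cluster) -> Cluster:
    """Return the cluster that should receive the workload."""
    if primary.healthy:
        return primary
    if secondary.healthy:
        return secondary
    raise RuntimeError("neither cluster is healthy")


primary = Cluster(name="primary", zone="1", healthy=False)
secondary = Cluster(name="secondary", zone="2", healthy=True)
active = pick_active(primary, secondary)
print(f"Route workload to {active.name} in zone {active.zone}")
```

After the workload is routed to the secondary cluster, it still needs to be scaled up to take over the data processing, as described in the overview.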
118
109
119
110
## When AZ goes down, what to expect
120
-
- You can't ssh to this cluster
121
-
- You can't delete or scale up or scale down this cluster
122
-
- You can't submit jobs or see job history
123
-
- You still can submit new cluster creation request in a different region
111
+
112
+
- You can't ssh to this cluster
113
+
- You can't delete or scale up or scale down this cluster
114
+
- You can't submit jobs or see job history
115
+
- You still can submit new cluster creation request in a different region