
Commit 3b60a1a

Merge pull request #93128 from hrasheed-msft/HA1

HDInsight high availability components

2 parents 0dc62cf + 22149d2

File tree

5 files changed: +133 additions, 0 deletions

articles/hdinsight/TOC.yml (2 additions, 0 deletions)

```diff
@@ -35,6 +35,8 @@
   items:
   - name: Apache Hadoop components on HDInsight
     href: ./hdinsight-component-versioning.md
+  - name: High availability components
+    href: ./hdinsight-high-availability-components.md
   - name: Machine learning in HDInsight
     href: ./hdinsight-machine-learning-overview.md
   - name: Streaming at scale in HDInsight
```
articles/hdinsight/hdinsight-high-availability-components.md (131 additions, 0 deletions)
---
title: High availability components in Azure HDInsight
description: Overview of the various high availability components used by HDInsight clusters.
author: hrasheed-msft
ms.author: hrasheed
ms.reviewer: jasonh
ms.service: hdinsight
ms.topic: conceptual
ms.date: 11/11/2019
---
# High availability services supported by Azure HDInsight

To provide you with optimal availability for your analytics components, HDInsight was developed with a unique architecture for ensuring high availability (HA) of critical services. Some components of this architecture were developed by Microsoft to provide automatic failover. Other components are standard Apache components that are deployed to support specific services. This article explains the architecture of the HA service model in HDInsight, how HDInsight supports failover for HA services, and best practices to recover from other service interruptions.
## High availability infrastructure

HDInsight provides customized infrastructure to ensure that the following four primary services are highly available with automatic failover capabilities:

- Apache Ambari server
- Application Timeline Server for Apache YARN
- Job History Server for Hadoop MapReduce
- Apache Livy
This infrastructure consists of a number of services and software components, some of which are designed by Microsoft. The following components are unique to the HDInsight platform:

- Slave failover controller
- Master failover controller
- Slave high availability service
- Master high availability service

![High availability infrastructure](./media/hdinsight-high-availability-components/high-availability-architecture.png)
There are also other high availability services, which are supported by open-source Apache reliability components. These components are also present on HDInsight clusters:

- Hadoop Distributed File System (HDFS) NameNode
- YARN ResourceManager
- HBase Master

The following sections provide more detail about how these services work together.
## HDInsight high availability services

Microsoft provides support for the four Apache services in the following table in HDInsight clusters. To distinguish them from the high availability services supported by Apache components, they're called *HDInsight HA services*.

| Service | Cluster nodes | Cluster types | Purpose |
|---|---|---|---|
| Apache Ambari server | Active headnode | All | Monitors and manages the cluster. |
| Application Timeline Server for Apache YARN | Active headnode | All except Kafka | Maintains debugging information about YARN jobs running on the cluster. |
| Job History Server for Hadoop MapReduce | Active headnode | All except Kafka | Maintains debugging data for MapReduce jobs. |
| Apache Livy | Active headnode | Spark | Enables easy interaction with a Spark cluster over a REST interface. |

> [!NOTE]
> HDInsight Enterprise Security Package (ESP) clusters currently provide only Ambari server high availability.
### Architecture

Each HDInsight cluster has two headnodes, one in active mode and the other in standby mode. The HDInsight HA services run on headnodes only. These services should always be running on the active headnode, and stopped and put in maintenance mode on the standby headnode.

To maintain the correct states of HA services and provide fast failover, HDInsight uses Apache ZooKeeper, a coordination service for distributed applications, to conduct active headnode election. HDInsight also provisions a few background Java processes that coordinate the failover procedure for HDInsight HA services: the master failover controller, the slave failover controller, the *master-ha-service*, and the *slave-ha-service*.
### Apache ZooKeeper

Apache ZooKeeper is a high-performance coordination service for distributed applications. In production, ZooKeeper usually runs in replicated mode, where a replicated group of ZooKeeper servers forms a quorum. Each HDInsight cluster has three ZooKeeper nodes that allow three ZooKeeper servers to form a quorum. HDInsight has two ZooKeeper quorums running in parallel with each other. One quorum decides the active headnode in a cluster on which HDInsight HA services should run. The other quorum coordinates the HA services provided by Apache, as detailed in later sections.
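Active headnode election follows the common ZooKeeper leader-election pattern: each candidate creates an ephemeral sequential znode, and the candidate holding the lowest sequence number is the leader; when a session dies, its znode disappears and the next-lowest candidate takes over. The following is a minimal stdlib-Python simulation of that rule only; it does not use a real ZooKeeper quorum, and all names and paths are illustrative, not HDInsight internals:

```python
# Illustrative simulation of ZooKeeper-style leader election.
# Rule modeled: lowest ephemeral sequential znode wins; an expired
# session removes its znode, promoting the next candidate.

class ElectionSim:
    def __init__(self):
        self._seq = 0
        self.candidates = {}  # znode path -> candidate name

    def volunteer(self, name):
        """Candidate creates an ephemeral sequential znode."""
        path = f"/hdinsight/election/n_{self._seq:010d}"
        self._seq += 1
        self.candidates[path] = name
        return path

    def leader(self):
        """The candidate holding the lowest sequence number leads."""
        if not self.candidates:
            return None
        return self.candidates[min(self.candidates)]

    def session_expired(self, path):
        """An ephemeral znode vanishes when its owner's session dies."""
        self.candidates.pop(path, None)

election = ElectionSim()
hn0 = election.volunteer("headnode-0")
hn1 = election.volunteer("headnode-1")
assert election.leader() == "headnode-0"   # lowest sequence number wins

election.session_expired(hn0)              # active headnode fails
assert election.leader() == "headnode-1"   # standby takes over
```

The ephemeral property is what makes failover automatic: no explicit "resign" step is needed, because losing the session is itself the signal.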
### Slave failover controller

The slave failover controller runs on every node in an HDInsight cluster. This controller is responsible for starting the Ambari agent and the *slave-ha-service* on each node. It periodically queries the first ZooKeeper quorum about the active headnode. When the active and standby headnodes change, the slave failover controller does the following:

1. Updates the host configuration file.
1. Restarts the Ambari agent.

The *slave-ha-service* is responsible for stopping the HDInsight HA services (except Ambari server) on the standby headnode.
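The two steps above amount to a small reconciliation loop: compare the active headnode reported by ZooKeeper with the one last seen, and act only on a change. A hypothetical sketch (function and host names are invented for illustration; the real controller is a Java daemon):

```python
# Hypothetical sketch of a slave failover controller's reconciliation
# step. Names are illustrative, not HDInsight's actual implementation.

def reconcile(hosts, observed_active, known_active, restart_agent):
    """If the active headnode changed, rewrite the host mapping and
    restart the Ambari agent; otherwise do nothing."""
    if observed_active == known_active:
        return known_active, False
    # Step 1: update the host configuration file (modeled as a dict).
    hosts["headnodehost"] = observed_active
    # Step 2: restart the Ambari agent.
    restart_agent()
    return observed_active, True

restarts = []
hosts = {"headnodehost": "hn0"}

# No change observed: nothing happens.
active, changed = reconcile(hosts, "hn0", "hn0", lambda: restarts.append(1))
assert not changed and not restarts

# Failover to hn1: host file updated, agent restarted once.
active, changed = reconcile(hosts, "hn1", active, lambda: restarts.append(1))
assert changed and hosts["headnodehost"] == "hn1" and len(restarts) == 1
```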
### Master failover controller

A master failover controller runs on both headnodes. Both master failover controllers communicate with the first ZooKeeper quorum to nominate the headnode that they're running on as the active headnode.

For example, if the master failover controller on headnode 0 wins the election, the following changes take place:

1. Headnode 0 becomes active.
1. The master failover controller starts the Ambari server and the *master-ha-service* on headnode 0.
1. The other master failover controller stops the Ambari server and the *master-ha-service* on headnode 1.

The *master-ha-service* runs only on the active headnode. It stops the HDInsight HA services (except Ambari server) on the standby headnode and starts them on the active headnode.
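The win/lose behavior described above can be sketched as a tiny state machine: on an election result, the winner ends up running every HDInsight HA service and the loser runs none. The service list mirrors the table earlier in this article; everything else here is a hypothetical model, not HDInsight code:

```python
# Hypothetical model of headnode behavior on an election result.
# HA_SERVICES mirrors the HDInsight HA services table in this article.

HA_SERVICES = [
    "Apache Ambari server",
    "Application Timeline Server",
    "Job History Server",
    "Apache Livy",
]

class Headnode:
    def __init__(self, name):
        self.name = name
        self.running = set()

    def on_election_result(self, winner):
        if winner == self.name:
            # Won: the master failover controller starts Ambari server
            # and master-ha-service; master-ha-service starts the rest.
            self.running = set(HA_SERVICES)
        else:
            # Lost: all HDInsight HA services stop on the standby node.
            self.running = set()

hn0, hn1 = Headnode("hn0"), Headnode("hn1")
for node in (hn0, hn1):
    node.on_election_result("hn0")

assert hn0.running == set(HA_SERVICES)  # active headnode runs everything
assert hn1.running == set()             # standby headnode runs nothing
```

The invariant this enforces is the one the article keeps repeating: each HDInsight HA service has exactly one running instance, always on the active headnode.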
### The failover process

![Failover process](./media/hdinsight-high-availability-components/failover-steps.png)

A health monitor runs on each headnode along with the master failover controller to send heartbeat notifications to the ZooKeeper quorum. The headnode is regarded as an HA service in this scenario. The health monitor checks whether each high availability service is healthy and ready to join the leadership election. If so, this headnode competes in the election. If not, it quits the election until it becomes ready again.

If the standby headnode achieves leadership and becomes active (such as when the previous active node fails), its master failover controller starts all HDInsight HA services on it. The master failover controller also stops these services on the other headnode.

For HDInsight HA service failures, such as a service being down or unhealthy, the master failover controller automatically restarts or stops the services according to the headnode status. Don't manually start HDInsight HA services on both headnodes. Instead, allow automatic or manual failover to help the service recover.
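The health-gated participation described above can be modeled simply: a headnode competes in the election only while its health monitor reports every monitored service healthy, and rejoins once it recovers. A hypothetical sketch (the election itself is reduced to "first eligible node wins"; none of these names are HDInsight APIs):

```python
# Hypothetical model: a headnode joins the leadership election only
# while its health monitor reports every HA service healthy.

def eligible(service_health):
    """service_health: dict of service name -> bool (healthy?)."""
    return all(service_health.values())

def elect(nodes):
    """Stand-in for the ZooKeeper election: first eligible node wins."""
    for name, health in nodes:
        if eligible(health):
            return name
    return None

nodes = [
    ("hn0", {"Ambari server": True, "Livy": False}),  # unhealthy: quits
    ("hn1", {"Ambari server": True, "Livy": True}),   # healthy: competes
]
assert elect(nodes) == "hn1"

nodes[0][1]["Livy"] = True  # hn0 recovers and rejoins the election
assert elect(nodes) == "hn0"
```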
### Inadvertent manual intervention

HDInsight HA services should run only on the active headnode, and they're automatically restarted when necessary. Since individual HA services don't have their own health monitors, failover can't be triggered at the level of an individual service. Failover is ensured at the node level, not the service level.
### Some known issues

- When you manually start an HA service on the standby headnode, it won't stop until the next failover happens. When HA services are running on both headnodes, potential problems include: the Ambari UI is inaccessible, Ambari throws errors, and YARN, Spark, and Oozie jobs may get stuck.

- When an HA service on the active headnode stops, it won't restart until the next failover happens or the master failover controller/*master-ha-service* restarts. When one or more HA services stop on the active headnode, especially when the Ambari server stops, the Ambari UI is inaccessible. Other potential problems include YARN, Spark, and Oozie job failures.
## Apache high availability services

Apache provides high availability for the HDFS NameNode, YARN ResourceManager, and HBase Master, and these capabilities are also available in HDInsight clusters. Unlike HDInsight HA services, they're supported in ESP clusters. Apache HA services communicate with the second ZooKeeper quorum (described in the previous section) to elect active/standby states and conduct automatic failover. The following sections detail how these services work.
### Hadoop Distributed File System (HDFS) NameNode

HDInsight clusters based on Apache Hadoop 2.0 or higher provide NameNode high availability. Two NameNodes run on the headnodes and are configured for automatic failover. The NameNodes use the *ZKFailoverController* to communicate with ZooKeeper and elect active/standby status. The *ZKFailoverController* runs on both headnodes and works in the same way as the master failover controller described earlier.

The second ZooKeeper quorum is independent of the first quorum, so the active NameNode may not run on the active headnode. When the active NameNode is dead or unhealthy, the standby NameNode wins the election and becomes active.
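Because the two quorums are independent, their elections can disagree: the active headnode and the active NameNode may land on different machines. A tiny illustration of that independence (hypothetical; each quorum is reduced to an independent "first live candidate wins" pick):

```python
# Illustrative only: two independent quorums run two independent
# elections, so the active headnode and the active NameNode need
# not coincide on the same machine.

def pick_leader(candidates):
    """Stand-in for one quorum's election: first live candidate wins."""
    return next((name for name, alive in candidates if alive), None)

# Quorum 1: headnode election (both controllers healthy, hn0 wins).
active_headnode = pick_leader([("hn0", True), ("hn1", True)])

# Quorum 2: NameNode election (the NameNode on hn0 is unhealthy).
active_namenode = pick_leader([("nn-on-hn0", False), ("nn-on-hn1", True)])

assert active_headnode == "hn0"
assert active_namenode == "nn-on-hn1"  # active NameNode on the standby headnode
```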
### YARN ResourceManager

HDInsight clusters based on Apache Hadoop 2.4 or higher support YARN ResourceManager high availability. Two ResourceManagers, rm1 and rm2, run on headnode 0 and headnode 1, respectively. Like the NameNode, YARN ResourceManager is also configured for automatic failover. Another ResourceManager is automatically elected to be active when the currently active one goes down or becomes unresponsive.

YARN ResourceManager uses its embedded *ActiveStandbyElector* as a failure detector and leader elector. Unlike HDFS NameNode, YARN ResourceManager doesn't need a separate ZKFC daemon. The active ResourceManager writes its state into Apache ZooKeeper.

The high availability of the YARN ResourceManager is independent from the NameNode and other HDInsight HA services. The active ResourceManager may not run on the active headnode or on the headnode where the active NameNode is running. For more information about YARN ResourceManager high availability, see [ResourceManager High Availability](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html).
### HBase Master

HDInsight HBase clusters support HBase Master high availability. Unlike the other HA services, which run on headnodes, HBase Masters run on the three ZooKeeper nodes, where one of them is the active master and the other two are standby. Like the NameNode, HBase Master coordinates with Apache ZooKeeper for leader election and does automatic failover when the currently active master has problems. There is only one active HBase Master at any time.
## Next steps

- [Availability and reliability of Apache Hadoop clusters in HDInsight](hdinsight-high-availability-linux.md)
- [Azure HDInsight virtual network architecture](hdinsight-virtual-network-architecture.md)
Three binary image files added (14.7 KB, 259 KB, and 13.1 KB).
