Commit f38a8e0

Update hdinsight-hadoop-architecture.md
1 parent 17bae5c commit f38a8e0

1 file changed: +3 -3 lines changed

articles/hdinsight/hdinsight-hadoop-architecture.md

Lines changed: 3 additions & 3 deletions
@@ -12,7 +12,7 @@ ms.date: 01/19/2018
 ---
 # Hadoop architecture in HDInsight

-Hadoop includes two core components, the Hadoop Distributed File System (HDFS) that provides storage, and Yet Another Resource Negotiator (YARN) that provides processing. With storage and processing capabilities a cluster becomes capable of running MapReduce programs to perform the desired data processing.
+Hadoop includes two core components: the Hadoop Distributed File System (HDFS) that provides storage, and Yet Another Resource Negotiator (YARN) that provides processing. With storage and processing capabilities, a cluster becomes capable of running MapReduce programs to perform the desired data processing.

 > [!NOTE]
 > An HDFS is not typically deployed within the HDInsight cluster to provide storage. Instead, an HDFS-compatible interface layer is used by Hadoop components. The actual storage capability is provided by either Azure Storage or Azure Data Lake Store. For Hadoop, MapReduce jobs executing on the HDInsight cluster run as if an HDFS were present and so require no changes to support their storage needs. In Hadoop on HDInsight, storage is outsourced, but YARN processing remains a core component. For more information, see [Introduction to Azure HDInsight](hadoop/apache-hadoop-introduction.md).
@@ -26,7 +26,7 @@ YARN governs and orchestrates data processing in Hadoop. YARN has two core serv
 * ResourceManager
 * NodeManager

-The ResourceManager grants cluster compute resources to applications like MapReduce jobs. The ResourceManager grants these resources as containers, where each container consists of an allocation of CPU cores and RAM memory. If you combined all the resources available in a cluster and then distributed them in blocks of a given number of cores and memory, each block of resources is a container. Each node in the cluster has a capacity for a certain number of containers, and therefore the cluster has a fixed limit on the number of containers available. The allotment of resources in a container is configurable.
+The ResourceManager grants cluster compute resources to applications like MapReduce jobs. The ResourceManager grants these resources as containers, where each container consists of an allocation of CPU cores and RAM memory. If you combined all the resources available in a cluster and then distributed them in blocks of a given number of cores and memory, each block of resources is a container. Each node in the cluster has a capacity for a certain number of containers, therefore the cluster has a fixed limit on the number of containers available. The allotment of resources in a container is configurable.

 When a MapReduce application runs on a cluster, the ResourceManager provides the application the containers in which to execute. The ResourceManager tracks the status of running applications, available cluster capacity, and tracks applications as they complete and release their resources.

@@ -38,7 +38,7 @@ The NodeManagers run the tasks that make up the application, then report their p

 ## YARN on HDInsight

-All HDInsight cluster types deploy YARN. The ResourceManager is deployed for high availability with a primary and secondary instance, which run on the first and second head nodes within the cluster respectively. Only the one instance of the ResourceManager is active at a time. The NodeManager instances run across the available worker nodes in the cluster.
+All HDInsight cluster types deploy YARN. The ResourceManager is deployed for high availability with a primary and secondary instance, which runs on the first and second head nodes within the cluster respectively. Only the one instance of the ResourceManager is active at a time. The NodeManager instances run across the available worker nodes in the cluster.

 ![YARN on HDInsight](./media/hdinsight-hadoop-architecture/yarn-on-hdinsight.png)
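
The container paragraph touched by the second hunk states that each node can hold a certain number of containers, so the cluster has a fixed container limit. As a rough illustration of that arithmetic, here is a minimal Python sketch. It is not YARN's scheduler: the names `Resources`, `containers_per_node`, and `cluster_container_limit` are hypothetical, the node and container sizes are made-up values, and real clusters also reserve resources for the operating system and apply scheduler-specific rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A bundle of CPU cores and memory (hypothetical helper type)."""
    vcores: int
    memory_mb: int


def containers_per_node(node: Resources, container: Resources) -> int:
    """How many containers of the given size fit on one node.

    The scarcer resource (cores or memory) sets the limit.
    """
    if container.vcores <= 0 or container.memory_mb <= 0:
        raise ValueError("container size must be positive")
    by_cores = node.vcores // container.vcores
    by_memory = node.memory_mb // container.memory_mb
    return min(by_cores, by_memory)


def cluster_container_limit(nodes: list[Resources], container: Resources) -> int:
    """Fixed upper bound on containers across all worker nodes."""
    return sum(containers_per_node(node, container) for node in nodes)


# Assumed example values: four identical workers with 8 cores and 28 GB each,
# and a container size of 1 core and 3 GB.
workers = [Resources(vcores=8, memory_mb=28672)] * 4
container = Resources(vcores=1, memory_mb=3072)

print(containers_per_node(workers[0], container))   # 8 (cores are the limit)
print(cluster_container_limit(workers, container))  # 32
```

Changing the container size changes the limit, which mirrors the statement that the allotment of resources in a container is configurable. For example, doubling the cores per container to 2 in this sketch halves the per-node count to 4.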
