
Commit 407a06c

Merge pull request #89017 from dagiro/cats138
cats138
2 parents 61d616b + 2efcd16 commit 407a06c


1 file changed: +4 -4 lines changed

articles/hdinsight/hdinsight-troubleshoot-failed-cluster.md

Lines changed: 4 additions & 4 deletions
@@ -76,7 +76,7 @@ Each HDInsight cluster relies on various Azure services, and on open-source soft
 Apache Ambari provides management and monitoring of a HDInsight cluster with a web UI and a REST API.
 Ambari is included on Linux-based HDInsight clusters. Select the **Cluster Dashboard** pane on the Azure portal HDInsight page. Select the **HDInsight cluster dashboard** pane to open the Ambari UI, and enter the cluster login credentials.

-![Ambari UI](./media/hdinsight-troubleshoot-failed-cluster/apache-ambari-overview.png)
+![Apache Ambari dashboard overview](./media/hdinsight-troubleshoot-failed-cluster/apache-ambari-overview.png)

 To open a list of service views, select **Ambari Views** on the Azure portal page. This list depends on which libraries are installed. For example, you may see YARN Queue Manager, Hive View, and Tez View. Select a service link to see configuration and service information.

@@ -124,7 +124,7 @@ curl -u admin:{HTTP PASSWD} https://{CLUSTERNAME}.azurehdinsight.net/templeton/v

 Ambari displays an alert showing the hosts on which the WebHCat service is down. You can try to bring the WebHCat service back up by restarting the service on its host.

-![Restart WebHCat Server](./media/hdinsight-troubleshoot-failed-cluster/restart-webhcat-server.png)
+![Apache Ambari Restart WebHCat Server](./media/hdinsight-troubleshoot-failed-cluster/restart-webhcat-server.png)

 If a WebHCat server still does not come up, then check the operations log for failure messages. For more detailed information, check the `stderr` and `stdout` files referenced on the node.

@@ -173,7 +173,7 @@ At the YARN level, there are two types of timeouts:

 The following image shows the joblauncher queue at 714.4% overused. This is acceptable so long as there is still free capacity in the default queue to borrow from. However, when the cluster is fully utilized and the YARN memory is at 100% capacity, new jobs must wait, which eventually causes timeouts.

-![Joblauncher queue](./media/hdinsight-troubleshoot-failed-cluster/hdi-job-launcher-queue.png)
+![HDInsight Job launcher queue view](./media/hdinsight-troubleshoot-failed-cluster/hdi-job-launcher-queue.png)

 There are two ways to resolve this issue: either reduce the speed of new jobs being submitted, or increase the consumption speed of old jobs by scaling up the cluster.

@@ -205,7 +205,7 @@ To diagnose these issues:

 The Ambari UI **Stack and Version** page provides information about cluster services configuration and service version history. Incorrect Hadoop service library versions can be a cause of cluster failure. In the Ambari UI, select the **Admin** menu and then **Stacks and Versions**. Select the **Versions** tab on the page to see service version information:

-![Stack and Versions](./media/hdinsight-troubleshoot-failed-cluster/ambari-stack-versions.png)
+![Apache Ambari Stack and Versions](./media/hdinsight-troubleshoot-failed-cluster/ambari-stack-versions.png)

 ## Step 5: Examine the log files
