Commit 3b74c29

Merge pull request #112198 from dagiro/freshness_c20
freshness_c20
2 parents d325b53 + f4b5486 commit 3b74c29

8 files changed

+121 -91 lines changed

articles/hdinsight/TOC.yml

Lines changed: 5 additions & 1 deletion
@@ -211,7 +211,11 @@
211211
- name: Monitor cluster performance
212212
href: ./hdinsight-key-scenarios-to-monitor.md
213213
- name: Monitor cluster availability with Ambari and Azure Monitor logs
214-
href: ./hdinsight-cluster-availability.md
214+
href: ./hdinsight-cluster-availability.md
215+
- name: Troubleshoot
216+
items:
217+
- name: Troubleshoot script actions
218+
href: ./troubleshoot-script-action.md
215219
- name: Reference
216220
items:
217221
- name: Azure PowerShell

articles/hdinsight/hdinsight-apps-install-custom-applications.md

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ If an application installation failed, you can see the error messages and debug
114114
115115
* Apache Ambari Web UI: If the install script was the cause of the failure, use Ambari Web UI to check full logs about the install scripts.
116116
117-
For more information, see [Troubleshooting](hdinsight-hadoop-customize-cluster-linux.md#troubleshooting).
117+
For more information, see [Troubleshoot script actions](./troubleshoot-script-action.md).
118118
119119
## Remove HDInsight applications
120120

articles/hdinsight/hdinsight-hadoop-customize-cluster-linux.md

Lines changed: 1 addition & 87 deletions
@@ -327,96 +327,10 @@ Apply a Script Action against a running Linux-based HDInsight cluster](https://g
327327
> [!NOTE]
328328
> This example also demonstrates how to install an HDInsight application by using the .NET SDK.
329329
330-
## Troubleshooting
331-
332-
You can use the Ambari web UI to view information logged by script actions. If the script fails during cluster creation, logs are available in the default cluster storage account. This section provides information on how to retrieve the logs by using both these options.
333-
334-
### The Apache Ambari web UI
335-
336-
1. From a web browser, navigate to `https://CLUSTERNAME.azurehdinsight.net`, where `CLUSTERNAME` is the name of your cluster.
337-
338-
1. From the bar at the top of the page, select the **ops** entry. A list displays current and previous operations done on the cluster through Ambari.
339-
340-
![Ambari web UI bar with ops selected](./media/hdinsight-hadoop-customize-cluster-linux/hdi-apache-ambari-nav.png)
341-
342-
1. Find the entries that have **run\_customscriptaction** in the **Operations** column. These entries are created when the script actions run.
343-
344-
![Apache Ambari script action operations](./media/hdinsight-hadoop-customize-cluster-linux/ambari-script-action.png)
345-
346-
To view the **STDOUT** and **STDERR** output, select the **run\_customscriptaction** entry and drill down through the links. This output is generated when the script runs and might have useful information.
347-
348-
### Access logs from the default storage account
349-
350-
If cluster creation fails because of a script error, the logs are kept in the cluster storage account.
351-
352-
* The storage logs are available at `\STORAGE_ACCOUNT_NAME\DEFAULT_CONTAINER_NAME\custom-scriptaction-logs\CLUSTER_NAME\DATE`.
353-
354-
![Script action logs](./media/hdinsight-hadoop-customize-cluster-linux/script-action-logs-in-storage.png)
355-
356-
Under this directory, the logs are organized separately for **headnode**, **worker node**, and **zookeeper node**. See the following examples:
357-
358-
* **Headnode**: `<ACTIVE-HEADNODE-NAME>.cloudapp.net`
359-
360-
* **Worker node**: `<ACTIVE-WORKERNODE-NAME>.cloudapp.net`
361-
362-
* **Zookeeper node**: `<ACTIVE-ZOOKEEPERNODE-NAME>.cloudapp.net`
363-
364-
* All **stdout** and **stderr** of the corresponding host is uploaded to the storage account. There's one **output-\*.txt** and **errors-\*.txt** for each script action. The **output-*.txt** file contains information about the URI of the script that was run on the host. The following text is an example of this information:
365-
366-
'Start downloading script locally: ', u'https://hdiconfigactions.blob.core.windows.net/linuxrconfigactionv01/r-installer-v01.sh'
367-
368-
* It's possible that you repeatedly create a script action cluster with the same name. In that case, you can distinguish the relevant logs based on the **DATE** folder name. For example, the folder structure for a cluster, **mycluster**, created on different dates appears similar to the following log entries:
369-
370-
`\STORAGE_ACCOUNT_NAME\DEFAULT_CONTAINER_NAME\custom-scriptaction-logs\mycluster\2015-10-04`
371-
`\STORAGE_ACCOUNT_NAME\DEFAULT_CONTAINER_NAME\custom-scriptaction-logs\mycluster\2015-10-05`
372-
373-
* If you create a script action cluster with the same name on the same day, you can use the unique prefix to identify the relevant log files.
374-
375-
* If you create a cluster near 12:00 AM, midnight, it's possible that the log files span across two days. In that case, you see two different date folders for the same cluster.
376-
377-
* Uploading log files to the default container can take up to five minutes, especially for large clusters. So if you want to access the logs, you shouldn't immediately delete the cluster if a script action fails.
378-
379-
### Ambari watchdog
380-
381-
> [!WARNING]
382-
> Don't change the password for the Ambari watchdog, hdinsightwatchdog, on your Linux-based HDInsight cluster. Changing the password for this account breaks the ability to run new script actions on the HDInsight cluster.
383-
384-
### Can't import name BlobService
385-
386-
__Symptoms__. The script action fails. Text similar to the following error displays when you view the operation in Ambari:
387-
388-
```
389-
Traceback (most recent call last):
390-
File "/var/lib/ambari-agent/cache/custom_actions/scripts/run_customscriptaction.py", line 21, in <module>
391-
from azure.storage.blob import BlobService
392-
ImportError: cannot import name BlobService
393-
```
394-
395-
__Cause__. This error occurs if you upgrade the Python Azure Storage client that's included with the HDInsight cluster. HDInsight expects Azure Storage client 0.20.0.
396-
397-
__Resolution__. To resolve this error, manually connect to each cluster node by using `ssh`. Run the following command to reinstall the correct storage client version:
398-
399-
```bash
400-
sudo pip install azure-storage==0.20.0
401-
```
402-
403-
For information on connecting to the cluster with SSH, see [Connect to HDInsight (Apache Hadoop) by using SSH](hdinsight-hadoop-linux-use-ssh-unix.md).
404-
405-
### History doesn't show the scripts used during cluster creation
406-
407-
If your cluster was created before March 15, 2016, you might not see an entry in script action history. Resizing the cluster causes the scripts to appear in script action history.
408-
409-
There are two exceptions:
410-
411-
* Your cluster was created before September 1, 2015. This date is when script actions were introduced. Any cluster created before this date couldn't have used script actions for cluster creation.
412-
413-
* You used multiple script actions during cluster creation. Or you used the same name for multiple scripts or the same name, same URI, but different parameters for multiple scripts. In these cases, you get the following error:
414-
415-
No new script actions can be run on this cluster because of conflicting script names in existing scripts. Script names provided at cluster creation must be all unique. Existing scripts are run on resize.
416-
417330
## Next steps
418331
419332
* [Develop script action scripts for HDInsight](hdinsight-hadoop-script-actions-linux.md)
420333
* [Add additional storage to an HDInsight cluster](hdinsight-hadoop-add-storage.md)
334+
* [Troubleshoot script actions](troubleshoot-script-action.md)
421335
422336
[img-hdi-cluster-states]: ./media/hdinsight-hadoop-customize-cluster-linux/cluster-provisioning-states.png "Stages during cluster creation"

articles/hdinsight/hdinsight-hadoop-script-actions-linux.md

Lines changed: 2 additions & 2 deletions
@@ -154,7 +154,7 @@ In this example, the `hdfs` command transparently uses the default cluster stora
154154
HDInsight logs script output that is written to STDOUT and STDERR. You can view this information using the Ambari web UI.
155155

156156
> [!NOTE]
157-
> Apache Ambari is only available if the cluster is successfully created. If you use a script action during cluster creation, and creation fails, see the troubleshooting section [Customize HDInsight clusters using script action](hdinsight-hadoop-customize-cluster-linux.md#troubleshooting) for other ways of accessing logged information.
157+
> Apache Ambari is only available if the cluster is successfully created. If you use a script action during cluster creation, and creation fails, see [Troubleshoot script actions](./troubleshoot-script-action.md) for other ways of accessing logged information.
158158
159159
Most utilities and installation packages already write information to STDOUT and STDERR, however you may want to add additional logging. To send text to STDOUT, use `echo`. For example:
160160

@@ -170,7 +170,7 @@ By default, `echo` sends the string to STDOUT. To direct it to STDERR, add `>&2`
170170

171171
This redirects information written to STDOUT to STDERR (2) instead. For more information on IO redirection, see [https://www.tldp.org/LDP/abs/html/io-redirection.html](https://www.tldp.org/LDP/abs/html/io-redirection.html).
172172

173-
For more information on viewing information logged by script actions, see [Customize HDInsight clusters using script action](hdinsight-hadoop-customize-cluster-linux.md#troubleshooting)
173+
For more information on viewing information logged by script actions, see [Troubleshoot script actions](./troubleshoot-script-action.md).
174174

175175
### <a name="bps8"></a> Save files as ASCII with LF line endings
176176

articles/hdinsight/troubleshoot-script-action.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
1+
---
2+
title: Troubleshoot script actions in Azure HDInsight
3+
description: General troubleshooting steps for script actions in Azure HDInsight.
4+
author: hrasheed-msft
5+
ms.author: hrasheed
6+
ms.reviewer: jasonh
7+
ms.service: hdinsight
8+
ms.topic: troubleshooting
9+
ms.date: 04/21/2020
10+
---
11+
12+
# Troubleshoot script actions in Azure HDInsight
13+
14+
This article describes troubleshooting steps and possible resolutions for issues with script actions on Azure HDInsight clusters.
15+
16+
## Viewing logs
17+
18+
You can use the Apache Ambari web UI to view information logged by script actions. If the script fails during cluster creation, logs are in the default cluster storage account. This section provides information on how to retrieve the logs by using both these options.
19+
20+
### Apache Ambari web UI
21+
22+
1. From a web browser, navigate to `https://CLUSTERNAME.azurehdinsight.net`, where `CLUSTERNAME` is the name of your cluster.
23+
24+
1. From the bar at the top of the page, select the **ops** entry. A list displays current and previous operations done on the cluster through Ambari.
25+
26+
![Ambari web UI bar with ops selected](./media/troubleshoot-script-action/hdi-apache-ambari-nav.png)
27+
28+
1. Find the entries that have **run\_customscriptaction** in the **Operations** column. These entries are created when the script actions run.
29+
30+
![Apache Ambari script action operations](./media/troubleshoot-script-action/ambari-script-action.png)
31+
32+
To view the **STDOUT** and **STDERR** output, select the **run\_customscriptaction** entry and drill down through the links. This output is generated when the script runs and might have useful information.
33+
34+
### Default storage account
35+
36+
If cluster creation fails because of a script error, the logs are kept in the cluster storage account.
37+
38+
* The storage logs are available at `\STORAGE_ACCOUNT_NAME\DEFAULT_CONTAINER_NAME\custom-scriptaction-logs\CLUSTER_NAME\DATE`.
39+
40+
![Script action logs](./media/troubleshoot-script-action/script-action-logs-in-storage.png)
41+
42+
Under this directory, the logs are organized separately for **headnode**, **worker node**, and **zookeeper node**. See the following examples:
43+
44+
* **Headnode**: `<ACTIVE-HEADNODE-NAME>.cloudapp.net`
45+
46+
* **Worker node**: `<ACTIVE-WORKERNODE-NAME>.cloudapp.net`
47+
48+
* **Zookeeper node**: `<ACTIVE-ZOOKEEPERNODE-NAME>.cloudapp.net`
49+
50+
* All **stdout** and **stderr** output of the corresponding host is uploaded to the storage account. There's one **output-\*.txt** file and one **errors-\*.txt** file for each script action. The **output-\*.txt** file contains the URI of the script that was run on the host. The following text is an example of this information:
51+
52+
'Start downloading script locally: ', u'https://hdiconfigactions.blob.core.windows.net/linuxrconfigactionv01/r-installer-v01.sh'
53+
54+
* You might repeatedly create a script action cluster with the same name. In that case, you can distinguish the relevant logs by the **DATE** folder name. For example, the folder structure for a cluster named **mycluster** that was created on different dates looks similar to the following paths:
55+
56+
`\STORAGE_ACCOUNT_NAME\DEFAULT_CONTAINER_NAME\custom-scriptaction-logs\mycluster\2015-10-04`
57+
`\STORAGE_ACCOUNT_NAME\DEFAULT_CONTAINER_NAME\custom-scriptaction-logs\mycluster\2015-10-05`
58+
59+
* If you create a script action cluster with the same name on the same day, you can use the unique prefix to identify the relevant log files.
60+
61+
* If you create a cluster near 12:00 AM, midnight, it's possible that the log files span across two days. In that case, you see two different date folders for the same cluster.
62+
63+
* Uploading log files to the default container can take up to five minutes, especially for large clusters. So if you want to access the logs, you shouldn't immediately delete the cluster if a script action fails.
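
The folder layout described above can be sketched as a small shell helper. This is only an illustration: `log_prefixes` is a hypothetical function name, and the forward-slash form of the path is an assumption about how the blob names appear inside the default container. Passing two dates covers the case where a cluster created near midnight has logs in two date folders.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of HDInsight): print the blob prefixes
# under which script action logs for a cluster are expected.
# Arguments: the cluster name, then one or more dates in YYYY-MM-DD form.
log_prefixes() {
    local cluster="$1"
    shift
    local day
    for day in "$@"; do
        # Assumed layout: custom-scriptaction-logs/CLUSTER_NAME/DATE/
        printf 'custom-scriptaction-logs/%s/%s/\n' "$cluster" "$day"
    done
}

# A cluster created near midnight may have logs in two date folders.
log_prefixes mycluster 2015-10-04 2015-10-05
```

You could pass each printed prefix to whatever blob listing tool you use to find the **output-\*.txt** and **errors-\*.txt** files for that cluster and date.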
64+
65+
## Ambari watchdog
66+
67+
Don't change the password for the Ambari watchdog, hdinsightwatchdog, on your Linux-based HDInsight cluster. A password change breaks the ability to run new script actions on the HDInsight cluster.
68+
69+
## Can't import name BlobService
70+
71+
__Symptoms__. The script action fails. Text similar to the following error displays when you view the operation in Ambari:
72+
73+
```
74+
Traceback (most recent call last):
75+
File "/var/lib/ambari-agent/cache/custom_actions/scripts/run_customscriptaction.py", line 21, in <module>
76+
from azure.storage.blob import BlobService
77+
ImportError: cannot import name BlobService
78+
```
79+
80+
__Cause__. This error occurs if you upgrade the Python Azure Storage client that's included with the HDInsight cluster. HDInsight expects Azure Storage client 0.20.0.
81+
82+
__Resolution__. To resolve this error, manually connect to each cluster node by using `ssh`. Run the following command to reinstall the correct storage client version:
83+
84+
```bash
85+
sudo pip install azure-storage==0.20.0
86+
```
87+
88+
For information on connecting to the cluster with SSH, see [Connect to HDInsight (Apache Hadoop) by using SSH](hdinsight-hadoop-linux-use-ssh-unix.md).
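
Before you reinstall, you might want to confirm which storage client version a node currently has. The following is a minimal sketch, assuming `pip` is on the path for the node's Python environment. `check_storage_client` and `EXPECTED_VERSION` are hypothetical names used only for this illustration.

```bash
#!/usr/bin/env bash
# Version that HDInsight expects, as stated in this article.
EXPECTED_VERSION="0.20.0"

# Hypothetical helper: compare an installed version string with the expected one.
# Returns 0 on a match, 1 otherwise (the mismatch message goes to STDERR).
check_storage_client() {
    local installed="$1"
    if [ "$installed" = "$EXPECTED_VERSION" ]; then
        echo "ok: azure-storage $installed"
    else
        echo "mismatch: found ${installed:-none}, expected $EXPECTED_VERSION" >&2
        return 1
    fi
}

# Read the installed version from pip; the result is empty if the package
# (or pip itself) isn't found.
installed="$(pip show azure-storage 2>/dev/null | awk '/^Version:/ {print $2}')"
check_storage_client "$installed" || true
```

A mismatch on any node suggests that the node needs the reinstall command shown earlier.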
89+
90+
## History doesn't show the scripts used during cluster creation
91+
92+
If your cluster was created before March 15, 2016, you might not see an entry in script action history. Resizing the cluster causes the scripts to appear in script action history.
93+
94+
There are two exceptions:
95+
96+
* Your cluster was created before September 1, 2015. This date is when script actions were introduced. Any cluster created before this date couldn't have used script actions for cluster creation.
97+
98+
* You used multiple script actions during cluster creation. Or you used the same name for multiple scripts or the same name, same URI, but different parameters for multiple scripts. In these cases, you get the following error:
99+
100+
```
101+
No new script actions can be run on this cluster because of conflicting script names in existing scripts. Script names provided at cluster creation must be all unique. Existing scripts are run on resize.
102+
```
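
To avoid this error, you can check a list of script action names for duplicates before you create the cluster. The following sketch only compares names exactly as given. `duplicate_names` is a hypothetical helper, and the check that HDInsight itself performs may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every script action name that appears
# more than once in the arguments, one name per line.
duplicate_names() {
    printf '%s\n' "$@" | sort | uniq -d
}

# Example names are placeholders for the names you plan to use.
duplicate_names install-r install-hue install-r
```

If the function prints nothing, the names you passed are all unique.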
103+
104+
## Next steps
105+
106+
If you didn't see your problem or are unable to solve your issue, visit one of the following channels for more support:
107+
108+
* Get answers from Azure experts through [Azure Community Support](https://azure.microsoft.com/support/community/).
109+
110+
* Connect with [@AzureSupport](https://twitter.com/azuresupport) - the official Microsoft Azure account for improving customer experience. Connecting the Azure community to the right resources: answers, support, and experts.
111+
112+
* If you need more help, you can submit a support request from the [Azure portal](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade/). Select **Support** from the menu bar or open the **Help + support** hub. For more detailed information, review [How to create an Azure support request](https://docs.microsoft.com/azure/azure-portal/supportability/how-to-create-azure-support-request). Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the [Azure Support Plans](https://azure.microsoft.com/support/plans/).
