Commit f37d07b (1 parent 88215cf)

File changed: articles/hdinsight/spark/apache-spark-intellij-tool-failure-debug.md (33 additions, 25 deletions)
@@ -2,19 +2,21 @@
title: 'Failure spark job debugging with Azure Toolkit for IntelliJ (preview)'
description: Guidance using HDInsight Tools in Azure Toolkit for IntelliJ to debug applications
keywords: debug remotely intellij, remote debugging intellij, ssh, intellij, hdinsight, debug intellij, debugging
author: hrasheed-msft
ms.author: hrasheed
ms.reviewer: jasonh
ms.service: hdinsight
ms.custom: hdinsightactive,hdiseo17may2017
ms.topic: conceptual
ms.date: 07/12/2019
---

# Failure spark job debugging with Azure Toolkit for IntelliJ (preview)

This article provides step-by-step guidance on how to use HDInsight Tools in [Azure Toolkit for IntelliJ](https://docs.microsoft.com/java/azure/intellij/azure-toolkit-for-intellij?view=azure-java-stable) to run **Spark Failure Debug** applications.

## Prerequisites

* [Oracle Java Development kit](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html). This tutorial uses Java version 8.0.202.

* IntelliJ IDEA. This article uses [IntelliJ IDEA Community ver. 2019.1.3](https://www.jetbrains.com/idea/download/#section=windows).
@@ -25,23 +27,23 @@ This article provides step-by-step guidance on how to use HDInsight Tools in [Az
* Microsoft Azure Storage Explorer. See [Download Microsoft Azure Storage Explorer](https://azure.microsoft.com/features/storage-explorer/).

## Create a project with debugging template

Create a Spark 2.3.2 project to continue with failure debugging, and take the failure task debugging sample file provided in this document.

1. Open IntelliJ IDEA. Open the **New Project** window.

   a. Select **Azure Spark/HDInsight** from the left pane.
   b. Select **Spark Project with Failure Task Debugging Sample(Preview)(Scala)** from the main window.

   ![Intellij Create a debug project](./media/apache-spark-intellij-tool-failure-debug/hdinsight-create-projectfor-failure-debug.png)

   c. Select **Next**.

2. In the **New Project** window, do the following steps:

   ![Intellij New Project select Spark version](./media/apache-spark-intellij-tool-failure-debug/hdinsight-create-new-project.png)

   a. Enter a project name and project location.

@@ -59,72 +61,77 @@ Create a spark Scala​/Java application, then run the application on a Spark cl
1. Click **Add Configuration** to open the **Run/Debug Configurations** window.

   ![HDI Intellij Add configuration](./media/apache-spark-intellij-tool-failure-debug/hdinsight-add-new-configuration.png)

2. In the **Run/Debug Configurations** dialog box, select the plus sign (**+**). Then select the **Apache Spark on HDInsight** option.

   ![Intellij Add new configuration](./media/apache-spark-intellij-tool-failure-debug/hdinsight-create-new-configuraion-01.png)

3. Switch to the **Remotely Run in Cluster** tab. Enter the **Name**, **Spark cluster**, and **Main class name**. The tools support debugging with **Executors**. The default value of **numExecutors** is 5, and you should not set it higher than 3. To reduce the run time, you can add **spark.yarn.maxAppAttempts** to **Job Configurations** and set the value to 1. Click the **OK** button to save the configuration.

   ![Intellij Run debug configurations new](./media/apache-spark-intellij-tool-failure-debug/hdinsight-create-new-configuraion-002.png)
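For reference only, the same two knobs map onto a command-line submission roughly like the following sketch. The jar name and main class are placeholders, not values from this article, and `--num-executors` mirrors the dialog's **numExecutors** field:

```
# Hypothetical spark-submit equivalent of the settings above;
# com.example.MainClass and myapp.jar are placeholders.
spark-submit \
  --class com.example.MainClass \
  --num-executors 3 \
  --conf spark.yarn.maxAppAttempts=1 \
  myapp.jar
```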

4. The configuration is now saved with the name you provided. To view the configuration details, select the configuration name. To make changes, select **Edit Configurations**.

5. After you complete the configuration settings, you can run the project against the remote cluster.

   ![Intellij Debug Remote Spark Job Remote run button](./media/apache-spark-intellij-tool-failure-debug/hdinsight-local-run-configuration.png)

6. You can check the application ID from the output window.

   ![Intellij Debug Remote Spark Job Remote run result](./media/apache-spark-intellij-tool-failure-debug/hdinsight-remotely-run-result.png)

## Download failed job profile

If the job submission fails, you can download the failed job profile to the local machine for further debugging.

1. Open **Microsoft Azure Storage Explorer**, locate the HDInsight account of the cluster for the failed job, and download the failed job resources from the corresponding location, **\hdp\spark2-events\\.spark-failures\\\<application ID>**, to a local folder. The **Activities** window shows the download progress.

   ![Azure Storage Explorer download failure](./media/apache-spark-intellij-tool-failure-debug/hdinsight-find-spark-file-001.png)

   ![Azure Storage Explorer download success](./media/apache-spark-intellij-tool-failure-debug/spark-on-cosmos-doenload-file-2.png)
## Configure local debugging environment and debug on failure

1. Open the original project, or create a new project and associate it with the original source code. Only the Spark 2.3.2 version is currently supported for failure debugging.

1. In IntelliJ IDEA, create a **Spark Failure Debug** configuration file. Select the FTD file from the previously downloaded failed job resources for the **Spark Job Failure Context location** field.

   ![create failure configuration](./media/apache-spark-intellij-tool-failure-debug/hdinsight-create-failure-configuration-01.png)

1. Click the local run button in the toolbar; the error displays in the Run window.

   ![run-failure-configuration1](./media/apache-spark-intellij-tool-failure-debug/local-run-failure-configuraion-01.png)

   ![run-failure-configuration2](./media/apache-spark-intellij-tool-failure-debug/local-run-failure-configuration.png)

1. Set a breakpoint as the log indicates, then click the local debug button to debug locally, just as you would with normal Scala/Java projects in IntelliJ.

1. After debugging, if the project completes successfully, you can resubmit the failed job to your Spark on HDInsight cluster.
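To make the breakpoint step concrete, here is a hypothetical, self-contained sketch of the kind of per-record task logic that produces such a failure. None of these names come from the template's sample project; they only illustrate the step-into pattern:

```java
// Hypothetical example of failure-prone task logic; the class,
// method, and record values are illustrative, not from the sample.
public class FailureSketch {
    // A per-record transformation that throws on malformed input,
    // the way a Spark task function might.
    static int parseAge(String record) {
        String[] fields = record.split(",");
        return Integer.parseInt(fields[1].trim());
    }

    public static void main(String[] args) {
        String[] records = {"alice,31", "bob,28", "carol,unknown"};
        for (String r : records) {
            // Set a breakpoint on the next line and step into parseAge
            // to find the record the failure log points at.
            try {
                System.out.println(r + " -> " + parseAge(r));
            } catch (NumberFormatException e) {
                System.out.println("failed on record '" + r + "': " + e.getMessage());
            }
        }
    }
}
```

In the real workflow the exception surfaces in the Run window from the downloaded failure context; the breakpoint-and-step-into pattern is the same.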
## <a name="seealso"></a>Next steps

* [Overview: Debug Apache Spark applications](apache-spark-intellij-tool-debug-remotely-through-ssh.md)

### Demo

* Create Scala project (video): [Create Apache Spark Scala Applications](https://channel9.msdn.com/Series/AzureDataLake/Create-Spark-Applications-with-the-Azure-Toolkit-for-IntelliJ)
* Remote debug (video): [Use Azure Toolkit for IntelliJ to debug Apache Spark applications remotely on an HDInsight cluster](https://channel9.msdn.com/Series/AzureDataLake/Debug-HDInsight-Spark-Applications-with-Azure-Toolkit-for-IntelliJ)

### Scenarios

* [Apache Spark with BI: Do interactive data analysis by using Spark in HDInsight with BI tools](apache-spark-use-bi-tools.md)
* [Apache Spark with Machine Learning: Use Spark in HDInsight to analyze building temperature using HVAC data](apache-spark-ipython-notebook-machine-learning.md)
* [Apache Spark with Machine Learning: Use Spark in HDInsight to predict food inspection results](apache-spark-machine-learning-mllib-ipython.md)
* [Website log analysis using Apache Spark in HDInsight](../hdinsight-apache-spark-custom-library-website-log-analysis.md)

### Create and run applications

* [Create a standalone application using Scala](../hdinsight-apache-spark-create-standalone-application.md)
* [Run jobs remotely on an Apache Spark cluster using Apache Livy](apache-spark-livy-rest-interface.md)

### Tools and extensions

* [Use Azure Toolkit for IntelliJ to create Apache Spark applications for an HDInsight cluster](apache-spark-intellij-tool-plugin.md)
* [Use Azure Toolkit for IntelliJ to debug Apache Spark applications remotely through VPN](apache-spark-intellij-tool-plugin-debug-jobs-remotely.md)
* [Use HDInsight Tools for IntelliJ with Hortonworks Sandbox](../hadoop/hdinsight-tools-for-intellij-with-hortonworks-sandbox.md)
@@ -135,5 +142,6 @@ Create a spark Scala​/Java application, then run the application on a Spark cl
* [Install Jupyter on your computer and connect to an HDInsight Spark cluster](apache-spark-jupyter-notebook-install-locally.md)

### Manage resources

* [Manage resources for the Apache Spark cluster in Azure HDInsight](apache-spark-resource-manager.md)
* [Track and debug jobs running on an Apache Spark cluster in HDInsight](apache-spark-job-debugging.md)
