<!-- Source file: articles/hdinsight/spark/apache-spark-intellij-tool-debug-remotely-through-ssh.md -->
---
title: 'Azure Toolkit for IntelliJ: Debug Spark applications remotely through SSH'
description: Step-by-step guidance on how to use HDInsight Tools in Azure Toolkit for IntelliJ to debug applications remotely on HDInsight clusters through SSH
---
This article provides step-by-step guidance on how to use HDInsight Tools in Azure Toolkit for IntelliJ to debug applications remotely on HDInsight clusters through SSH.
c. In the **Build tool** list, select either of the following, according to your needs:

   - **Maven**, for Scala project-creation wizard support.

   - **SBT**, for managing the dependencies and building for the Scala project.

d. Select **Next**.
1. In the next **New Project** window, do the following:
a. Enter a project name and project location.
1. Select **src** > **main** > **scala** to open your code in the project. This example uses the **SparkCore_wasbloTest** script.
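The **SparkCore_wasbloTest** script itself is not reproduced in this article. As a rough, hypothetical sketch of the kind of transformation such a Spark Scala script performs, here is the same word-count-style logic written against plain Scala collections, whose API Spark's RDD API closely mirrors (`flatMap`, `filter`, `map`); all names here are illustrative, not from the article:

```scala
// Hypothetical stand-in for the kind of logic a script like
// SparkCore_wasbloTest runs. Scala's collection API mirrors the RDD API,
// so the same chain of operations works locally on a Seq for illustration.
object WordCountSketch {
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))               // split each line into tokens
      .filter(_.nonEmpty)                     // drop empty tokens
      .groupBy(identity)                      // group identical words together
      .map { case (w, ws) => (w, ws.size) }   // count occurrences per word

  def main(args: Array[String]): Unit =
    println(countWords(Seq("spark on hdinsight", "debug spark")))
}
```

In a real Spark job the input would come from `sc.textFile(...)` and the result would be written back out, but the transformation chain reads the same way.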
### Prerequisite for Windows
While you're running the local Spark Scala application on a Windows computer, you might get an exception, as explained in [SPARK-2356](https://issues.apache.org/jira/browse/SPARK-2356). The exception occurs because WinUtils.exe is missing on Windows.
To resolve this error, [download the executable](https://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe) to a location such as **C:\WinUtils\bin**. Then, add the environment variable **HADOOP_HOME**, and set the value of the variable to **C:\WinUtils**.
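As a minimal sketch, the setup above can be done from a Windows Command Prompt (the paths are the article's example paths):

```shell
rem Create the folder for the executable (path is the article's example).
mkdir C:\WinUtils\bin
rem After downloading winutils.exe into C:\WinUtils\bin, persist HADOOP_HOME:
setx HADOOP_HOME "C:\WinUtils"
```

`setx` only affects processes started afterward, so restart IntelliJ for the variable to take effect.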
### Scenario 2: Perform local run
1. Open the **SparkCore_wasbloTest** script, right-click the script editor, and then select the option **Run '[Spark Job]XXX'** to perform a local run.
1. After the local run completes, you can see the output file saved in your project explorer under **data** > **__default__**.
1. The tools set the default local run configuration automatically when you perform a local run or local debug. Open the configuration **[Spark on HDInsight]XXX** in the upper-right corner; you can see that it has already been created under **Apache Spark on HDInsight**. Switch to the **Locally Run** tab.
   - [Environment variables](#prerequisite-for-windows): If you already set the system environment variable **HADOOP_HOME** to **C:\WinUtils**, the tool detects it automatically, and you don't need to add it manually.

   - [WinUtils.exe Location](#prerequisite-for-windows): If you haven't set the environment variable, you can specify the location by selecting its button.

   Choose either of the two options; neither is needed on macOS and Linux.
1. You can also set the configuration manually before performing a local run or local debug. In the preceding screenshot, select the plus sign (**+**). Then select the **Apache Spark on HDInsight** option. Enter the **Name** and **Main class name** to save the configuration, and then select the local run button.
### Scenario 3: Perform local debugging
1. Open the **SparkCore_wasbloTest** script, and set breakpoints.
1. Right-click the script editor, and then select the option **Debug '[Spark on HDInsight]XXX'** to perform local debugging.
## Learn how to perform remote run and debugging
### Scenario 1: Perform remote run
1. To access the **Edit Configurations** menu, select the icon in the upper-right corner. From this menu, you can create or edit the configurations for remote debugging.
1. In the **Run/Debug Configurations** dialog box, select the plus sign (**+**). Then select the **Apache Spark on HDInsight** option.
1. Switch to the **Remotely Run in Cluster** tab. Enter information for **Name**, **Spark cluster**, and **Main class name**. Then select **Advanced configuration (Remote Debugging)**. The tools support debugging with executors; the default value of **numExecutors** is 5, and we recommend not setting it higher than 3.

1. In the **Advanced Configuration (Remote Debugging)** section, select **Enable Spark remote debug**. Enter the SSH username, and then enter a password or use a private key file. This setting is required only if you want to perform remote debugging; you can skip it if you just want to use remote run.
1. The configuration is now saved with the name you provided. To view the configuration details, select the configuration name. To make changes, select **Edit Configurations**.
1. After you complete the configuration settings, you can run the project against the remote cluster or perform remote debugging.
1. Select the **Disconnect** button so that the submission logs no longer appear in the left panel. However, the job is still running on the back end.
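For reference, the executor count configured on the **Remotely Run in Cluster** tab corresponds to Spark's standard submission setting. A hypothetical command-line equivalent, run from a cluster head node, might look like the following (the jar name and class name are placeholders, not from this article):

```shell
# Sketch only: submitting a comparable job with an explicit executor count.
# app.jar and com.example.SparkCore_wasbloTest are hypothetical placeholders.
spark-submit \
  --class com.example.SparkCore_wasbloTest \
  --num-executors 3 \
  app.jar
```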
### Scenario 2: Perform remote debugging
1. Set up breakpoints, and then select the **Remote debug** icon. The difference from remote submission is that the SSH username and password need to be configured.

1. When the program execution reaches the breakpoint, you see a **Driver** tab and two **Executor** tabs in the **Debugger** pane. Select the **Resume Program** icon to continue running the code, which then reaches the next breakpoint. You need to switch to the correct **Executor** tab to find the target executor to debug. You can view the execution logs on the corresponding **Console** tab.
1. To dynamically update the variable value by using the IntelliJ debugging capability, select **Debug** again. The **Variables** pane appears again.
1. Right-click the target on the **Debug** tab, and then select **Set Value**. Next, enter a new value for the variable. Then select **Enter** to save the value.
1. Select the **Resume Program** icon to continue running the program. This time, no exception is caught, and the project runs successfully.
## <a name="seealso"></a>Next steps
* [Overview: Apache Spark on Azure HDInsight](apache-spark-overview.md)