Commit c7029f0

Merge pull request #78461 from dagiro/freshness105
freshness105
2 parents 52c9900 + 8cb620d commit c7029f0

7 files changed: +28 -24 lines changed

articles/hdinsight/spark/apache-spark-intellij-tool-plugin.md

Lines changed: 28 additions & 24 deletions
@@ -2,12 +2,11 @@
 title: 'Azure Toolkit for IntelliJ: Create Spark applications for an HDInsight cluster '
 description: Use the Azure Toolkit for IntelliJ to develop Spark applications written in Scala, and submit them to an HDInsight Spark cluster.
 author: hrasheed-msft
+ms.author: hrasheed
 ms.reviewer: jasonh
 ms.service: hdinsight
-ms.custom: hdinsightactive
 ms.topic: conceptual
-ms.date: 02/15/2019
-ms.author: maxluk
+ms.date: 05/31/2019
 ---
 # Use Azure Toolkit for IntelliJ to create Apache Spark applications for an HDInsight cluster

@@ -20,18 +19,20 @@ Use the Azure Toolkit for IntelliJ plug-in to develop [Apache Spark](https://spa
 ## Prerequisites

 * An Apache Spark cluster on HDInsight. For instructions, see [Create Apache Spark clusters in Azure HDInsight](apache-spark-jupyter-spark-sql.md).
-* [Oracle Java Development kit](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html). This tutorial uses Java version 8.0.202.
+* Storage account name and key. See [Manage storage account settings in the Azure portal](../../storage/common/storage-account-manage.md).
+* [Java Developer Kit (JDK) version 8](https://aka.ms/azure-jdks).
 * IntelliJ IDEA. This article uses [IntelliJ IDEA Community ver. 2018.3.4](https://www.jetbrains.com/idea/download/).
 * Azure Toolkit for IntelliJ. See [Installing the Azure Toolkit for IntelliJ](https://docs.microsoft.com/java/azure/intellij/azure-toolkit-for-intellij-installation?view=azure-java-stable).
 * WINUTILS.EXE. See [Problems running Hadoop on Windows](https://wiki.apache.org/hadoop/WindowsProblems).

 ## Install Scala plugin for IntelliJ IDEA
+
 Perform the following steps to install the Scala plugin:

 1. Open IntelliJ IDEA.

 2. On the welcome screen, navigate to **Configure** > **Plugins** to open the **Plugins** window.
-
+
 ![Enable scala plugin](./media/apache-spark-intellij-tool-plugin/enable-scala-plugin.png)

 3. Select **Install** for the Scala plugin that is featured in the new window.
@@ -40,7 +41,6 @@ Perform the following steps to install the Scala plugin:

 4. After the plugin installs successfully, you must restart the IDE.

-
 ## Create a Spark Scala application for an HDInsight Spark cluster

 1. Start IntelliJ IDEA, and select **Create New Project** to open the **New Project** window.
@@ -82,7 +82,7 @@ Perform the following steps to install the Scala plugin:

 9. Add your application source code by doing the following:

-a. From Project, navigate to **myApp** > **src** > **main** > **scala**.
+a. From **Project**, navigate to **myApp** > **src** > **main** > **scala**.

 b. Right-click **scala**, and then navigate to **New** > **Scala Class**.

@@ -115,6 +115,7 @@ Perform the following steps to install the Scala plugin:
 The code reads the data from HVAC.csv (available on all HDInsight Spark clusters), retrieves the rows that have only one digit in the seventh column in the CSV file, and writes the output to `/HVACOut` under the default storage container for the cluster.

 ## Connect to your HDInsight cluster
+
 User can either [sign in to Azure subscription](#sign-in-to-your-azure-subscription), or [link a HDInsight cluster](#link-a-cluster) using Ambari username/password or domain joined credential to connect to your HDInsight cluster.

 ### Sign in to your Azure subscription
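
Reader's note on the hunk above: its first context line describes sample code that is elided from this diff (read HVAC.csv, keep the rows whose seventh column holds a single digit, write the result to `/HVACOut`). The snippet below is only a minimal sketch of that filtering step in plain Scala, without Spark, so it can run anywhere. The object name `HvacFilterSketch`, the helper `oneDigitSeventhColumn`, and the sample rows are all hypothetical and are not taken from the article or from the real HVAC.csv.

```scala
object HvacFilterSketch {
  // Keep only rows whose seventh comma-separated field is exactly one digit.
  // Hypothetical helper: the article's own code is not shown in this diff.
  def oneDigitSeventhColumn(rows: Seq[String]): Seq[String] =
    rows.filter { row =>
      val fields = row.split(",")
      // Guard against short rows before indexing the seventh field (index 6).
      fields.length > 6 && fields(6).matches("\\d")
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, for illustration only.
    val sample = Seq(
      "6/1/13,0:00:01,66,58,13,20,4",
      "6/2/13,1:00:01,69,68,3,20,17",
      "6/3/13,2:00:01,70,73,17,20,18"
    )
    oneDigitSeventhColumn(sample).foreach(println)
  }
}
```

In a Spark job the same predicate would presumably be passed to a filter over the lines of the input file; that wiring is left out here because it depends on the cluster setup.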
@@ -123,9 +124,9 @@ User can either [sign in to Azure subscription](#sign-in-to-your-azure-subscript

 ![The Azure Explorer link](./media/apache-spark-intellij-tool-plugin/show-azure-explorer.png)

-2. From Azure Explorer, right-click the **Azure** node, and then select **Sign In**.
+2. From **Azure Explorer**, right-click the **Azure** node, and then select **Sign In**.

-3. In the **Azure Sign In** dialog box, select **Sign in**, and then enter your Azure credentials.
+3. In the **Azure Sign In** dialog box, leave **Device Login** selected, and then select **Sign in**. Complete the sign in process.

 ![The Azure Sign In dialog box](./media/apache-spark-intellij-tool-plugin/view-explorer-2.png)

@@ -142,11 +143,12 @@ User can either [sign in to Azure subscription](#sign-in-to-your-azure-subscript
 ![An expanded cluster-name node](./media/apache-spark-intellij-tool-plugin/view-explorer-4.png)

 ### Link a cluster
+
 You can link an HDInsight cluster by using the Apache Ambari managed username. Similarly, for a domain-joined HDInsight cluster, you can link by using the domain and username, such as [email protected]. Also you can link Livy Service cluster.

 1. From the menu bar, navigate to **View** > **Tool Windows** > **Azure Explorer**.

-2. From Azure Explorer, right-click the **HDInsight** node, and then select **Link A Cluster**.
+2. From **Azure Explorer**, right-click the **HDInsight** node, and then select **Link A Cluster**.

 ![link cluster context menu](./media/apache-spark-intellij-tool-plugin/link-a-cluster-context-menu.png)

@@ -162,7 +164,7 @@ You can link an HDInsight cluster by using the Apache Ambari managed username. S
 |User Name| Enter cluster user name, default is admin.|
 |Password| Enter password for user name.|

-![link hdinsight cluster dialog](./media/apache-spark-intellij-tool-plugin/link-hdinsight-cluster-dialog.png)
+![link HDInsight cluster dialog](./media/apache-spark-intellij-tool-plugin/link-hdinsight-cluster-dialog.png)

 * **Livy Service**

@@ -176,7 +178,7 @@ You can link an HDInsight cluster by using the Apache Ambari managed username. S
 |User Name| Enter cluster user name, default is admin.|
 |Password| Enter password for user name.|

-![link livy cluster dialog](./media/apache-spark-intellij-tool-plugin/link-livy-cluster-dialog.png)
+![link Apache Livy cluster dialog](./media/apache-spark-intellij-tool-plugin/link-livy-cluster-dialog.png)

 4. You can see your linked cluster from the **HDInsight** node.

@@ -189,7 +191,7 @@ You can link an HDInsight cluster by using the Apache Ambari managed username. S
 ## Run a Spark Scala application on an HDInsight Spark cluster
 After creating a Scala application, you can submit it to the cluster.

-1. From Project, navigate to **myApp** > **src** > **main** > **scala** > **myApp**. Right-click **myApp**, and select **Submit Spark Application** (It will likely be located at the bottom of the list).
+1. From **Project**, navigate to **myApp** > **src** > **main** > **scala** > **myApp**. Right-click **myApp**, and select **Submit Spark Application** (It will likely be located at the bottom of the list).

 ![The Submit Spark Application to HDInsight command](./media/apache-spark-intellij-tool-plugin/hdi-submit-spark-app-1.png)

@@ -203,7 +205,7 @@ After creating a Scala application, you can submit it to the cluster.
 |Select an Artifact to submit|Leave default setting.|
 |Main class name|The default value is the main class from the selected file. You can change the class by selecting the ellipsis(**...**) and choosing another class.|
 |Job configurations|You can change the default keys and/or values. For more information, see [Apache Livy REST API](https://livy.incubator.apache.org./docs/latest/rest-api.html).|
-|Command line arguments|You can enter arguments separated by space for the main class if needed.|
+|Command-line arguments|You can enter arguments separated by space for the main class if needed.|
 |Referenced Jars and Referenced Files|You can enter the paths for the referenced Jars and files if any. For more information: [Apache Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#runtime-environment). See also, [How to upload resources to cluster](https://docs.microsoft.com/azure/storage/blobs/storage-quickstart-blobs-storage-explorer).|
 |Job Upload Storage|Expand to reveal additional options.|
 |Storage Type|Select **Use Azure Blob to upload** from the drop-down list.|
@@ -225,7 +227,7 @@ You can perform various operations by using Azure Toolkit for IntelliJ. Most of

 ### Access the job view

-1. From Azure Explorer, navigate to **HDInsight** > \<Your Cluster> > **Jobs**.
+1. From **Azure Explorer**, navigate to **HDInsight** > \<Your Cluster> > **Jobs**.

 ![Job view node](./media/apache-spark-intellij-tool-plugin/job-view-node.png)

@@ -245,21 +247,21 @@ You can perform various operations by using Azure Toolkit for IntelliJ. Most of

 ### Access the Spark history server

-1. From Azure Explorer, expand **HDInsight**, right-click your Spark cluster name, and then select **Open Spark History UI**.
+1. From **Azure Explorer**, expand **HDInsight**, right-click your Spark cluster name, and then select **Open Spark History UI**.
 2. When you're prompted, enter the cluster's admin credentials, which you specified when you set up the cluster.

 3. On the Spark history server dashboard, you can use the application name to look for the application that you just finished running. In the preceding code, you set the application name by using `val conf = new SparkConf().setAppName("myApp")`. Therefore, your Spark application name is **myApp**.

 ### Start the Ambari portal

-1. From Azure Explorer, expand **HDInsight**, right-click your Spark cluster name, and then select **Open Cluster Management Portal(Ambari)**.
+1. From **Azure Explorer**, expand **HDInsight**, right-click your Spark cluster name, and then select **Open Cluster Management Portal(Ambari)**.

 2. When you're prompted, enter the admin credentials for the cluster. You specified these credentials during the cluster setup process.

 ### Manage Azure subscriptions
 By default, Azure Toolkit for IntelliJ lists the Spark clusters from all your Azure subscriptions. If necessary, you can specify the subscriptions that you want to access.

-1. From Azure Explorer, right-click the **Azure** root node, and then select **Select Subscriptions**.
+1. From **Azure Explorer**, right-click the **Azure** root node, and then select **Select Subscriptions**.

 2. From the **Select Subscriptions** window, clear the check boxes next to the subscriptions that you don't want to access, and then select **Close**.

@@ -285,22 +287,24 @@ Ensure you have satisfied the WINUTILS.EXE prerequisite.

 ![Local Console Set Configuration](./media/apache-spark-intellij-tool-plugin/console-set-configuration.png)

-5. From Project, navigate to **myApp** > **src** > **main** > **scala** > **myApp**.
+5. From **Project**, navigate to **myApp** > **src** > **main** > **scala** > **myApp**.

 6. From the menu bar, navigate to **Tools** > **Spark Console** > **Run Spark Local Console(Scala)**.

-7. Then two dialogs may be displayed to ask you if you want to auto fix dependencies. If so, select **Auto Fix**.
+7. From the **Setting file system** dialog, select **Yes** to use a mocked file system.
+
+8. Then two dialogs *may* be displayed to ask you if you want to auto fix dependencies. If so, select **Auto Fix**.

 ![Spark Auto Fix1](./media/apache-spark-intellij-tool-plugin/console-auto-fix1.png)

 ![Spark Auto Fix2](./media/apache-spark-intellij-tool-plugin/console-auto-fix2.png)

-8. The console should look similar to the picture below. In the console window type `sc.appName`, and then press ctrl+Enter. The result will be shown. You can terminate the local console by clicking red button.
+9. The console should look similar to the picture below. In the console window type `sc.appName`, and then press ctrl+Enter. The result will be shown. You can terminate the local console by clicking red button.

 ![Local Console Result](./media/apache-spark-intellij-tool-plugin/local-console-result.png)

-
 ### Spark Livy Interactive Session Console(Scala)
+
 It is only supported on IntelliJ 2018.2 and 2018.3.

 1. From the menu bar, navigate to **Run** > **Edit Configurations...**.
@@ -318,7 +322,7 @@ It is only supported on IntelliJ 2018.2 and 2018.3.

 ![Interactive Console Set Configuration](./media/apache-spark-intellij-tool-plugin/interactive-console-configuration.png)

-5. From Project, navigate to **myApp** > **src** > **main** > **scala** > **myApp**.
+5. From **Project**, navigate to **myApp** > **src** > **main** > **scala** > **myApp**.

 6. From the menu bar, navigate to **Tools** > **Spark Console** > **Run Spark Livy Interactive Session Console(Scala)**.

@@ -346,7 +350,7 @@ You can convert the existing Spark Scala applications that you created in Intell

 <module org.jetbrains.idea.maven.project.MavenProjectsManager.isMavenModule="true" type="JAVA_MODULE" version="4" UniqueKey="HDInsightTool">

-1. Save the changes. Your application should now be compatible with Azure Toolkit for IntelliJ. You can test it by right-clicking the project name in Project. The pop-up menu now has the option **Submit Spark Application to HDInsight**.
+1. Save the changes. Your application should now be compatible with Azure Toolkit for IntelliJ. You can test it by right-clicking the project name in **Project**. The pop-up menu now has the option **Submit Spark Application to HDInsight**.

 ## Troubleshooting

Six further changed files are listed by size only, with no content shown: 35.5 KB, 149 KB, 52.7 KB, 36.9 KB, 33.1 KB, 22.7 KB.
