articles/hdinsight/spark/apache-spark-create-standalone-application.md
14 additions & 12 deletions
@@ -5,16 +5,16 @@ author: hrasheed-msft
 ms.author: hrasheed
 ms.reviewer: jasonh
 ms.service: hdinsight
-ms.custom: hdinsightactive,mvc
 ms.topic: tutorial
-ms.date: 06/26/2019
+ms.custom: hdinsightactive,mvc
+ms.date: 02/28/2020
 
 #customer intent: As a developer new to Apache Spark and to Apache Spark in Azure HDInsight, I want to learn how to create a Scala Maven application for Spark in HDInsight using IntelliJ.
 ---
 
 # Tutorial: Create a Scala Maven application for Apache Spark in HDInsight using IntelliJ
 
-In this tutorial, you learn how to create an [Apache Spark](https://spark.apache.org/) application written in [Scala](https://www.scala-lang.org/) using [Apache Maven](https://maven.apache.org/) with IntelliJ IDEA. The article uses Apache Maven as the build system and starts with an existing Maven archetype for Scala provided by IntelliJ IDEA. Creating a Scala application in IntelliJ IDEA involves the following steps:
+In this tutorial, you learn how to create an [Apache Spark](./apache-spark-overview.md) application written in [Scala](https://www.scala-lang.org/) using [Apache Maven](https://maven.apache.org/) with IntelliJ IDEA. The article uses Apache Maven as the build system and starts with an existing Maven archetype for Scala provided by IntelliJ IDEA. Creating a Scala application in IntelliJ IDEA involves the following steps:
 
 * Use Maven as the build system.
 * Update Project Object Model (POM) file to resolve Spark module dependencies.
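For orientation, the standalone program these steps lead up to is a small Scala object with a `main` method that creates a Spark session and runs a simple job. The sketch below illustrates that shape; the object name, package, input path, and line-count workload are illustrative assumptions, not the article's exact sample code.

```scala
package com.microsoft.spark.example

import org.apache.spark.sql.SparkSession

// Minimal standalone Spark application sketch (names and paths are assumptions).
object SimpleApp {
  def main(args: Array[String]): Unit = {
    // The master URL is supplied by spark-submit on the cluster, so it isn't hard-coded here.
    val spark = SparkSession.builder
      .appName("SparkSimpleApp")
      .getOrCreate()

    // Example workload: count the lines in a placeholder input file.
    val lines = spark.read.textFile("/example/data/sample.log")
    println(s"Line count: ${lines.count()}")

    spark.stop()
  }
}
```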
@@ -58,15 +58,15 @@ Perform the following steps to install the Scala plugin:
 
 1. Start IntelliJ IDEA, and select **Create New Project** to open the **New Project** window.
 
-2. Select **Azure Spark/HDInsight** from the left pane.
+2. Select **Apache Spark/HDInsight** from the left pane.
 
 3. Select **Spark Project (Scala)** from the main window.
 
-4. From the **Build tool** drop-down list, select one of the following:
+4. From the **Build tool** drop-down list, select one of the following values:
     * **Maven** for Scala project-creation wizard support.
    * **SBT** for managing the dependencies and building for the Scala project.
 
-   
+   
 
 5. Select **Next**.
 
@@ -95,22 +95,24 @@ Perform the following steps to install the Scala plugin:
 
 5. From the list of archetypes, select **org.scala-tools.archetypes:scala-archetype-simple**. This archetype creates the right directory structure and downloads the required default dependencies to write Scala program.
 
-   
+   
 
 6. Select **Next**.
 
-7. Provide relevant values for **GroupId**, **ArtifactId**, and **Version**. The following values are used in this tutorial:
+7. Expand **Artifact Coordinates**. Provide relevant values for **GroupId** and **ArtifactId**. **Name** and **Location** will auto-populate. The following values are used in this tutorial:
 
     - **GroupId:** com.microsoft.spark.example
    - **ArtifactId:** SparkSimpleApp
 
+   
+
 8. Select **Next**.
 
 9. Verify the settings and then select **Next**.
 
 10. Verify the project name and location, and then select **Finish**. The project will take a few minutes to import.
 
-11. Once the project has imported, from the left pane navigate to **SparkSimpleApp** > **src** > **test** > **scala** > **com** > **microsoft** > **spark** > **example**. Right-click **MySpec**, and then select **Delete...**. You do not need this file for the application. Select **OK** in the dialog box.
+11. Once the project has imported, from the left pane navigate to **SparkSimpleApp** > **src** > **test** > **scala** > **com** > **microsoft** > **spark** > **example**. Right-click **MySpec**, and then select **Delete...**. You don't need this file for the application. Select **OK** in the dialog box.
 
 12. In the subsequent steps, you update the **pom.xml** to define the dependencies for the Spark Scala application. For those dependencies to be downloaded and resolved automatically, you must configure Maven accordingly.
 
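Step 12 has you declare the Spark dependencies in **pom.xml**. For readers who chose **SBT** as the build tool earlier in the wizard, a roughly equivalent `build.sbt` might look like the sketch below; the Spark and Scala versions shown are assumptions and should be matched to the cluster's versions.

```scala
// build.sbt sketch for the SBT alternative; versions are illustrative assumptions.
name := "SparkSimpleApp"
organization := "com.microsoft.spark.example"
version := "1.0-SNAPSHOT"

scalaVersion := "2.11.12"

// Spark modules are marked "provided" because the HDInsight cluster supplies them at run time.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.4" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.4.4" % "provided"
)
```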
@@ -120,7 +122,7 @@ Perform the following steps to install the Scala plugin:
 
 15. Select the **Import Maven projects automatically** checkbox.
 
-16. Select **Apply**, and then select **OK**. You will then be returned to the project window.
+16. Select **Apply**, and then select **OK**. You'll then be returned to the project window.
 
     !["IntelliJ IDEA configure Maven" window](./media/apache-spark-create-standalone-application/configure-maven-options.png "IntelliJ IDEA configure Maven window")
 
@@ -185,7 +187,7 @@ Perform the following steps to install the Scala plugin:
 
    
 
-6. The **Output Layout** tab lists all the jars that are included as part of the Maven project. You can select and delete the ones on which the Scala application has no direct dependency. For the application, you are creating here, you can remove all but the last one (**SparkSimpleApp compile output**). Select the jars to delete and then select the negative symbol **-**.
+6. The **Output Layout** tab lists all the jars that are included as part of the Maven project. You can select and delete the ones on which the Scala application has no direct dependency. For the application you're creating here, you can remove all but the last one (**SparkSimpleApp compile output**). Select the jars to delete and then select the negative symbol **-**.
 
    
 
@@ -199,7 +201,7 @@ Perform the following steps to install the Scala plugin:
 
 To run the application on the cluster, you can use the following approaches:
 
-* **Copy the application jar to the Azure storage blob** associated with the cluster. You can use [**AzCopy**](../../storage/common/storage-use-azcopy.md), a command-line utility, to do so. There are many other clients as well that you can use to upload data. You can find more about them at [Upload data for Apache Hadoop jobs in HDInsight](../hdinsight-upload-data.md).
+* **Copy the application jar to the Azure Storage blob** associated with the cluster. You can use [**AzCopy**](../../storage/common/storage-use-azcopy.md), a command-line utility, to do so. There are many other clients as well that you can use to upload data. You can find more about them at [Upload data for Apache Hadoop jobs in HDInsight](../hdinsight-upload-data.md).
 
 * **Use Apache Livy to submit an application job remotely** to the Spark cluster. Spark clusters on HDInsight includes Livy that exposes REST endpoints to remotely submit Spark jobs. For more information, see [Submit Apache Spark jobs remotely using Apache Livy with Spark clusters on HDInsight](apache-spark-livy-rest-interface.md).
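As an illustration of the Livy approach, the sketch below posts a previously uploaded jar to the cluster's Livy batch endpoint using the JDK's HTTP client; the cluster name, credentials, storage path, and class name are placeholders rather than values taken from the article.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.util.Base64

// Sketch: submit the application jar as a Livy batch job (all cluster-specific values are placeholders).
object SubmitViaLivy {
  def main(args: Array[String]): Unit = {
    val livyBatches = "https://CLUSTERNAME.azurehdinsight.net/livy/batches"
    val credentials = Base64.getEncoder.encodeToString("admin:PASSWORD".getBytes("UTF-8"))

    // Minimal batch payload: the jar already copied to the cluster's default storage, plus its main class.
    val payload =
      """{"file": "wasbs:///example/jars/SparkSimpleApp.jar",
        | "className": "com.microsoft.spark.example.SimpleApp"}""".stripMargin

    val request = HttpRequest.newBuilder(URI.create(livyBatches))
      .header("Content-Type", "application/json")
      .header("X-Requested-By", "admin")
      .header("Authorization", s"Basic $credentials")
      .POST(HttpRequest.BodyPublishers.ofString(payload))
      .build()

    val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    println(s"Livy responded with HTTP ${response.statusCode()}: ${response.body()}")
  }
}
```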