Commit 1eddc82

freshness4
1 parent 7150959 commit 1eddc82

File tree

6 files changed (+14, -12 lines)


articles/hdinsight/spark/apache-spark-create-standalone-application.md

Lines changed: 14 additions & 12 deletions
@@ -5,16 +5,16 @@ author: hrasheed-msft
 ms.author: hrasheed
 ms.reviewer: jasonh
 ms.service: hdinsight
-ms.custom: hdinsightactive,mvc
 ms.topic: tutorial
-ms.date: 06/26/2019
+ms.custom: hdinsightactive,mvc
+ms.date: 02/28/2020
 
 #customer intent: As a developer new to Apache Spark and to Apache Spark in Azure HDInsight, I want to learn how to create a Scala Maven application for Spark in HDInsight using IntelliJ.
 ---
 
 # Tutorial: Create a Scala Maven application for Apache Spark in HDInsight using IntelliJ
 
-In this tutorial, you learn how to create an [Apache Spark](https://spark.apache.org/) application written in [Scala](https://www.scala-lang.org/) using [Apache Maven](https://maven.apache.org/) with IntelliJ IDEA. The article uses Apache Maven as the build system and starts with an existing Maven archetype for Scala provided by IntelliJ IDEA. Creating a Scala application in IntelliJ IDEA involves the following steps:
+In this tutorial, you learn how to create an [Apache Spark](./apache-spark-overview.md) application written in [Scala](https://www.scala-lang.org/) using [Apache Maven](https://maven.apache.org/) with IntelliJ IDEA. The article uses Apache Maven as the build system and starts with an existing Maven archetype for Scala provided by IntelliJ IDEA. Creating a Scala application in IntelliJ IDEA involves the following steps:
 
 * Use Maven as the build system.
 * Update Project Object Model (POM) file to resolve Spark module dependencies.
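For orientation, the following is a minimal sketch of the kind of standalone application this tutorial builds, assuming Spark 2.x on the cluster; the object name, input argument, and line-count logic are illustrative rather than the tutorial's exact sample code.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative standalone Spark application (not the tutorial's exact
// sample). Expects the path of an input text file as args(0).
object SparkSimpleApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("SparkSimpleApp")
      .getOrCreate()

    // Read the file from cluster storage and count its lines.
    val lineCount = spark.read.textFile(args(0)).count()
    println(s"Line count: $lineCount")

    spark.stop()
  }
}
```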
@@ -58,15 +58,15 @@ Perform the following steps to install the Scala plugin:
 
 1. Start IntelliJ IDEA, and select **Create New Project** to open the **New Project** window.
 
-2. Select **Azure Spark/HDInsight** from the left pane.
+2. Select **Apache Spark/HDInsight** from the left pane.
 
 3. Select **Spark Project (Scala)** from the main window.
 
-4. From the **Build tool** drop-down list, select one of the following:
+4. From the **Build tool** drop-down list, select one of the following values:
    * **Maven** for Scala project-creation wizard support.
    * **SBT** for managing the dependencies and building for the Scala project.
 
-   ![IntelliJ The New Project dialog box](./media/apache-spark-create-standalone-application/create-hdi-scala-app.png)
+   ![IntelliJ The New Project dialog box](./media/apache-spark-create-standalone-application/intellij-project-apache-spark.png)
 
 5. Select **Next**.
 
@@ -95,22 +95,24 @@ Perform the following steps to install the Scala plugin:
 
 5. From the list of archetypes, select **org.scala-tools.archetypes:scala-archetype-simple**. This archetype creates the right directory structure and downloads the required default dependencies to write the Scala program.
 
-   ![IntelliJ IDEA create Maven project](./media/apache-spark-create-standalone-application/create-maven-project.png)
+   ![IntelliJ IDEA create Maven project](./media/apache-spark-create-standalone-application/intellij-project-create-maven.png)
 
 6. Select **Next**.
 
-7. Provide relevant values for **GroupId**, **ArtifactId**, and **Version**. The following values are used in this tutorial:
+7. Expand **Artifact Coordinates**. Provide relevant values for **GroupId** and **ArtifactId**. **Name** and **Location** will auto-populate. The following values are used in this tutorial:
 
    - **GroupId:** com.microsoft.spark.example
   - **ArtifactId:** SparkSimpleApp
 
+   ![IntelliJ IDEA artifact coordinates](./media/apache-spark-create-standalone-application/intellij-artifact-coordinates.png)
+
 8. Select **Next**.
 
 9. Verify the settings and then select **Next**.
 
 10. Verify the project name and location, and then select **Finish**. The project will take a few minutes to import.
 
-11. Once the project has imported, from the left pane navigate to **SparkSimpleApp** > **src** > **test** > **scala** > **com** > **microsoft** > **spark** > **example**. Right-click **MySpec**, and then select **Delete...**. You do not need this file for the application. Select **OK** in the dialog box.
+11. Once the project has imported, from the left pane navigate to **SparkSimpleApp** > **src** > **test** > **scala** > **com** > **microsoft** > **spark** > **example**. Right-click **MySpec**, and then select **Delete...**. You don't need this file for the application. Select **OK** in the dialog box.
 
 12. In the subsequent steps, you update the **pom.xml** to define the dependencies for the Spark Scala application. For those dependencies to be downloaded and resolved automatically, you must configure Maven accordingly.
 
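Step 12 updates the Maven **pom.xml**, which is XML. As a Scala-flavored analogue (and relevant if you chose **SBT** as the build tool in step 4), the equivalent Spark dependency declaration in a `build.sbt` file is sketched below; the Scala and Spark versions are illustrative assumptions, so match them to what your cluster runs.

```scala
// build.sbt -- illustrative sketch, not the tutorial's pom.xml.
// The "provided" scope keeps Spark out of the application jar,
// because the HDInsight cluster supplies Spark at runtime.
name := "SparkSimpleApp"
scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0" % "provided"
```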
@@ -120,7 +122,7 @@ Perform the following steps to install the Scala plugin:
 
 15. Select the **Import Maven projects automatically** checkbox.
 
-16. Select **Apply**, and then select **OK**. You will then be returned to the project window.
+16. Select **Apply**, and then select **OK**. You'll then be returned to the project window.
 
    ![Configure Maven for automatic downloads](./media/apache-spark-create-standalone-application/configure-maven-download.png)

@@ -185,7 +187,7 @@ Perform the following steps to install the Scala plugin:
 
    ![IntelliJ IDEA project structure jar from module](./media/apache-spark-create-standalone-application/hdinsight-create-jar3.png)
 
-6. The **Output Layout** tab lists all the jars that are included as part of the Maven project. You can select and delete the ones on which the Scala application has no direct dependency. For the application, you are creating here, you can remove all but the last one (**SparkSimpleApp compile output**). Select the jars to delete and then select the negative symbol **-**.
+6. The **Output Layout** tab lists all the jars that are included as part of the Maven project. You can select and delete the ones on which the Scala application has no direct dependency. For the application you're creating here, you can remove all but the last one (**SparkSimpleApp compile output**). Select the jars to delete, and then select the negative symbol **-**.
 
    ![IntelliJ IDEA project structure delete output](./media/apache-spark-create-standalone-application/hdi-delete-output-jars.png)
 
@@ -199,7 +201,7 @@ Perform the following steps to install the Scala plugin:
 
 To run the application on the cluster, you can use the following approaches:
 
-* **Copy the application jar to the Azure storage blob** associated with the cluster. You can use [**AzCopy**](../../storage/common/storage-use-azcopy.md), a command-line utility, to do so. There are many other clients as well that you can use to upload data. You can find more about them at [Upload data for Apache Hadoop jobs in HDInsight](../hdinsight-upload-data.md).
+* **Copy the application jar to the Azure Storage blob** associated with the cluster. You can use [**AzCopy**](../../storage/common/storage-use-azcopy.md), a command-line utility, to do so. There are many other clients you can use to upload data as well. You can find more about them at [Upload data for Apache Hadoop jobs in HDInsight](../hdinsight-upload-data.md).
 
 * **Use Apache Livy to submit an application job remotely** to the Spark cluster. Spark clusters on HDInsight include Livy, which exposes REST endpoints for remotely submitting Spark jobs. For more information, see [Submit Apache Spark jobs remotely using Apache Livy with Spark clusters on HDInsight](apache-spark-livy-rest-interface.md).
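To make the Livy option concrete, the sketch below submits a batch job to the cluster's Livy endpoint from Scala using the JDK 11 HTTP client. The cluster name, password, jar location, and main class are placeholders, and the `/livy/batches` path and basic-auth gateway are assumptions about how HDInsight exposes Livy; see the linked article for the authoritative workflow.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.util.Base64

// Illustrative Livy batch submission; every concrete value below
// (cluster name, credentials, jar path, class name) is a placeholder.
object SubmitViaLivy {
  def main(args: Array[String]): Unit = {
    val auth = Base64.getEncoder
      .encodeToString("admin:PASSWORD".getBytes("UTF-8"))

    // Jar location in the cluster's default storage, plus the
    // application's main class.
    val payload =
      """{"file": "wasbs:///example/jars/SparkSimpleApp.jar",
        |  "className": "com.microsoft.spark.example.App"}""".stripMargin

    val request = HttpRequest.newBuilder()
      .uri(URI.create("https://CLUSTERNAME.azurehdinsight.net/livy/batches"))
      .header("Content-Type", "application/json")
      .header("Authorization", s"Basic $auth")
      .POST(HttpRequest.BodyPublishers.ofString(payload))
      .build()

    val response = HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
    println(s"${response.statusCode()}: ${response.body()}")
  }
}
```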

Three image files changed: 164 KB, 106 KB, 280 KB
