Commit d5abbef

Merge pull request #105805 from hrasheed-msft/hdi_scalaapp_update

updates to intro paragraph

2 parents: 89d28a6 + 20d44e7

File tree: 1 file changed (+16, -5 lines)


articles/hdinsight/spark/apache-spark-intellij-tool-plugin.md

Lines changed: 16 additions & 5 deletions
@@ -12,17 +12,19 @@ ms.date: 09/04/2019

 # Tutorial: Use Azure Toolkit for IntelliJ to create Apache Spark applications for HDInsight cluster

-This tutorial demonstrates how to use the Azure Toolkit for IntelliJ plug-in to develop Apache Spark applications written in [Scala](https://www.scala-lang.org/), and then submit them to an HDInsight Spark cluster directly from the IntelliJ integrated development environment (IDE). You can use the plug-in in a few ways:
+This tutorial demonstrates how to develop Apache Spark applications on Azure HDInsight using the **Azure Toolkit** plug-in for the IntelliJ IDE. [Azure HDInsight](../hdinsight-overview.md) is a managed, open-source analytics service in the cloud that allows you to use open-source frameworks like Hadoop, Apache Spark, Apache Hive, and Apache Kafka.

-* Develop and submit a Scala Spark application on an HDInsight Spark cluster.
+You can use the **Azure Toolkit** plug-in in a few ways:
+
+* Develop and submit a Scala Spark application to an HDInsight Spark cluster.
 * Access your Azure HDInsight Spark cluster resources.
 * Develop and run a Scala Spark application locally.

 In this tutorial, you learn how to:
 > [!div class="checklist"]
 > * Use the Azure Toolkit for IntelliJ plug-in
 > * Develop Apache Spark applications
-> * Submit application to Azure HDInsight cluster
+> * Submit an application to an Azure HDInsight cluster

 ## Prerequisites

@@ -103,6 +105,7 @@ Perform the following steps to install the Scala plugin:

    d. The **myApp.scala** file then opens in the main view. Replace the default code with the following code:

+   ```scala
    import org.apache.spark.SparkConf
    import org.apache.spark.SparkContext

@@ -120,10 +123,12 @@ Perform the following steps to install the Scala plugin:
    }

 }
+   ```

 The code reads the data from HVAC.csv (available on all HDInsight Spark clusters), retrieves the rows that have only one digit in the seventh column of the CSV file, and writes the output to `/HVACOut` under the default storage container for the cluster.

 ## Connect to your HDInsight cluster
+
 You can either [sign in to your Azure subscription](#sign-in-to-your-azure-subscription) or [link an HDInsight cluster](#link-a-cluster) by using an Ambari username/password or domain-joined credentials to connect to your HDInsight cluster.

 ### Sign in to your Azure subscription
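The filtering rule that the hunk above describes (keep only rows whose seventh CSV column is a single digit) can be sketched as a stand-alone predicate in plain Scala. This is a hypothetical helper for illustration, not the tutorial's actual Spark application, which applies the rule inside an RDD `filter` over HVAC.csv:

```scala
// Hypothetical sketch of the row-filtering rule described above:
// keep only CSV rows whose seventh column is a single digit.
object HvacFilterSketch {
  def keep(line: String): Boolean = {
    val cols = line.split(",")
    cols.length >= 7 && cols(6).length == 1 && cols(6).head.isDigit
  }
}
```

In the tutorial's app, rows passing a predicate like this are written to `/HVACOut` on the cluster's default storage.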
@@ -363,22 +368,24 @@ It is convenient for you to foresee the script result by sending some code to th
 ## Integrate with HDInsight Identity Broker (HIB)

 ### Connect to your HDInsight ESP cluster with ID Broker (HIB)
+
 You can follow the normal steps to sign in to your Azure subscription to connect to your HDInsight ESP cluster with ID Broker (HIB). After you sign in, you'll see the cluster list in Azure Explorer. For more instructions, see [Connect to your HDInsight cluster](#connect-to-your-hdinsight-cluster).

 ### Run a Spark Scala application on an HDInsight ESP cluster with ID Broker (HIB)
+
 You can follow the normal steps to submit a job to an HDInsight ESP cluster with ID Broker (HIB). For more instructions, see [Run a Spark Scala application on an HDInsight Spark cluster](#run-a-spark-scala-application-on-an-hdinsight-spark-cluster).

 The necessary files are uploaded to a folder named after your sign-in account, and you can see the upload path in the configuration file.

 ![upload path in the configuration](./media/apache-spark-intellij-tool-plugin/upload-path-in-the-configuration.png)

 ### Spark console on an HDInsight ESP cluster with ID Broker (HIB)
+
 You can run the Spark Local Console (Scala) or the Spark Livy Interactive Session Console (Scala) on an HDInsight ESP cluster with ID Broker (HIB). For more instructions, see [Spark Console](#spark-console).

 > [!NOTE]
 > For an HDInsight ESP cluster with ID Broker (HIB), [linking a cluster](#link-a-cluster) and [debugging Apache Spark applications remotely](#debug-apache-spark-applications-locally-or-remotely-on-an-hdinsight-cluster) are not currently supported.
-
 ## Reader-only role

 When users submit a job to a cluster with reader-only role permission, Ambari credentials are required.
@@ -438,11 +445,15 @@ You can convert the existing Spark Scala applications that you created in Intell

 2. At the root level is a **module** element like the following:

+   ```xml
    <module org.jetbrains.idea.maven.project.MavenProjectsManager.isMavenModule="true" type="JAVA_MODULE" version="4">
+   ```

    Edit the element to add `UniqueKey="HDInsightTool"` so that the **module** element looks like the following:

+   ```xml
    <module org.jetbrains.idea.maven.project.MavenProjectsManager.isMavenModule="true" type="JAVA_MODULE" version="4" UniqueKey="HDInsightTool">
+   ```

 3. Save the changes. Your application should now be compatible with Azure Toolkit for IntelliJ. You can test it by right-clicking the project name in Project. The pop-up menu now has the option **Submit Spark Application to HDInsight**.
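The manual `.iml` edit in the hunk above amounts to appending one attribute to the `module` element. As a sketch under stated assumptions (a hypothetical helper, not part of the toolkit), the same rewrite in plain Scala looks like:

```scala
// Hypothetical sketch: add UniqueKey="HDInsightTool" to the <module ...>
// line of an IntelliJ .iml file, mirroring the manual edit described above.
object ImlPatchSketch {
  def addUniqueKey(moduleLine: String): String =
    if (moduleLine.contains("UniqueKey=")) moduleLine // already patched; leave as-is
    else moduleLine.replaceFirst(">$", """ UniqueKey="HDInsightTool">""")
}
```

The idempotence check matters: running the rewrite twice must not duplicate the attribute.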

@@ -467,4 +478,4 @@ If you're not going to continue to use this application, delete the cluster that
 In this tutorial, you learned how to use the Azure Toolkit for IntelliJ plug-in to develop Apache Spark applications written in [Scala](https://www.scala-lang.org/) and then submit them to an HDInsight Spark cluster directly from the IntelliJ integrated development environment (IDE). Advance to the next article to see how the data you registered in Apache Spark can be pulled into a BI analytics tool such as Power BI.

 > [!div class="nextstepaction"]
-> [Analyze data using BI tools](apache-spark-use-bi-tools.md)
+> [Analyze Apache Spark data using Power BI](apache-spark-use-bi-tools.md)
