Commit f11fb5f

Merge pull request #92481 from lenadroid/master
Updates for Spark 2.4
2 parents 4eeb26f + 5926fec

File tree

1 file changed: +20 -6 lines changed


articles/aks/spark-job.md

Lines changed: 20 additions & 6 deletions
@@ -7,7 +7,7 @@ manager: jeconnoc
 
 ms.service: container-service
 ms.topic: article
-ms.date: 03/15/2018
+ms.date: 10/18/2019
 ms.author: alehall
 ms.custom: mvc
 ---
@@ -39,10 +39,16 @@ Create a resource group for the cluster.
 az group create --name mySparkCluster --location eastus
 ```
 
-Create the AKS cluster with nodes that are of size `Standard_D3_v2`.
+Create a Service Principal for the cluster. After it is created, you will need the Service Principal's appId and password for the next command.
 
 ```azurecli
-az aks create --resource-group mySparkCluster --name mySparkCluster --node-vm-size Standard_D3_v2
+az ad sp create-for-rbac --name SparkSP
+```
+
+Create the AKS cluster with nodes of size `Standard_D3_v2`, passing the appId and password values as the service-principal and client-secret parameters.
+
+```azurecli
+az aks create --resource-group mySparkCluster --name mySparkCluster --node-vm-size Standard_D3_v2 --generate-ssh-keys --service-principal <APPID> --client-secret <PASSWORD>
 ```
 
 Connect to the AKS cluster.
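The two new steps can be chained in a script; here is a minimal sketch, assuming `jq` is available for parsing the JSON that `az ad sp create-for-rbac` prints (the variable names are illustrative, not part of the commit):

```bash
# Create the Service Principal once and keep the JSON output.
SP_JSON=$(az ad sp create-for-rbac --name SparkSP)

# Extract appId and password from the JSON (jq is an assumed helper).
APPID=$(echo "$SP_JSON" | jq -r .appId)
PASSWORD=$(echo "$SP_JSON" | jq -r .password)

# Feed both values into the cluster creation command from the hunk above.
az aks create --resource-group mySparkCluster --name mySparkCluster \
  --node-vm-size Standard_D3_v2 --generate-ssh-keys \
  --service-principal "$APPID" --client-secret "$PASSWORD"
```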
@@ -60,7 +66,7 @@ Before running Spark jobs on an AKS cluster, you need to build the Spark source
 Clone the Spark project repository to your development system.
 
 ```bash
-git clone -b branch-2.3 https://github.com/apache/spark
+git clone -b branch-2.4 https://github.com/apache/spark
 ```
 
 Change into the directory of the cloned repository and save the path of the Spark source to a variable.
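Later hunks reference a `$sparkdir` variable for the clone location; a minimal sketch of the clone-and-save step this implies, assuming the default `spark` directory name:

```bash
# Clone the 2.4 maintenance branch and record its absolute path.
git clone -b branch-2.4 https://github.com/apache/spark
cd spark
sparkdir=$(pwd)   # referenced later as `cd $sparkdir`
```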
@@ -132,7 +138,7 @@ Run the following commands to add an SBT plugin, which allows packaging the proj
 
 ```bash
 touch project/assembly.sbt
-echo 'addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")' >> project/assembly.sbt
+echo 'addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")' >> project/assembly.sbt
 ```
 
 Run these commands to copy the sample code into the newly created project and add all necessary dependencies.
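Once the plugin line is in `project/assembly.sbt`, the fat jar is built with the plugin's assembly task; a sketch, assuming sbt-assembly's default output location:

```bash
# Package the project and its dependencies into a single jar.
sbt assembly

# sbt-assembly writes the jar under target/scala-<version>/ by default.
ls target/scala-2.11/*.jar
```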
@@ -147,7 +153,7 @@ cat <<EOT >> build.sbt
 libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0" % "provided"
 EOT
 
-sed -ie 's/scalaVersion.*/scalaVersion := "2.11.11",/' build.sbt
+sed -ie 's/scalaVersion.*/scalaVersion := "2.11.11"/' build.sbt
 sed -ie 's/name.*/name := "SparkPi",/' build.sbt
 ```
 
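This hunk only drops a stray trailing comma from the `scalaVersion` substitution; a quick, hypothetical sanity check after running the `sed` commands:

```bash
# Confirm both substitutions landed as well-formed settings.
grep -nE '(scalaVersion|name) :=' build.sbt

# An sbt load fails fast if build.sbt is still malformed.
sbt compile
```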
@@ -210,6 +216,13 @@ Navigate back to the root of Spark repository.
 cd $sparkdir
 ```
 
+Create a service account that has sufficient permissions for running a job.
+
+```bash
+kubectl create serviceaccount spark
+kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
+```
+
 Submit the job using `spark-submit`.
 
 ```bash
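Before submitting, the new RBAC binding can be verified with kubectl's built-in authorization check; a sketch (creating pods is what the Spark driver needs to launch executors):

```bash
# Should print "yes" if the binding grants enough permissions.
kubectl auth can-i create pods \
  --as=system:serviceaccount:default:spark --namespace=default
```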
@@ -219,6 +232,7 @@ Submit the job using `spark-submit`.
 --name spark-pi \
 --class org.apache.spark.examples.SparkPi \
 --conf spark.executor.instances=3 \
+--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
 --conf spark.kubernetes.container.image=$REGISTRY_NAME/spark:$REGISTRY_TAG \
 $jarUrl
 ```
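For context, the full command this hunk modifies looks roughly as follows; the master URL placeholder and the `--deploy-mode` flag are assumptions based on Spark 2.4's Kubernetes submission syntax, not shown in the diff:

```bash
# The API server address can be read from `kubectl cluster-info`.
./bin/spark-submit \
  --master k8s://https://<K8S_API_SERVER> \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=$REGISTRY_NAME/spark:$REGISTRY_TAG \
  $jarUrl
```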
