@@ -17,8 +17,10 @@ cluster, you may setup a test cluster on your local machine using
* You must have appropriate permissions to create and list [pods](https://kubernetes.io/docs/user-guide/pods/),
[ConfigMaps](https://kubernetes.io/docs/tasks/configure-pod-container/configmap/) and
[secrets](https://kubernetes.io/docs/concepts/configuration/secret/) in your cluster. You can verify that
- you can list these resources by running `kubectl get pods` `kubectl get configmap`, and `kubectl get secrets` which
+ you can list these resources by running `kubectl get pods`, `kubectl get configmap`, and `kubectl get secrets` which
should give you a list of pods, ConfigMaps, and secrets (if any) respectively.
+ * The service account or credentials used by the driver pods must also have appropriate
+ permissions for editing the pod spec.
* You must have a Spark distribution with Kubernetes support. This may be obtained from the
[release tarball](https://github.com/apache-spark-on-k8s/spark/releases) or by
[building Spark with Kubernetes support](../resource-managers/kubernetes/README.md#building-spark-with-kubernetes-support).
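The permission checks above can also be made explicit with `kubectl auth can-i`; a quick sketch (not part of this change, and assuming `kubectl` is already configured against the target cluster and namespace):

```shell
# Ask the API server whether the current user may list each resource
# type that spark-submit depends on; prints "yes" or "no" per resource.
for resource in pods configmaps secrets; do
  echo -n "list ${resource}: "
  kubectl auth can-i list "${resource}"
done
```

This requires a live cluster, so treat it as a sanity check rather than part of the submission flow.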
@@ -36,15 +38,15 @@ If you wish to use pre-built docker images, you may use the images published in
<tr><th>Component</th><th>Image</th></tr>
<tr>
<td>Spark Driver Image</td>
- <td><code>kubespark/spark-driver:v2.1.0-kubernetes-0.2.0</code></td>
+ <td><code>kubespark/spark-driver:v2.2.0-kubernetes-0.3.0</code></td>
</tr>
<tr>
<td>Spark Executor Image</td>
- <td><code>kubespark/spark-executor:v2.1.0-kubernetes-0.2.0</code></td>
+ <td><code>kubespark/spark-executor:v2.2.0-kubernetes-0.3.0</code></td>
</tr>
<tr>
<td>Spark Initialization Image</td>
- <td><code>kubespark/spark-init:v2.1.0-kubernetes-0.2.0</code></td>
+ <td><code>kubespark/spark-init:v2.2.0-kubernetes-0.3.0</code></td>
</tr>
</table>
@@ -80,9 +82,9 @@ are set up as described above:
--kubernetes-namespace default \
--conf spark.executor.instances=5 \
--conf spark.app.name=spark-pi \
- --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.1.0-kubernetes-0.2.0 \
+ --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.3.0 \
local:///opt/spark/examples/jars/spark_examples_2.11-2.2.0.jar

The Spark master, specified either via passing the `--master` command line argument to `spark-submit` or by setting
@@ -107,6 +109,18 @@ Finally, notice that in the above example we specify a jar with a specific URI w
the location of the example jar that is already in the Docker image. Using dependencies that are on your machine's local
disk is discussed below.

+ When Kubernetes [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) is enabled,
+ the `default` service account used by the driver may not have appropriate pod `edit` permissions
+ for launching executor pods. We recommend adding another service account, say `spark`, with
+ the necessary privilege. For example:
+
+     kubectl create serviceaccount spark
+     kubectl create clusterrolebinding spark-edit --clusterrole edit \
+       --serviceaccount default:spark --namespace default
+
+ With this, one can add `--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark` to
+ the `spark-submit` command line above to specify the service account to use.
+

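As a sanity check (not part of this change), one can impersonate the new service account with `kubectl auth can-i` to confirm the binding took effect; this assumes the account and binding were created in the `default` namespace as above:

```shell
# Impersonate the spark service account and ask whether it may edit pods;
# "yes" confirms the clusterrolebinding grants the driver the access it needs.
kubectl auth can-i edit pods \
  --as system:serviceaccount:default:spark --namespace default
```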
## Dependency Management

Application dependencies that are being submitted from your machine need to be sent to a **resource staging server**
@@ -129,9 +143,9 @@ and then you can compute the value of Pi as follows:
--kubernetes-namespace default \
--conf spark.executor.instances=5 \
--conf spark.app.name=spark-pi \
- --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.1.0-kubernetes-0.2.0 \
+ --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.3.0 \
--conf spark.kubernetes.resourceStagingServer.uri=http://<address-of-any-cluster-node>:31000 \
examples/jars/spark_examples_2.11-2.2.0.jar

@@ -172,9 +186,9 @@ If our local proxy were listening on port 8001, we would have our submission loo
--kubernetes-namespace default \
--conf spark.executor.instances=5 \
--conf spark.app.name=spark-pi \
- --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.1.0-kubernetes-0.2.0 \
+ --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.3.0 \
local:///opt/spark/examples/jars/spark_examples_2.11-2.2.0.jar

Communication between Spark and Kubernetes clusters is performed using the fabric8 kubernetes-client library.
@@ -222,7 +236,7 @@ service because there may be multiple shuffle service instances running in a clu
a way to target a particular shuffle service.

For example, if the shuffle service we want to use is in the default namespace, and
- has pods with labels `app=spark-shuffle-service` and `spark-version=2.1.0`, we can
+ has pods with labels `app=spark-shuffle-service` and `spark-version=2.2.0`, we can
use those labels to target that particular shuffle service at job launch time. In order to run a job with dynamic allocation enabled,
the command may then look like the following:
@@ -237,7 +251,7 @@ the command may then look like the following:
--conf spark.dynamicAllocation.enabled=true \
--conf spark.shuffle.service.enabled=true \
--conf spark.kubernetes.shuffle.namespace=default \
- --conf spark.kubernetes.shuffle.labels="app=spark-shuffle-service,spark-version=2.1.0" \
+ --conf spark.kubernetes.shuffle.labels="app=spark-shuffle-service,spark-version=2.2.0" \
local:///opt/spark/examples/jars/spark_examples_2.11-2.2.0.jar 10 400000 2

## Advanced
@@ -314,9 +328,9 @@ communicate with the resource staging server over TLS. The trustStore can be set
--kubernetes-namespace default \
--conf spark.executor.instances=5 \
--conf spark.app.name=spark-pi \
- --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 \
- --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.1.0-kubernetes-0.2.0 \
+ --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.3.0 \
+ --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.3.0 \
--conf spark.kubernetes.resourceStagingServer.uri=https://<address-of-any-cluster-node>:31000 \
--conf spark.ssl.kubernetes.resourceStagingServer.enabled=true \
--conf spark.ssl.kubernetes.resourceStagingServer.clientCertPem=/home/myuser/cert.pem \