@@ -106,6 +106,36 @@ The above mechanism using `kubectl proxy` can be used when we have authenticatio
kubernetes-client library does not support. Authentication using X509 Client Certs and oauth tokens
is currently supported.

+ ### Determining the Driver Base URI
+
+ Kubernetes pods run with their own IP address space. If Spark is run in cluster mode, the driver pod may not be
+ accessible to the submitter. However, the submitter needs to send local dependencies from its local disk to the driver
+ pod.
+
+ By default, Spark creates a [Service](https://kubernetes.io/docs/user-guide/services/#type-nodeport) with a NodePort
+ that is opened on every node. The submission client then contacts the driver at one of the nodes'
+ addresses with the appropriate service port.
+
+ There may be cases where the nodes cannot be reached by the submission client. For example, the cluster may
+ only be reachable through an external load balancer. In such cases, the user may provide their own external URI for
+ Spark driver services. To use your own external URI instead of a node's IP and node port, first set
+ `spark.kubernetes.driver.serviceManagerType` to `ExternalAnnotation`. A service will be created with the annotation
+ `spark-job.alpha.apache.org/provideExternalUri`, and this service routes to the driver pod. You will need to run a
+ separate process that watches the API server for services that are created with this annotation in the application's
+ namespace (set by `spark.kubernetes.namespace`). The process should determine a URI that routes to this service
+ (potentially configuring infrastructure to handle the URI behind the scenes), and patch the service to include an
+ annotation `spark-job.alpha.apache.org/resolvedExternalUri`, whose value is the external URI that your process
+ has provided (e.g. `https://example.com:8080/my-job`).
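
As a rough illustration of the decision logic such a watcher process might apply, the following Python sketch works on Service objects represented as plain dictionaries. The helper names `needs_external_uri` and `resolved_uri_patch`, the sample service name, and the annotation value `"true"` are assumptions made for this example; only the two annotation keys come from the text above. Watching the API server and sending the patch are left to whichever Kubernetes client you use and are not shown.

```python
# Annotation keys quoted from the documentation text above.
PROVIDE_ANNOTATION = "spark-job.alpha.apache.org/provideExternalUri"
RESOLVED_ANNOTATION = "spark-job.alpha.apache.org/resolvedExternalUri"


def needs_external_uri(service: dict) -> bool:
    """Return True if the Service requests an external URI and none is set yet."""
    annotations = (service.get("metadata") or {}).get("annotations") or {}
    return PROVIDE_ANNOTATION in annotations and RESOLVED_ANNOTATION not in annotations


def resolved_uri_patch(external_uri: str) -> dict:
    """Build a merge-patch body that adds the resolved-URI annotation."""
    return {"metadata": {"annotations": {RESOLVED_ANNOTATION: external_uri}}}


# Hand-written Service object; the name and the annotation value are
# illustrative assumptions, not values taken from Spark itself.
service = {
    "metadata": {
        "name": "spark-driver-svc",
        "namespace": "default",
        "annotations": {PROVIDE_ANNOTATION: "true"},
    }
}

patch = None
if needs_external_uri(service):
    # A real watcher would send this body to the API server as a patch.
    patch = resolved_uri_patch("https://example.com:8080/my-job")
```

Checking for the absence of the resolved annotation keeps the sketch idempotent: a service that has already been patched is skipped if the watcher sees it again.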
+
+ Note that the URI provided in the annotation needs to route traffic to the appropriate destination on the pod, which
+ has an empty path portion of the URI. This means the external URI provider will likely need to rewrite the path from
+ the external URI to the destination on the pod, e.g. `https://example.com:8080/spark-app-1/submit` will need to route
+ traffic to `https://<pod_ip>:<service_port>/`. Note that the paths of these two URLs are different.
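
The path rewrite described above could look roughly like the following Python sketch. The function name `rewrite_to_pod`, its parameters, and the sample pod address are hypothetical. Forwarding any sub-path below the external prefix is also an assumption: the text only states that the external URI itself maps to the empty path on the pod.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_to_pod(external_url: str, external_prefix: str, pod_host: str, pod_port: int) -> str:
    """Map an external URL onto the pod by stripping the external path prefix."""
    parts = urlsplit(external_url)
    prefix = external_prefix.rstrip("/")
    # Accept only the prefix itself or paths nested below it.
    if parts.path != prefix and not parts.path.startswith(prefix + "/"):
        raise ValueError(f"{parts.path!r} is not under {external_prefix!r}")
    # The pod serves from an empty base path, so whatever follows the prefix
    # becomes the whole path on the pod side.
    pod_path = parts.path[len(prefix):] or "/"
    return urlunsplit((parts.scheme, f"{pod_host}:{pod_port}", pod_path, parts.query, parts.fragment))


# Illustrative values only: 10.0.0.5 and 31000 stand in for <pod_ip> and <service_port>.
pod_url = rewrite_to_pod(
    "https://example.com:8080/spark-app-1/submit",
    "/spark-app-1/submit",
    "10.0.0.5",
    31000,
)
```

In practice this rewrite usually lives in the load balancer or reverse proxy configuration and not in application code; the function only makes the mapping explicit.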
+
+ Keep in mind that this functionality is only necessary if the submitter cannot reach any of the nodes at the driver's
+ node port. It is recommended to use the default configuration with the node port service whenever possible.
+
### Spark Properties

Below are some other common properties that are specific to Kubernetes. Most of the other configurations are the same
@@ -207,7 +237,7 @@ from the other deployment modes. See the [configuration page](configuration.html
<td><code>false</code></td>
<td>
Whether to expose the driver Web UI port as a service NodePort. Turned off by default because NodePort is a limited
- resource. Use alternatives such as Ingress if possible.
+ resource.
</td>
</tr>
<tr>
@@ -225,6 +255,21 @@ from the other deployment modes. See the [configuration page](configuration.html
Interval between reports of the current Spark job status in cluster mode.
</td>
</tr>
+ <tr>
+ <td><code>spark.kubernetes.driver.serviceManagerType</code></td>
+ <td><code>NodePort</code></td>
+ <td>
+ A tag indicating which class to use for creating the Kubernetes service and determining its URI for the submission
+ client. Valid values are currently <code>NodePort</code> and <code>ExternalAnnotation</code>. By default, a service
+ is created with the <code>NodePort</code> type, and the driver will be contacted at one of the nodes at the port
+ that the nodes expose for the service. If the nodes cannot be contacted from the submitter's machine, consider
+ setting this to <code>ExternalAnnotation</code> as described in "Determining the Driver Base URI" above. One may
+ also include a custom implementation of <code>org.apache.spark.deploy.rest.kubernetes.DriverServiceManager</code> on
+ the submitter's classpath; spark-submit service-loads an instance of that class. To use the custom
+ implementation, set this value to the custom implementation's return value of
+ <code>DriverServiceManager#getServiceManagerType()</code>. A custom implementation should only be used as a last
+ resort.
+ </td>
+ </tr>
</table>

## Current Limitations