This tutorial illustrates various scenarios and configuration options when using JupyterHub on Kubernetes.
The custom resources and configuration settings that are discussed here are based on the xref:demos:jupyterhub-keycloak.adoc[JupyterHub-Keycloak demo], so you may find it helpful to have that demo running to reference the various https://github.com/stackabletech/demos/blob/main/stacks/jupyterhub-keycloak[resources] as you read through this tutorial.
The example notebook is used to demonstrate simple read/write interactions with an S3 storage backend using Apache Spark.
<1> Endpoint information read from the ConfigMap
<2> This information is passed to a variable in one of the start-up config scripts
<3> And then used for JupyterHub settings (this is where port `31095` is hard-coded for the proxy service)
NOTE: The node port IP found in the ConfigMap `keycloak-address` can be used for opening the JupyterHub UI.
On Kind this can be any node - not necessarily the one where the proxy Pod is running.
This is due to the way in which Docker networking is used within the cluster.
On other clusters it might be necessary to use the exact node on which the proxy is running.
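
As an illustration, such a ConfigMap might look like this (a hedged sketch: the key name and the address value are placeholders, not copied from the demo):

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: keycloak-address
data:
  # Node IP and node port of the Keycloak service (illustrative values)
  keycloakAddress: 172.18.0.2:31093
----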
=== Discovery
----
====
=== Security
We create a keystore with a self-generated, self-signed certificate and mount it so that the keystore file is available when starting Keycloak:
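
A minimal sketch of how such a mount might be wired up (the Secret name, paths, and Keycloak arguments here are illustrative assumptions, not the demo's exact manifest):

[source,yaml]
----
# Illustrative sketch: mount a keystore Secret into the Keycloak container
containers:
  - name: keycloak
    args:
      - start
      - --https-key-store-file=/opt/keycloak/conf/keystore.p12
    volumeMounts:
      - name: keystore
        mountPath: /opt/keycloak/conf
volumes:
  - name: keystore
    secret:
      secretName: keycloak-keystore
----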
=== Realm
The Keycloak https://github.com/stackabletech/demos/blob/main/stacks/jupyterhub-keycloak/keycloak-realm-config.yaml[realm configuration] for the demo basically contains a set of users and groups, along with a JupyterHub client definition:
[source,yaml]
209
213
----
=== GenericOAuthenticator
This section of the JupyterHub configuration specifies that we are using GenericOAuthenticator for our authentication:
[source,yaml]
----
...
----
<1> We need to either provide a list of users using `allowed_users`, or to explicitly allow _all_ users, as done here.
We will delegate this to Keycloak so that we do not have to maintain users in two places.
<2> Each admin user will have access to an Admin tab on the JupyterHub UI where certain user-management actions can be carried out.
<3> Define the Keycloak scope
<4> Specifies which authenticator class to use
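
Putting these settings together, the relevant part of the Helm values could look roughly like this (a sketch with placeholder values; key names follow the `oauthenticator` conventions used by the JupyterHub Helm chart):

[source,yaml]
----
hub:
  config:
    GenericOAuthenticator:
      allow_all: true          # allow every user that Keycloak authenticates
      admin_users:
        - admin                # grants access to the Admin tab in the UI
      scope:
        - openid               # the Keycloak scope
    JupyterHub:
      authenticator_class: generic-oauth
----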
<1> Specify which certificate(s) should be used internally (the code above uses the default certificate, but it is included for the sake of completeness)
<2> Create the certificate with the same secret class (`tls`) as Keycloak
<3> Mount this certificate
If the default file is not overwritten, but is mounted to a new file in the same directory, then the certificates should be updated by calling e.g. `update-ca-certificates`.
<4> Ensure Python is using the same certificate
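
For instance, pointing Python's `requests` library at the updated system bundle can be done with an environment variable (a sketch; the path assumes the default Debian/Ubuntu CA bundle location):

[source,yaml]
----
hub:
  extraEnv:
    # Make requests/urllib use the CA bundle that now contains our certificate
    REQUESTS_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
----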
[#endpoints]
=== Endpoints
03-set-endpoints: |
  import os
  from oauthenticator.generic import GenericOAuthenticator
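
A sketch of how the complete entry might look (the environment variable, realm name, and URL paths are illustrative assumptions, not the demo's exact script):

[source,yaml]
----
03-set-endpoints: |
  import os
  from oauthenticator.generic import GenericOAuthenticator

  # Illustrative continuation: the address would come from the mounted ConfigMap
  keycloak = os.environ.get("KEYCLOAK_ADDRESS", "keycloak:8443")
  base = f"https://{keycloak}/realms/master/protocol/openid-connect"
  c.GenericOAuthenticator.authorize_url = f"{base}/auth"
  c.GenericOAuthenticator.token_url = f"{base}/token"
  c.GenericOAuthenticator.userdata_url = f"{base}/userinfo"
----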
=== Profiles
The `singleuser.profileList` section of the Helm chart values allows us to define notebook profiles by setting the CPU, memory, and image combinations that can be selected. For instance, the profiles below allow us to select 2/4/etc. CPUs and 4/8/etc. GB RAM, and to select one of two images.
[source,yaml]
----
NOTE: The example notebook in the demo will start a distributed Spark cluster, whereby the notebook acts as the driver which spawns a number of executors.
The driver uses the user-specific <<driver, driver service>> to pass job dependencies to each executor.
The Spark versions of these dependencies must be the same on both the driver and executor, or else serialization errors can occur.
For Java or Scala classes that do not have a specified `serialVersionUID`, one will be calculated at runtime based on the contents of each class (method signatures etc.): if the contents of these class files have been changed, then the UID may differ between driver and executor.
To avoid this, take care to use images for the notebook and the Spark job that share a common Spark build.