Commit 279c7bb

RHOAIENG-9499: docs(examples): Dockerfile and suitable README.md for custom image building requirements explanation (#1152)
1 parent 0cf0289 commit 279c7bb

2 files changed

+172
-0
lines changed

examples/README.md

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
1+
# Examples
2+
3+
## JupyterLab with Elyra
4+
5+
This Workbench image installs JupyterLab and the ODH-Elyra extension.
6+
7+
The main difference between the [upstream Elyra](https://github.com/elyra-ai/elyra) and the [ODH-Elyra fork](https://github.com/opendatahub-io/elyra) is that the fork implements Argo Pipelines support, which is required for executing pipelines in OpenDataHub/OpenShift AI.
8+
Specifically, the fork already includes the changes from [elyra-ai/elyra #3273](https://github.com/elyra-ai/elyra/pull/3273), which is still pending upstream.
9+
10+
### Design
11+
12+
The workbench is based on a Source-to-Image (S2I) UBI9 Python 3.11 image.
13+
This means that, besides having Python 3.11 installed, it also has the following characteristics:
14+
* Python virtual environment at `/opt/app-root` is activated by default
15+
* `HOME` directory is set to `/opt/app-root/src`
16+
* port 8888 is `EXPOSE`D by default
17+
18+
These characteristics are required for OpenDataHub workbenches to function.
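The characteristics listed above can be checked from inside a running container. The following is a minimal sketch and not part of this repository: the function name `check_workbench_env` is hypothetical, and it assumes the active virtual environment is visible as Python's `sys.prefix`. The exposed port is image metadata and is not checked here.

```python
import os
import sys


def check_workbench_env(environ, prefix):
    """Return a list of deviations from the layout described above."""
    problems = []
    # The virtual environment at /opt/app-root should be the active one.
    if prefix != "/opt/app-root":
        problems.append(f"unexpected Python prefix: {prefix}")
    # HOME is expected to point at /opt/app-root/src.
    if environ.get("HOME") != "/opt/app-root/src":
        problems.append(f"unexpected HOME: {environ.get('HOME')}")
    return problems


if __name__ == "__main__":
    for problem in check_workbench_env(os.environ, sys.prefix):
        print(problem)
```

An empty result suggests the image keeps the S2I defaults that the rest of this document relies on.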
19+
20+
#### Integration with OpenDataHub Notebook Controller and Notebook Dashboard
21+
22+
##### OpenDataHub Dashboard
23+
24+
Dashboard automatically populates an environment variable named `NOTEBOOK_ARGS` when starting a container from this image.
25+
This variable contains the configuration needed to integrate with Dashboard when launching the Workbench and logging off.
26+
27+
Reference: https://github.com/opendatahub-io/odh-dashboard/blob/95d80a0cccd5053dc0ca372effcdcd8183a0d5b8/frontend/src/api/k8s/notebooks.ts#L143-L149
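As an illustration of how a startup script might consume `NOTEBOOK_ARGS`, here is a minimal Python sketch. The entrypoint in this repository is the shell script `start-notebook.sh`, so this is only an analogy; the function name `build_jupyter_command` is hypothetical, and the sample value in the usage note is an assumed shape, not copied from Dashboard.

```python
import os
import shlex


def build_jupyter_command(environ):
    """Build the JupyterLab command line, appending NOTEBOOK_ARGS if set."""
    # shlex.split honours shell-style quoting, so quoted values stay intact.
    extra_args = shlex.split(environ.get("NOTEBOOK_ARGS", ""))
    return ["jupyter", "lab", *extra_args]


if __name__ == "__main__":
    print(build_jupyter_command(os.environ))
```

For example, a hypothetical value such as `--ServerApp.port=8888 --ServerApp.token=''` would be split into two separate arguments, with the empty quoted token preserved as `--ServerApp.token=`.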
28+
29+
Furthermore, when configuring a workbench, a default Persistent Volume Claim (PVC) is created and its volume is mounted at `/opt/app-root/src` in the workbench container.
30+
This means that changing the user's `HOME` directory from the expected default is inadvisable.
31+
It further means that whatever the original content of `/opt/app-root/src` in the image may be, it will be shadowed by the PVC.
32+
33+
##### OpenDataHub Notebook Controller
34+
35+
During the Notebook Custom Resource (CR) creation, the mutating webhook in Notebook Controller is triggered.
36+
This webhook is responsible for configuring OAuth Proxy, certificate bundles, pipeline runtime, runtime images, and possibly other settings.
37+
It also creates a service and OpenShift route to make the Workbench reachable from the outside of the cluster.
38+
39+
**OAuth Proxy** is configured to connect to port 8888 of the workbench container (discussed above) and listen for incoming connections on port 8443.
40+
41+
Reference: https://github.com/opendatahub-io/kubeflow/blob/eacf63cdaed4db766a6503aa413e388e1d2721ef/components/odh-notebook-controller/controllers/notebook_webhook.go#L114-L121
42+
43+
**Certificate bundles** are added as a file-mounted configmap at `/etc/pki/tls/custom-certs/ca-bundle.crt`.
44+
This is a nonstandard location, so it is necessary to also add environment variables that instruct various software to reference this bundle during operation.
45+
46+
Reference:
47+
* https://github.com/opendatahub-io/kubeflow/blob/eacf63cdaed4db766a6503aa413e388e1d2721ef/components/odh-notebook-controller/controllers/notebook_webhook.go#L598
48+
* https://github.com/opendatahub-io/kubeflow/blob/eacf63cdaed4db766a6503aa413e388e1d2721ef/components/odh-notebook-controller/controllers/notebook_webhook.go#L601-L607
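To illustrate why the environment variables matter, the sketch below shows how Python code in the workbench could pick up the custom bundle. It is an assumption-laden example: `SSL_CERT_FILE` and `REQUESTS_CA_BUNDLE` are conventional variable names, and whether the webhook sets exactly these is not confirmed here. The helper names `resolve_ca_bundle` and `make_ssl_context` are hypothetical.

```python
import os
import ssl

CUSTOM_CA_BUNDLE = "/etc/pki/tls/custom-certs/ca-bundle.crt"


def resolve_ca_bundle(environ, default=CUSTOM_CA_BUNDLE):
    """Return the CA bundle path to use, or None to fall back to system trust."""
    # SSL_CERT_FILE and REQUESTS_CA_BUNDLE are conventional variable names;
    # whether the webhook sets exactly these is an assumption of this sketch.
    for name in ("SSL_CERT_FILE", "REQUESTS_CA_BUNDLE"):
        path = environ.get(name)
        if path and os.path.isfile(path):
            return path
    return default if os.path.isfile(default) else None


def make_ssl_context(environ):
    """Create a TLS context that trusts the resolved bundle when one is found."""
    return ssl.create_default_context(cafile=resolve_ca_bundle(environ))
```

When no bundle is found, `ssl.create_default_context(cafile=None)` loads the system's default trust store instead.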
49+
50+
**Pipeline runtime configuration** is obtained from a Data Science Pipeline Application (DSPA) CR.
51+
The webhook first locates the DSPA CR in the same project where the workbench is being started, then creates a secret with the connection data, and finally mounts this secret into the workbench.
52+
The secret is mounted under `/opt/app-root/runtimes/`.
53+
54+
Reference: https://github.com/opendatahub-io/kubeflow/blob/eacf63cdaed4db766a6503aa413e388e1d2721ef/components/odh-notebook-controller/controllers/notebook_dspa_secret.go#L42C28-L42C50
55+
56+
IMPORTANT: the `setup-elyra.sh` script in this repo relies on this location.
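For debugging, the mounted runtime configuration can be inspected from a notebook. This is a minimal sketch that assumes the secret's keys appear as `*.json` files containing JSON documents; the file naming and the helper name `load_runtime_configs` are assumptions, not taken from the controller code.

```python
import json
from pathlib import Path


def load_runtime_configs(directory="/opt/app-root/runtimes"):
    """Parse every *.json file found in the mounted secret directory."""
    configs = {}
    root = Path(directory)
    if not root.is_dir():
        # The directory may be missing, for example outside a cluster.
        return configs
    for path in sorted(root.glob("*.json")):
        # Key each parsed document by its file name without the suffix.
        configs[path.stem] = json.loads(path.read_text())
    return configs
```

An empty result means nothing readable was found under the directory, which is worth checking first when pipeline submission from Elyra fails.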
57+
58+
**Runtime images** are processed very similarly to the DSPA configuration.
59+
First, image stream resources are examined, and then a configmap is created and mounted to every newly started workbench.
60+
The mount location is under `/opt/app-root/pipeline-runtimes/`.
61+
62+
Reference: https://github.com/opendatahub-io/kubeflow/blob/eacf63cdaed4db766a6503aa413e388e1d2721ef/components/odh-notebook-controller/controllers/notebook_runtime.go#L25C19-L25C51
63+
64+
IMPORTANT: the `setup-elyra.sh` script in this repo again relies on this location.
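One way to make the mounted definitions visible to Elyra is to copy them into its metadata directory at startup. The sketch below is a Python illustration of that idea, not a transcription of `setup-elyra.sh`. The target directory is the one the Dockerfile in this commit clears of default runtime images; the `*.json` naming and the function name `sync_runtime_images` are assumptions.

```python
import shutil
from pathlib import Path


def sync_runtime_images(
    source="/opt/app-root/pipeline-runtimes",
    target="/opt/app-root/share/jupyter/metadata/runtime-images",
):
    """Copy mounted runtime-image definitions into Elyra's metadata directory."""
    src, dst = Path(source), Path(target)
    if not src.is_dir():
        return []
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob("*.json")):
        # shutil.copy follows symlinks, which configmap mounts typically use.
        shutil.copy(path, dst / path.name)
        copied.append(path.name)
    return copied
```

The function returns the names of the files it copied, so a startup script can log what was registered.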
65+
66+
### Build
67+
68+
```shell
69+
podman build -f examples/jupyterlab-with-elyra/Dockerfile -t quay.io/your-username/jupyterlab-with-elyra:latest .
70+
podman push quay.io/your-username/jupyterlab-with-elyra:latest
71+
```
72+
73+
### Deploy
74+
75+
Open the `Settings > Workbench images` page in OpenDataHub Dashboard.
76+
Click on the `Import new image` button and add the image you have just pushed.
77+
The `Image location` field should be set to `quay.io/your-username/jupyterlab-with-elyra:latest`, or wherever the image is pushed and available for the cluster to pull.
78+
Values of other fields do not matter for functionality, but they let you keep better track of previously imported images.
79+
80+
There is a special ODH Dashboard feature that alerts you when you are using a workbench image that lists the `elyra` package instead of `odh-elyra`.
81+
This code will have to be updated when `elyra` also gains support for Argo Pipelines, but for now it does the job.
82+
83+
Reference: https://github.com/opendatahub-io/odh-dashboard/blob/2ced77737a1b1fc24b94acac41245da8b29468a4/frontend/src/concepts/pipelines/elyra/utils.ts#L152-L162
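The same kind of check can be run locally against a requirements list before building an image. This is a rough Python sketch of the idea only: the Dashboard code referenced above is TypeScript and may obtain the package list differently, and the helper names `package_names` and `lists_upstream_elyra` are hypothetical.

```python
import re


def package_names(requirement_lines):
    """Extract normalized package names from requirements-style lines."""
    names = set()
    for line in requirement_lines:
        line = line.strip()
        # Skip blanks, comments, and pip options such as --index-url.
        if not line or line.startswith(("#", "-")):
            continue
        # The name ends at the first version specifier, extra, or marker.
        name = re.split(r"[\s=~<>!\[;]", line, maxsplit=1)[0]
        names.add(name.lower().replace("_", "-"))
    return names


def lists_upstream_elyra(requirement_lines):
    """Return True when `elyra` is listed but `odh-elyra` is not."""
    names = package_names(requirement_lines)
    return "elyra" in names and "odh-elyra" not in names
```

With the requirements used in the Dockerfile of this commit, which pin `odh-elyra`, the check returns `False`.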
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
########
2+
# base #
3+
########
4+
5+
# https://catalog.redhat.com/software/containers/registry/registry.access.redhat.com/repository/ubi9/python-311
6+
FROM registry.access.redhat.com/ubi9/python-311:latest
7+
# Subsequent code may leverage the following definitions from the base image:
8+
# ENV APP_ROOT=/opt/app-root
9+
# ENV HOME="${APP_ROOT}/src"
10+
# ENV PYTHON_VERSION=3.11
11+
# ENV PYTHONUNBUFFERED=1
12+
# ENV PYTHONIOENCODING=UTF-8
13+
# ENV PIP_NO_CACHE_DIR=off
14+
# ENV BASH_ENV="${APP_ROOT}/bin/activate"
15+
# ENV ENV="${APP_ROOT}/bin/activate"
16+
17+
# OS packages need to be installed as root
18+
USER root
19+
20+
# Install useful OS packages
21+
RUN dnf install -y mesa-libGL skopeo && dnf clean all && rm -rf /var/cache/yum
22+
23+
# Other apps and tools shall be installed as default user
24+
# - Kubernetes requires using numeric IDs for the final USER command
25+
# - OpenShift's SCC restricted-v2 policy runs images under a random UID and a GID of 0
26+
USER 1001:0
27+
WORKDIR /opt/app-root
28+
29+
ARG JUPYTER_REUSABLE_UTILS=jupyter/utils
30+
ARG MINIMAL_SOURCE_CODE=jupyter/minimal/ubi9-python-3.11
31+
ARG DATASCIENCE_SOURCE_CODE=jupyter/datascience/ubi9-python-3.11
32+
33+
# Emplace and activate our entrypoint script
34+
COPY ${MINIMAL_SOURCE_CODE}/start-notebook.sh /opt/app-root/bin/
35+
ENTRYPOINT ["/opt/app-root/bin/start-notebook.sh"]
36+
# Copy JupyterLab config from utils directory
37+
COPY ${JUPYTER_REUSABLE_UTILS} /opt/app-root/bin/utils/
38+
# Copy Elyra setup script and various utils to where start-notebook.sh expects them
39+
COPY ${DATASCIENCE_SOURCE_CODE}/setup-elyra.sh ${DATASCIENCE_SOURCE_CODE}/utils /opt/app-root/bin/utils/
40+
41+
# Install Python packages and JupyterLab extensions
42+
# https://www.docker.com/blog/introduction-to-heredocs-in-dockerfiles/
43+
44+
COPY <<EOF requirements.txt
45+
--index-url https://pypi.org/simple
46+
47+
# JupyterLab
48+
jupyterlab==4.2.7
49+
jupyter-bokeh~=4.0.5
50+
jupyter-server~=2.15.0
51+
jupyter-server-proxy~=4.4.0
52+
jupyter-server-terminals~=0.5.3
53+
jupyterlab-git~=0.50.1
54+
jupyterlab-lsp~=5.1.0
55+
jupyterlab-widgets~=3.0.13
56+
jupyter-resource-usage~=1.1.1
57+
nbdime~=4.0.2
58+
nbgitpuller~=1.2.2
59+
60+
# Elyra
61+
odh-elyra==4.2.1
62+
kfp~=2.12.1
63+
64+
# Miscellaneous datascience packages
65+
matplotlib~=3.10.1
66+
numpy~=2.2.3
67+
# ...
68+
EOF
69+
70+
RUN echo "Installing software and packages" && \
71+
pip install -r requirements.txt && \
72+
rm -f ./Pipfile.lock && \
73+
# Prepare directories for elyra runtime configuration
74+
mkdir /opt/app-root/runtimes && \
75+
mkdir /opt/app-root/pipeline-runtimes && \
76+
# Remove default Elyra runtime-images
77+
rm /opt/app-root/share/jupyter/metadata/runtime-images/*.json && \
78+
# Replace the "(ipykernel)" kernel display name in the launcher with Python's major.minor version, e.g. "Python 3.11"
79+
sed -i -e "s/Python.*/$(python --version | cut -d '.' -f-2)\",/" /opt/app-root/share/jupyter/kernels/python3/kernel.json && \
80+
# Copy jupyter configuration
81+
cp /opt/app-root/bin/utils/jupyter_server_config.py /opt/app-root/etc/jupyter && \
82+
# Disable announcement plugin of jupyterlab
83+
jupyter labextension disable "@jupyterlab/apputils-extension:announcements" && \
84+
# Fix permissions to support pip in OpenShift environments
85+
chmod -R g+w /opt/app-root/lib/python3.11/site-packages && \
86+
fix-permissions /opt/app-root -P
87+
88+
# Switch dir to $HOME
89+
WORKDIR /opt/app-root/src
