Description
When launching a Python kernel installed with jupyter yarn-spec install, the launch_ipykernel.py script fails on import because it can't find server_listener.py.
Looking at my YARN logs, I see that only launch_ipykernel.py got shipped to workers by spark-submit; server_listener.py was left behind.
Passing `--extra-spark-opts="--py-files ${kernel_dir}/scripts/server_listener.py"` to `jupyter yarn-spec install` solves the problem. But if this option is always required, it should probably be baked into run.sh.
yarn_spark_python/bin/run.sh has:
```shell
eval exec \
     "${SPARK_HOME}/bin/spark-submit" \
     "${SPARK_OPTS}" \
     "${IMPERSONATION_OPTS}" \
     "${PROG_HOME}/scripts/launch_ipykernel.py" \
     "${LAUNCH_OPTS}" \
     "$@"
```
Here is some relevant Spark documentation describing the use of --py-files to submit bundled dependencies.
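A minimal sketch of what the baked-in fix could look like in run.sh, assuming the variable names from the snippet above; `PY_FILES_OPTS` is a hypothetical name I've introduced here, not something that exists in the project today:

```shell
# Hypothetical change to yarn_spark_python/bin/run.sh: ship server_listener.py
# to the YARN workers alongside the launcher so its import resolves.
PY_FILES_OPTS="--py-files ${PROG_HOME}/scripts/server_listener.py"

eval exec \
     "${SPARK_HOME}/bin/spark-submit" \
     "${SPARK_OPTS}" \
     "${PY_FILES_OPTS}" \
     "${IMPERSONATION_OPTS}" \
     "${PROG_HOME}/scripts/launch_ipykernel.py" \
     "${LAUNCH_OPTS}" \
     "$@"
```

User-supplied `--py-files` entries from `--extra-spark-opts` would still be honored, since Spark applies later occurrences of the option.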
Reproduce
I don't know how to reproduce this issue on a vanilla Jupyter install to verify whether it's specific to my installation.
I'm running on a Google Cloud Dataproc cluster, and my install command looks roughly like:
```shell
/opt/conda/default/bin/jupyter yarn-spec install \
    --prefix="/opt/conda/default" \
    --extra-spark-opts="--py-files /opt/conda/default/share/jupyter/kernels/scripts/server_listener.py"
```
I have a few additional kernel.json customizations that I can share if necessary. One that might be relevant (though I doubt it) is that I deleted PYTHONPATH from env: we're not using Python 3.7, so the included path does nothing.
Expected behavior
The kernel startup job run via spark-submit is able to execute launch_ipykernel.py, with its server_listener import resolving successfully.
Context
- Operating System and version: Debian 12
- Browser and version: n/a
- Jupyter Server version: 1.24.0
- Python version: 3.11