server_listener.py not available to spark job when launching yarn_spark_python kernel #145

@dborowitz

Description

When launching a Python kernel installed with jupyter yarn-spec install, the launch_ipykernel.py script fails on import because it can't find server_listener.py.

Looking at my YARN logs, I see that only launch_ipykernel.py got shipped to workers by spark-submit; server_listener.py, which it imports, was not.

Passing --extra-spark-opts="--py-files ${kernel_dir}/scripts/server_listener.py" to jupyter yarn-spec install solves the problem. But if this is required, it should probably be baked into run.sh.

yarn_spark_python/bin/run.sh has:

eval exec \
     "${SPARK_HOME}/bin/spark-submit" \
     "${SPARK_OPTS}" \
     "${IMPERSONATION_OPTS}" \
     "${PROG_HOME}/scripts/launch_ipykernel.py" \
     "${LAUNCH_OPTS}" \
     "$@"

Here is some relevant Spark documentation describing the use of --py-files to submit bundled dependencies.
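As a rough sketch of what "baking it in" could look like, the helper below appends a --py-files option for server_listener.py to the Spark options unless the caller already passed one. The function name add_listener_py_files is hypothetical and not part of the shipped run.sh, and the assumption that server_listener.py lives under ${PROG_HOME}/scripts is taken from the workaround above, not from the project's source.

```shell
#!/usr/bin/env bash
# Sketch only: add_listener_py_files is a hypothetical helper, not part of
# the shipped run.sh. It assumes server_listener.py sits next to
# launch_ipykernel.py under <prog_home>/scripts.
add_listener_py_files() {
    local spark_opts="$1" prog_home="$2"
    case "${spark_opts}" in
        # Respect a --py-files option the user already supplied.
        *--py-files*)
            printf '%s' "${spark_opts}" ;;
        # Otherwise append one that ships server_listener.py with the job.
        *)
            printf '%s --py-files %s/scripts/server_listener.py' \
                "${spark_opts}" "${prog_home}" ;;
    esac
}

# Possible use in run.sh, before the spark-submit line:
# SPARK_OPTS="$(add_listener_py_files "${SPARK_OPTS}" "${PROG_HOME}")"
```

Leaving an existing --py-files untouched keeps the current --extra-spark-opts workaround working, at the cost of not merging the two lists; a real fix might prefer to combine them into one comma-separated value.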

Reproduce

I don't know how to reproduce this issue on a vanilla Jupyter install to verify whether it's specific to my installation.

I'm running on a Google Cloud Dataproc cluster, and my install command looks roughly like:

/opt/conda/default/bin/jupyter yarn-spec install \
    --prefix="/opt/conda/default" \
    --extra-spark-opts="--py-files /opt/conda/default/share/jupyter/kernels/scripts/server_listener.py"

I have a few additional kernel.json customizations that I can share if necessary. One that might be relevant, though I doubt it: I deleted PYTHONPATH from env, since we're not using Python 3.7 and the included path does nothing.

Expected behavior

The kernel startup job run via spark-submit can run launch_ipykernel.py, including its import of server_listener.py.

Context

  • Operating System and version: Debian 12
  • Browser and version: n/a
  • Jupyter Server version: 1.24.0
  • Python version: 3.11

Metadata

  • Labels: bug (Something isn't working)