20 commits
8 changes: 8 additions & 0 deletions .github/workflows/test.yml
@@ -53,6 +53,9 @@ jobs:
       matrix:
         ubuntu_version: ["24.04"]
         python_version: ["3.13"]
+        container_cli:
+          - docker
+          - podman
Review comment (Member):
Suggested change
- podman

This doubles the number of tests in the matrix, which means PR runs will take even longer, and the chance of a failure due to flakiness doubles.

I don't think we need to run all tests with Podman, just a subset, which we can do by adding them to the `include:` section below.
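
To make the cost argument concrete, here is a small Python sketch of how a matrix axis multiplies the job count. Only the `base` and `conda` repo types are visible in this hunk, so the absolute numbers are illustrative. Counting each hand-picked Podman entry as exactly one extra job is a simplifying assumption, not a description of how GitHub Actions resolves `include:`.

```python
from itertools import product

# Axes as they appear in the hunk. The real workflow lists more repo types;
# only the two visible ones are used here.
axes = {
    "ubuntu_version": ["24.04"],
    "python_version": ["3.13"],
    "repo_type": ["base", "conda"],
}


def job_count(matrix):
    """Number of jobs produced by the full cross product of the axes."""
    return len(list(product(*matrix.values())))


before = job_count(axes)

# Option 1: add container_cli as a full axis, as this PR does.
as_axis = job_count({**axes, "container_cli": ["docker", "podman"]})

# Option 2 (assumption): keep the axes unchanged and hand-pick a subset of
# Podman jobs, counting each hand-picked entry as one extra job.
podman_subset = [{"repo_type": "base", "container_cli": "podman"}]
as_subset = before + len(podman_subset)

print(before, as_axis, as_subset)  # 2 4 3
```

Under these assumptions the full axis doubles the job count, while a hand-picked subset grows it only by the number of entries chosen.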

         repo_type:
           - base
           - conda
@@ -104,6 +107,11 @@ jobs:

       - name: Run pytest
         run: |
+          if [ "${{ matrix.container_cli }}" = "podman" ]
+          then
+            systemctl --user start podman.socket
+            export DOCKER_HOST=unix://$(podman info --format '{{.Host.RemoteSocket.Path}}')
+          fi
           pytest --verbose --color=yes --durations=10 --cov=repo2docker tests/${{ matrix.repo_type }}

       - uses: codecov/codecov-action@v5
16 changes: 10 additions & 6 deletions docs/source/index.md
@@ -1,15 +1,19 @@
# Welcome to `repo2docker`'s documentation

```{important}
Despite the name, `repo2docker` can be used by container technology other than [Docker](https://docs.docker.com/engine/), for example [Podman](https://podman.io/).
```

`repo2docker` lets you **reproducibly build and run user environment container images for interactive computing and data workflows from source code repositories**. Optionally, the container image can be pushed to a Docker registry.

Also, `repo2docker` is the tool used to build container images for [JupyterHub](https://jupyterhub.readthedocs.io/en/stable/) and the tool used by [BinderHub](https://binderhub.readthedocs.io) to build images on demand.

::::{grid}
:::{grid-item-card} 🔧 Build reproducible data science environments from repositories
Build a reproducible data science environment as a Docker image and execute code interactively. Use many [configuration files](#config-files) to control language, tools, and setup instructions.
Build a reproducible data science environment as a container image and execute code interactively. Use many [configuration files](#config-files) to control language, tools, and setup instructions.
:::
:::{grid-item-card} 🚀 Deploy environments in JupyterHub or Binder
Push environment images to a Docker registry for re-use in data science environment services like [JupyterHub](https://jupyterhub.readthedocs.io) or [a Binder instance](https://mybinder.org), or for other communities to build upon your base environment.
Push environment images to a container registry for re-use in data science environment services like [JupyterHub](https://jupyterhub.readthedocs.io) or [a Binder instance](https://mybinder.org), or for other communities to build upon your base environment.
:::
:::{grid-item-card} ☁️ Host repositories in many providers
Host repositories in: a Git server like [GitHub](https://github.com/) or [GitLab](https://gitlab.com/), an open science repository like [Zenodo](https://zenodo.org) or [Figshare](https://figshare.com), a hosted data platform like a [Dataverse installation](https://dataverse.org/), an archive like the
@@ -19,7 +23,7 @@ Host repositories in: a Git server like [GitHub](https://github.com/) or [GitLab

## What is a user environment container image and why would I build one with `repo2docker`?

A **user environment container image** contains the entire software environment that a user may access from an interactive data science session. For example, it might contain many **programming languages**, **software for data analysis**, or even **content files and datasets** available to anybody that accesses that environment. Container images are built with [Docker](https://www.docker.com/), a standard open source tool for defining, building, and deploying images.
A **user environment container image** contains the entire software environment that a user may access from an interactive data science session. For example, it might contain many **programming languages**, **software for data analysis**, or even **content files and datasets** available to anybody that accesses that environment. Container images are built in accordance with the specifications published by the [Open Container Initiative](https://opencontainers.org/).

Many data science platforms and services like [JupyterHub](https://jupyterhub.readthedocs.io) and [Binder](https://mybinder.org) launch interactive data science sessions **with a user environment container image attached**, meaning that the user gains access to whatever is in the container image. In short, this allows somebody to define and build the user image one time, in a way that users can reproducibly re-use many times.

@@ -43,9 +47,9 @@ repo2docker <source-repository>
It performs these steps:

1. Inspects the repository for [configuration files](#config-files). These will be used to build the environment needed to run the repository.
2. Builds a Docker image with an environment specified in these [configuration files](#config-files).
2. Builds a container image with an environment specified in these [configuration files](#config-files).
3. Runs the image to let you explore the repository interactively via Jupyter notebooks, RStudio, or many other interfaces (this is optional).
4. Pushes the images to a Docker registry so that it may be accessed remotely (this is optional).
4. Pushes the images to a container registry so that it may be accessed remotely (this is optional).

[swhid]: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html

@@ -55,7 +59,7 @@ Please report [bugs](https://github.com/jupyterhub/repo2docker/issues),

## Get started with `repo2docker`

This tutorial walks you through setting up `repo2docker`, building your first environment image, and running it locally with Docker.
This tutorial walks you through setting up `repo2docker`, building your first environment image, and running it locally with a container engine.

```{toctree}
:maxdepth: 2
71 changes: 48 additions & 23 deletions docs/source/start.md
@@ -2,47 +2,72 @@

This tutorial guides you through installing `repo2docker` and building your first environment image.

(install)=
## Prerequisite

## Install `repo2docker`
### Python

`repo2docker` requires Python 3.6 or above on Linux and macOS.
`repo2docker` requires Python 3.6 or above.

:::{admonition} Windows support is experimental
### Container Engine

This [article about using Windows and the WSL](https://nickjanetakis.com/blog/setting-up-docker-for-windows-and-wsl-to-work-flawlessly) (Windows Subsystem for Linux or
Bash on Windows) provides additional information about Windows and Docker.
:::
`repo2docker` requires a container engine compatible with the specification published by the [Open Container Initiative](https://opencontainers.org/).

### Prerequisite: Install Docker
#### Docker

Install [Docker](https://www.docker.com), as it is required to build Docker images.
The [Community Edition](https://docs.docker.com/install/) is available for free.
```{important}
Only the [Docker Engine](https://docs.docker.com/engine/) is an open source. [Docker Desktop](https://docs.docker.com/get-started/get-docker/) requires a license.
Review comment (Member):
Suggested change
Only the [Docker Engine](https://docs.docker.com/engine/) is an open source. [Docker Desktop](https://docs.docker.com/get-started/get-docker/) requires a license.
Only the [Docker Engine](https://docs.docker.com/engine/) is open source. [Docker Desktop](https://docs.docker.com/get-started/get-docker/) requires a license.

```

Recent versions of Docker are recommended.
Follow [Docker's official installation steps](https://docs.docker.com/get-started/get-docker/).

### Install `repo2docker` with `pip`
#### Podman

```{warning}
The name of the package on [PyPI](https://pypi.org/) is [`jupyter-repo2docker`](https://pypi.org/project/jupyter-repo2docker/) instead of `repo2docker`.
```
Follow [Podman's official installation steps](https://podman.io/docs/installation).

We recommend installing `repo2docker` with the `pip` tool:
After complete the installation of Podman,
Review comment (Member):
Suggested change
After complete the installation of Podman,
After completing the installation of Podman,


```
1. creates a [listening service for Podman](https://docs.podman.io/en/latest/markdown/podman-system-service.1.html) by running
Review comment (Member):
Suggested change
1. creates a [listening service for Podman](https://docs.podman.io/en/latest/markdown/podman-system-service.1.html) by running
1. create a [listening service for Podman](https://docs.podman.io/en/latest/markdown/podman-system-service.1.html) by running


```bash
systemctl --user start podman.socket
```

1. configure the `DOCKER_HOST` environment variable following [Podman's official procedure](https://podman-desktop.io/docs/migrating-from-docker/using-the-docker_host-environment-variable#procedure). You might want to configure the `DOCKER_HOST` environment variable to persist in your `~/.bashrc`.

(install)=

## Install `repo2docker`

### Install `repo2docker` with `pip`

It is recommended to install `repo2docker` with the `pip` tool:

```bash
python3 -m pip install jupyter-repo2docker
```

(usage)=

## Build a repository with `repo2docker`

Now that you've installed Docker and `repo2docker`, we can build a repository.
To do so, follow these steps.
Now that you've installed a container engine and `repo2docker`, you can build a repository.
To do so, continue following this guide.

### Start the container engine

### Start Docker
Ensure that the container engine is running.

Follow the [instructions for starting Docker](https://docs.docker.com/engine/daemon/start/) to start a Docker process.
#### Docker

Follow the [official instructions for starting Docker](https://docs.docker.com/engine/daemon/start/).

#### Podman

Run

```bash
podman info
```

### Build an image from a URL

@@ -55,12 +80,12 @@ jupyter-repo2docker https://github.com/binder-examples/requirements
You'll see `repo2docker` take the following actions:

1. Inspect the repository for [configuration files](#config-files). It will detect the `requirements.txt` file in the repository.
2. Build a Docker image using the configuration files. In this case, the `requirements.txt` file will correspond to a Python environment.
2. Build a container image using the configuration files. In this case, the `requirements.txt` file will correspond to a Python environment.
3. Run the image to let you explore the repository interactively.

Click the link provided and you'll be taken to an interactive Jupyter Notebook interface where you can run commands interactively inside the environment.

## Learn more

This is a simple example building an environment image for your repository.
To learn more about the kinds of source repositories, environments, and use-cases that repo2docker supports, see [the `repo2docker` user guide](./use/index.md).
To learn more about the kinds of source repositories, environments, and use-cases that `repo2docker` supports, see [the `repo2docker` user guide](./use/index.md).
8 changes: 6 additions & 2 deletions repo2docker/app.py
@@ -589,7 +589,11 @@ def start_container(self):

         docker_host = os.environ.get("DOCKER_HOST")
         if docker_host:
-            host_name = urlparse(docker_host).hostname
+            docker_host_parsed = urlparse(docker_host)
+            if docker_host_parsed.scheme == "unix":
+                host_name = "127.0.0.1"
+            else:
+                host_name = docker_host_parsed.hostname
         else:
             host_name = "127.0.0.1"
         self.hostname = host_name
@@ -621,7 +625,7 @@ def start_container(self):
                 "notebook",
                 "--ip=0.0.0.0",
                 f"--port={container_port}",
-                f"--ServerApp.custom_display_url=http://{host_name}:{host_port}",
+                f"--ServerApp.custom_display_url=http://{self.hostname}:{self.port}",
                 "--ServerApp.default_url=/lab",
             ]
         else:
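
The `unix` special case in `start_container` matters because a socket URL has no network location for `urlparse` to report. A minimal standalone sketch of that host selection follows; `display_host` is a hypothetical helper, not a function in `repo2docker`, and the socket path is a made-up example.

```python
from urllib.parse import urlparse


def display_host(docker_host):
    """Pick the host name to show in the notebook URL (hypothetical helper)."""
    if not docker_host:
        return "127.0.0.1"
    parsed = urlparse(docker_host)
    if parsed.scheme == "unix":
        # A unix:// socket URL has an empty network location, so
        # parsed.hostname is None; a local socket implies localhost.
        return "127.0.0.1"
    return parsed.hostname


# The socket path and TCP address below are made-up examples.
print(urlparse("unix:///run/user/1000/podman/podman.sock").hostname)  # None
print(display_host("unix:///run/user/1000/podman/podman.sock"))  # 127.0.0.1
print(display_host("tcp://192.168.1.10:2375"))  # 192.168.1.10
print(display_host(None))  # 127.0.0.1
```

Without the special case, the display URL would be built from `None` whenever `DOCKER_HOST` points at a local socket, which is the setup the Podman instructions in this PR produce.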
61 changes: 47 additions & 14 deletions repo2docker/docker.py
@@ -13,13 +13,19 @@
 from pathlib import Path

 from iso8601 import parse_date
-from traitlets import Dict, List, Unicode
+from traitlets import Dict, List, Unicode, default

 import docker

 from .engine import Container, ContainerEngine, Image
 from .utils import execute_cmd

+DOCKER_HOST = os.getenv("DOCKER_HOST")
+if DOCKER_HOST is not None and DOCKER_HOST.find("podman") != -1:
+    DOCKER_CLI = "podman"
+else:
+    DOCKER_CLI = "docker"
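
The module-level selection above is a substring heuristic on `DOCKER_HOST`. Pulled into a function it can be exercised without touching the environment. This is a sketch: `detect_cli` is a hypothetical name and the socket paths are made-up examples.

```python
def detect_cli(docker_host):
    """Return "podman" if the DOCKER_HOST value mentions podman, else "docker".

    Hypothetical helper mirroring the module-level check in the hunk.
    """
    if docker_host is not None and "podman" in docker_host:
        return "podman"
    return "docker"


print(detect_cli(None))  # docker
print(detect_cli("unix:///var/run/docker.sock"))  # docker
print(detect_cli("unix:///run/user/1000/podman/podman.sock"))  # podman
# A Podman socket exposed under a path that does not contain "podman"
# is not recognised by this heuristic.
print(detect_cli("unix:///tmp/custom.sock"))  # docker
```

The last call shows the trade-off of this design: the check needs no extra configuration, but it depends on the socket path containing the word `podman`.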


 class DockerContainer(Container):
     def __init__(self, container):
@@ -66,6 +72,34 @@ class DockerEngine(ContainerEngine):

     string_output = True

+    _container_cli = None
+
+    @property
+    def container_cli(self):
+        if self._container_cli is not None:
+            return self._container_cli
+
+        cli = DOCKER_CLI
+
+        docker_version = subprocess.run([cli, "version"], stdout=subprocess.DEVNULL)
+        if docker_version.returncode:
+            raise RuntimeError(f"The {cli} commandline client must be installed")
+
+        # docker buildx is based in a plugin that might not be installed
+        # https://github.com/docker/buildx
+        #
+        # podman buildx command is an alias of podman build.
+        # Not all buildx build features are available in Podman.
+        docker_buildx_version = subprocess.run(
+            [cli, "buildx", "version"], stdout=subprocess.DEVNULL
+        )
+        if docker_buildx_version.returncode:
+            raise RuntimeError("The docker buildx plugin must be installed")
+
+        self._container_cli = cli
+
+        return self._container_cli
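
The property above probes the CLI once and caches the result. The sketch below shows the same check-once pattern with an injectable runner so it can be tried without a container CLI installed. `CliProbe`, `fake_run` and `missing_run` are hypothetical names, not part of `repo2docker`. One deliberate difference from the hunk: `subprocess.run` raises `FileNotFoundError` when the executable does not exist, so the sketch maps that case onto the same `RuntimeError`.

```python
import subprocess


class CliProbe:
    """Hypothetical stand-in for the engine class; not repo2docker's API."""

    def __init__(self, cli, run=subprocess.run):
        self._cli = cli
        self._run = run  # injectable so the sketch runs without a real CLI
        self._checked = None

    @property
    def container_cli(self):
        # Return the cached answer if the probes already succeeded.
        if self._checked is not None:
            return self._checked
        checks = (
            ([self._cli, "version"], f"The {self._cli} commandline client must be installed"),
            ([self._cli, "buildx", "version"], f"{self._cli} buildx support must be available"),
        )
        for args, message in checks:
            try:
                failed = self._run(args, stdout=subprocess.DEVNULL).returncode != 0
            except FileNotFoundError:
                # Raised by subprocess.run when the executable is missing.
                failed = True
            if failed:
                raise RuntimeError(message)
        self._checked = self._cli
        return self._checked


calls = []


def fake_run(args, **kwargs):
    """Pretend every command succeeds and record that it was called."""
    calls.append(args)
    return subprocess.CompletedProcess(args, returncode=0)


def missing_run(args, **kwargs):
    """Pretend the executable is not installed."""
    raise FileNotFoundError(args[0])


probe = CliProbe("podman", run=fake_run)
probe.container_cli
probe.container_cli
print(len(calls))  # 2
```

Two property reads trigger only two subprocess calls (one `version`, one `buildx version`), because the second read is served from the cache.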

     extra_init_args = Dict(
         {},
         help="""
@@ -105,16 +139,7 @@ def build(
         platform=None,
         **kwargs,
     ):
-        if not shutil.which("docker"):
-            raise RuntimeError("The docker commandline client must be installed")
-
-        # docker buildx is based in a plugin that might not be installed
-        # https://github.com/docker/buildx
-        docker_buildx_version = subprocess.run(["docker", "buildx", "version"])
-        if docker_buildx_version.returncode:
-            raise RuntimeError("The docker buildx plugin must be installed")
-
-        args = ["docker", "buildx", "build", "--progress", "plain"]
+        args = [self.container_cli, "buildx", "build", "--progress", "plain"]
         if load:
             if push:
                 raise ValueError(
@@ -171,14 +196,22 @@ def inspect_image(self, image):
         Return image configuration if it exists, otherwise None
         """
         proc = subprocess.run(
-            ["docker", "image", "inspect", image], capture_output=True
+            [self.container_cli, "image", "inspect", image], capture_output=True
         )

         if proc.returncode != 0:
             return None

         config = json.loads(proc.stdout.decode())[0]
-        return Image(tags=config["RepoTags"], config=config["Config"])
+        tags = config["RepoTags"]
+        oci_image_configuration = config["Config"]
+
+        # WorkingDir is optional but docker always include it.
+        # https://github.com/containers/podman/discussions/27313
+        if "WorkingDir" not in oci_image_configuration:
+            oci_image_configuration["WorkingDir"] = ""
+
+        return Image(tags=tags, config=oci_image_configuration)
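
The `WorkingDir` handling can be tried in isolation on hand-written JSON. In this sketch `image_config` is a hypothetical helper and both inspect payloads are made up; only the `RepoTags`, `Config` and `WorkingDir` keys come from the hunk.

```python
import json


def image_config(inspect_stdout):
    """Extract tags and config from `image inspect` JSON (hypothetical helper)."""
    entry = json.loads(inspect_stdout)[0]
    config = entry["Config"]
    # Fill in the optional field so callers can index it unconditionally.
    config.setdefault("WorkingDir", "")
    return entry["RepoTags"], config


# Made-up inspect output, with and without WorkingDir.
with_dir = '[{"RepoTags": ["demo:latest"], "Config": {"WorkingDir": "/home/jovyan"}}]'
without_dir = '[{"RepoTags": ["demo:latest"], "Config": {"Cmd": ["sh"]}}]'

print(image_config(with_dir)[1]["WorkingDir"])  # /home/jovyan
print(repr(image_config(without_dir)[1]["WorkingDir"]))  # ''
```

`dict.setdefault` has the same effect as the explicit `if "WorkingDir" not in ...` check in the hunk: an existing value is left untouched and a missing one becomes the empty string.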

     @contextmanager
     def docker_login(self, username, password, registry):
@@ -200,7 +233,7 @@ def docker_login(self, username, password, registry):
             try:
                 subprocess.run(
                     [
-                        "docker",
+                        self.container_cli,
                         "login",
                         "--username",
                         username,
3 changes: 3 additions & 0 deletions tests/norun/test_registry.py
@@ -14,6 +14,7 @@
 import requests

 from repo2docker.__main__ import make_r2d
+from repo2docker.docker import DOCKER_CLI
 from repo2docker.utils import get_free_port

 HERE = Path(__file__).parent
@@ -140,6 +141,7 @@ def registry(host_ip):
         proc.wait()


+@pytest.mark.skipif(DOCKER_CLI == "podman", reason="Test specific for Docker")
 def test_registry_explicit_creds(registry, dind):
     """
     Test that we can push to registry when given explicit credentials
@@ -202,6 +204,7 @@ def test_registry_explicit_creds(registry, dind):
         os.environ.update(old_environ)


+@pytest.mark.skipif(DOCKER_CLI == "podman", reason="Test specific for Docker")
 def test_registry_no_explicit_creds(registry, dind):
     """
     Test that we can push to registry *without* explicit credentials but reading from a DOCKER_CONFIG
7 changes: 6 additions & 1 deletion tests/unit/test_app.py
@@ -1,3 +1,4 @@
+import os
 from tempfile import TemporaryDirectory
 from unittest.mock import patch

@@ -57,7 +58,11 @@ def test_extra_buildx_build_args(repo_with_content):

     args, kwargs = execute_cmd.call_args
     cmd = args[0]
-    assert cmd[:3] == ["docker", "buildx", "build"]
+    docker_host = os.environ.get("DOCKER_HOST")
+    if docker_host and docker_host.find("podman") != -1:
+        assert cmd[:3] == ["podman", "buildx", "build"]
+    else:
+        assert cmd[:3] == ["docker", "buildx", "build"]
     # make sure it's inserted before the end
     assert "--check" in cmd[:-1]

9 changes: 9 additions & 0 deletions tests/unit/test_editable.py
@@ -3,11 +3,15 @@
 import tempfile
 import time

+import pytest
+
 from repo2docker.__main__ import make_r2d
+from repo2docker.docker import DOCKER_CLI

 DIR = os.path.join(os.path.dirname(os.path.dirname(__file__)), "dockerfile", "editable")


+@pytest.mark.skipif(DOCKER_CLI == "podman", reason="Podman does NOT support bind mount")
 def test_editable(run_repo2docker):
     """Run a local repository in edit mode. Verify a new file has been
     created afterwards"""
@@ -28,6 +32,7 @@ def test_editable(run_repo2docker):
         os.remove(newfile)


+@pytest.mark.skipif(DOCKER_CLI == "podman", reason="Podman does NOT support bind mount")
 def test_editable_by_host():
     """Test whether a new file created by the host environment, is
     detected in the container"""
@@ -38,7 +43,11 @@ def test_editable_by_host():
     container = app.start_container()

     # give the container a chance to start
+    waiting_container_counter = 0
     while container.status != "running":
+        if waiting_container_counter >= 60:
+            assert container.status == "running"
+        waiting_container_counter = waiting_container_counter + 1
         time.sleep(1)

     try:
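
The counter added to the wait loop turns an unbounded poll into one that fails after roughly 60 attempts. The same idea can be written as a reusable helper, sketched below. `wait_until` is a hypothetical name, and its `sleep` parameter exists only so the example runs instantly.

```python
import time


def wait_until(predicate, attempts=60, interval=1.0, sleep=time.sleep):
    """Poll predicate until it returns True or the attempts are used up.

    Returns the final result of predicate, so callers can assert on it.
    """
    for _ in range(attempts):
        if predicate():
            return True
        sleep(interval)
    return predicate()


# Simulated resource that becomes ready on the third poll.
state = {"polls": 0}


def ready():
    state["polls"] += 1
    return state["polls"] >= 3


print(wait_until(ready, attempts=5, sleep=lambda seconds: None))  # True
print(wait_until(lambda: False, attempts=2, sleep=lambda seconds: None))  # False
```

In the test this would read roughly as `assert wait_until(lambda: container.status == "running")`, which keeps the timeout and the assertion in one place.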