
Feat: implemented --wait* for podman-compose up and start #1356

Open

mammo0 wants to merge 6 commits into containers:main from mammo0:feat_up_start--wait

Conversation

mammo0 (Contributor) commented Dec 18, 2025

Hello,

this PR addresses issues #710 and #1329.

The arguments --wait and --wait-timeout are added to the podman-compose up and start commands. After starting the containers, it waits until all services in the compose file are either running or healthy (if they have a HEALTHCHECK).
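A rough sketch of what the waiting could look like under the hood (hypothetical helper names, not the PR's actual code; it assumes `podman wait --condition` may be passed more than once, which recent Podman versions accept):

```python
import subprocess

def build_wait_cmd(container_names, conditions=("running", "healthy")):
    # assemble the 'podman wait' invocation; assumes --condition may be
    # repeated (an assumption about the CLI, valid on recent Podman)
    cmd = ["podman", "wait"]
    for cond in conditions:
        cmd += ["--condition", cond]
    return cmd + list(container_names)

def wait_for_containers(container_names, timeout=None):
    # 'timeout' corresponds to the new --wait-timeout argument
    subprocess.run(build_wait_cmd(container_names), check=True, timeout=timeout)
```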

A note on the test cases: they are currently skipped by GitHub Actions. This is due to the Docker image used, which is currently the Debian bookworm version:

image: "docker.io/library/python:${{ matrix.python-version }}-bookworm"

And bookworm ships Podman version 4.3.1 (https://packages.debian.org/bookworm/podman). This version does not support checking for the running or healthy state in podman wait.

p12tic (Collaborator) left a comment:

Looks good, thanks. Current main branch now supports testing on newer podman versions, please rebase. Also the PR needs a release note.

@mammo0 mammo0 force-pushed the feat_up_start--wait branch 2 times, most recently from f7c8059 to a764bf1 on March 2, 2026 15:40
mammo0 (Contributor, Author) commented Mar 2, 2026

Hello @p12tic

I made the changes according to your review comments.

But I don't know why the GH Actions test cases are failing. Basically, the podman commands with a --wait argument do not terminate. I therefore added a 30-second timeout so the actions are not blocked for hours.

On my local machine all 4 test cases work as they should, and I can't reproduce the behaviour of the GH Actions runners. Therefore I can't really debug the problem.

Maybe you have an idea?

mammo0 (Contributor, Author) commented Mar 3, 2026

I managed to reproduce the error with act using a rootful Docker environment (with rootless Docker and Podman the tests do not even start).

But I don't know why the subprocess (podman run --wait) does not finish in the test cases.

p12tic (Collaborator) commented Mar 3, 2026

Code looks alright. I will need to debug what's happening with the hangs at some point.

By the way, you can launch GitHub Actions on your own fork at https://github.com/mammo0/podman-compose/actions . That should give you access to the same functionality that I can access.

@mammo0
Copy link
Contributor Author

mammo0 commented Mar 4, 2026

I further tracked down the issue and I think it's a problem with Podman, not compose:

I created the following image with Python 3.9 and Podman 4.9.5:

FROM docker.io/library/python:3.9-bookworm

ARG BASE_URL="https://github.com/p12tic/podman-compose-test-data/raw/refs/heads/master/deb_files"
ARG PODMAN_DEB="${BASE_URL}/deb-files-podman-4.9.5/podman_4.9.5+compose1-1_amd64.deb"
ARG CRUN_DEB="${BASE_URL}/deb-files-crun-1-21/crun_1.21-1_amd64.deb"

RUN \
    DEBIAN_FRONTEND=noninteractive apt update -y && apt install -y buildah && \
    # download the pinned Podman and crun packages
    echo "Downloading $PODMAN_DEB ..." && \
    curl -fsSL -o "/tmp/podman.deb" "$PODMAN_DEB" && \
    echo "Downloading $CRUN_DEB ..." && \
    curl -fsSL -o "/tmp/crun.deb" "$CRUN_DEB" && \
    # install them via apt so dependencies are resolved
    echo "Installing packages ..." && \
    apt-get update -qq && \
    apt-get install -y -qq /tmp/*.deb

Then I opened a container shell like it's done in the GH Actions:

docker run -it --entrypoint bash -v $(pwd):/data --privileged --cgroupns=host <above_image>

And I built the test image:

podman build -t podman-compose-wait-test tests/integration/wait/

I did this with Podman version 4.9.5 and on my host machine running version 5.8.0. On both I started a container with the parameters from the test compose file (tests/integration/wait/docker-compose.yml):

podman run -d --rm --healthcheck-command '["CMD-SHELL", "test -f /healthy"]' --health-interval=1s --health-retries=5 podman-compose-wait-test

On my host machine the container status switched to healthy after 10 seconds (as expected). But with Podman 4.9.5 the status does not change and stays at starting, even after a couple of minutes. I need to run

podman healthcheck run <container>

which performs the healthcheck; after that the container status is healthy on 4.9.5 as well.
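The manual check described above can be scripted; a small sketch (the podman inspect / podman healthcheck commands are real CLI, while the helper and its injectable runner are hypothetical):

```python
import subprocess
import time

def inspect_health_cmd(container):
    # real Podman CLI: prints "starting", "healthy" or "unhealthy"
    return ["podman", "inspect", "--format", "{{.State.Health.Status}}", container]

def poke_until_healthy(container, attempts=10, interval=1.0, runner=subprocess.run):
    # trigger the healthcheck manually via 'podman healthcheck run',
    # then poll the reported status; 'runner' is injectable for testing
    for _ in range(attempts):
        runner(["podman", "healthcheck", "run", container], check=False)
        out = runner(inspect_health_cmd(container),
                     capture_output=True, text=True, check=True)
        if out.stdout.strip() == "healthy":
            return True
        time.sleep(interval)
    return False
```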

p12tic (Collaborator) commented Mar 4, 2026

Thanks a lot for testing.

I think in that case we should adjust the `if compose.podman_version is not None and strverscmp_lt(compose.podman_version, "4.6.0"):` line in the code to the first version that can at least run the integration test. Do you think this makes sense?
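For context, the gate compares version strings; a simplified stand-in for a strverscmp-style comparison (a sketch, not the project's actual strverscmp_lt implementation; it ignores non-numeric version parts):

```python
def strverscmp_lt_sketch(a, b):
    # simplified version comparison: compare the numeric dot-separated
    # components as integer tuples; suffixes like "-rc1" are ignored
    def key(version):
        return [int(part) for part in version.split(".") if part.isdigit()]
    return key(a) < key(b)
```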

Maybe it wouldn't take too much time to test with 5.0.3 and 5.4.2, to verify that the healthcheck really does not work on these versions instead of this being a GitHub Actions setup problem?

If you rebase on top of the latest master, you will get the 5.7.1 podman configuration as well. I hope that it is sufficient to run healthchecks automatically.

mammo0 (Contributor, Author) commented Mar 4, 2026

Rebased on latest master and tested with Podman 5.7.1. Still not working...

Maybe it is because of running Podman in Docker?

I tried it also with the latest Podman image (5.7.1) on quay.io (I don't know why it isn't on 5.8.0):

docker run --privileged --cgroupns=host -it --rm -v $(pwd):/data quay.io/containers/podman:latest

And inside the container:

cd /data

podman build -t podman-compose-wait-test tests/integration/wait/
podman run -d --rm --healthcheck-command '["CMD-SHELL", "test -f /healthy"]' --health-interval=1s --health-retries=5 podman-compose-wait-test

It does not enter healthy state.

Btw, healthchecks also do not work if I execute Podman in Podman. I created an issue at the Podman project.

mammo0 (Contributor, Author) commented Mar 4, 2026

@p12tic see containers/podman#28192 (comment) for the cause of the problem.

Since it seems that healthchecks are currently not possible in environments without systemd, how do we proceed with the test cases?

p12tic (Collaborator) commented Mar 12, 2026

Seems like we'll need to use something like https://github.com/gdraheim/docker-systemctl-replacement in the CI containers... And also implement timer support in that project (there's already an issue).

p12tic (Collaborator) commented Mar 12, 2026

I understand that this PR is becoming stuck on missing functionality in the podman-compose test suite. I don't expect you to fix all of that, though help would be welcome. I will do these improvements at some point when I have time.

Signed-off-by: mammo0 <marc.ammon@hotmail.de>
@mammo0 mammo0 force-pushed the feat_up_start--wait branch from a764bf1 to 5ae1c76 on March 14, 2026 15:46
mammo0 (Contributor, Author) commented Mar 14, 2026

Hey, I've found a workaround for testing the wait functionality: during testing I use APScheduler to trigger the healthchecks periodically (as defined in the test docker-compose.yml file).

@p12tic please have a look at the latest commit 5ae1c76

Additionally I rebased this PR to the latest main branch.
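The idea behind the workaround can be sketched with the standard library alone (a hypothetical stand-in for the APScheduler job, using threading.Timer; the runner is injectable so the loop can be exercised without Podman):

```python
import subprocess
import threading

class PeriodicHealthcheck:
    """Stdlib stand-in for the APScheduler job described above: trigger
    'podman healthcheck run' at a fixed interval, since systemd timers
    are unavailable inside the test container."""

    def __init__(self, container, interval=1.0, runner=subprocess.run):
        self.container = container
        self.interval = interval
        self.runner = runner  # injectable for testing without Podman
        self._timer = None
        self._stopped = threading.Event()

    def _tick(self):
        if self._stopped.is_set():
            return
        # fire the healthcheck once, then re-arm the timer
        self.runner(["podman", "healthcheck", "run", self.container], check=False)
        self._timer = threading.Timer(self.interval, self._tick)
        self._timer.start()

    def start(self):
        self._tick()

    def stop(self):
        self._stopped.set()
        if self._timer:
            self._timer.cancel()
```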

get_podman_version() < version.parse("4.6.0"), "Wait feature not supported below 4.6.0."
)
class TestComposeWait(unittest.TestCase, RunSubprocessMixin):
# WORKAROUND for https://github.com/containers/podman/issues/28192
p12tic (Collaborator) commented:

Best would be to spell things out explicitly, something like: "When running podman-compose tests in a container, systemd is unavailable. This works around it by calling podman healthcheck periodically."

Also let's move this to separate mixin, e.g. RunHealthcheckMixin with e.g. start_healthcheck(...) function. start_healthcheck could setup self.scheduler. This way no setup or teardown is necessary.

mammo0 (Contributor, Author) commented Mar 15, 2026:
Sure, I can move the logic to a mixin. But I think we need the setup/teardown methods, because otherwise when should start_healthcheck be called? At the beginning of every test method? And a stop_healthcheck method at the end?

Another possibility would be to use a context manager, like I've done with ExecutionTime: wrap the logic of the test methods inside the context manager. On enter, set up the scheduler and add the healthcheck job; on exit, remove the job and shut down the scheduler.

But I think it is not very efficient to set up and shut down the scheduler for every test method. On every start it launches a thread pool for executing the jobs, and on shutdown it stops the threads again.

Therefore, I would like to keep some logic in the setUp and tearDown methods. Aren't they meant for this?
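The context-manager variant discussed here could take roughly this shape (hypothetical names; add_job/remove/shutdown follow the APScheduler-style API, and the scheduler is passed in so the sketch stays self-contained):

```python
import contextlib

@contextlib.contextmanager
def setup_healthcheck(scheduler, run_healthcheck, interval=1.0):
    """Hypothetical shape of the proposed context manager: start the
    scheduler and the periodic healthcheck job on enter, tear both
    down on exit (even if the test body raises)."""
    scheduler.start()
    job = scheduler.add_job(run_healthcheck, "interval", seconds=interval)
    try:
        yield
    finally:
        job.remove()
        scheduler.shutdown(wait=True)
```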

p12tic (Collaborator) replied:

Sure, I can move the logic to a mixin. But I think we need the setup/teardown methods, because otherwise when should start_healthcheck be called? At the beginning of every test method? And a stop_healthcheck method at the end?

Yes, start_healthcheck could be called at the beginning of every test method. stop_healthcheck could be added via addCleanup.

Another possibility would be to use a context manager

I like the idea of a context manager. This way the entire logic is hidden in a single with setup_healthcheck(...) call.

Therefore, I would like to keep some logic in the setUp and tearDown methods. Aren't they meant for this?

Unittest has the problem that tearDown is not called if setUp raises an exception. This becomes a problem as soon as setUp is even a little bit complex, for example by having two things that need to be torn down later: if the setup of the second thing crashes, the first thing will not be torn down.

Therefore registering cleanups via addCleanup is a much better solution.

Additionally, in our case compose up can be considered the test body already, because most of the tested effects happen there. By symmetry compose down should be in the test as well.

p12tic (Collaborator) replied:

But I think it is not very efficient to setup and shutdown the scheduler for every test method. On every start it launches a threadpool for executing the jobs and on shutdown it stops the threads again.

I benchmarked this:

from apscheduler.schedulers.background import BackgroundScheduler

def test():
    b = BackgroundScheduler()
    b.start()
    b.shutdown(wait=True)

This runs in around 0.5 ms (timeit.timeit(lambda: test(), number=1000)). So compared to the subprocess launches, which we do a lot, this doesn't matter much.

mammo0 (Contributor, Author) replied:

Additionally, in our case compose up can be considered the test body already, because most of the tested effects happen there. By symmetry compose down should be in the test as well.

I changed it back, so compose down is executed within the test method.

mammo0 (Contributor, Author) commented Mar 16, 2026:

Please see f083a116 for the context manager implementation.

p12tic (Collaborator) commented Mar 15, 2026

@mammo0 Amazing workaround. I added several comments to make this workaround reusable.

# build the test image before starting the tests
# this is not possible in setUpClass method, because the run_subprocess_* methods are not
# available as classmethods
self.run_subprocess_assert_returncode([
p12tic (Collaborator) commented:

Doesn't up do the build automatically?

mammo0 (Contributor, Author) replied:

Yes it does.

But in the tests I'm monitoring the execution time of compose up with the ExecutionTime class. If compose up first built the container and then started it, there would be negative side effects on the measured execution time.
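For illustration, a timing context manager in the spirit of the ExecutionTime class mentioned here (a sketch; the real class in the test suite may differ):

```python
import time

class ExecutionTime:
    """Sketch of a timing context manager like the ExecutionTime class
    referenced above: records the wall-clock duration of its body."""

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def __exit__(self, *exc):
        self.elapsed = time.monotonic() - self.start
        return False  # never swallow exceptions from the body

# building the image beforehand keeps the measured span to just the
# 'compose up' startup, without the one-off image build cost
```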

mammo0 added 5 commits March 16, 2026 07:21
Signed-off-by: mammo0 <marc.ammon@hotmail.de>
@mammo0 mammo0 force-pushed the feat_up_start--wait branch from 5ae1c76 to f083a11 on March 16, 2026 07:16