Skip to content

Conversation

@winterqt
Copy link
Contributor

By default, systemd sets a limit of how many times a service can start, which means that if you have a healthcheck that runs more often than the limits, systemd will refuse to start it with a message like "Start request repeated too quickly." emitted to the journal.

I'm not familiar enough with the e2e test suite to confidently write a test for this -- hints appreciated! :)

Does this PR introduce a user-facing change?

Fixed a bug that prevented healthchecks from running consistently on time on Linux.

@mheon
Copy link
Member

mheon commented Oct 20, 2025

Good change, code LGTM
/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mheon, winterqt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 20, 2025
@mheon mheon added 5.7 and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 20, 2025
@mheon
Copy link
Member

mheon commented Oct 20, 2025

For testing, I would recommend doing something like https://github.com/containers/podman/blob/main/test/e2e/healthcheck_run_test.go#L59-L76 with a shorter interval and a runtime long enough to convince systemd to fail (or, with your patch, not fail)

@Luap99
Copy link
Member

Luap99 commented Oct 20, 2025

For testing, I would recommend doing something like https://github.com/containers/podman/blob/main/test/e2e/healthcheck_run_test.go#L59-L76 with a shorter interval and a runtime long enough to convince systemd to fail (or, with your patch, not fail)

test/e2e does never run healtchecks via systemd timers, they get disabled there via env var.
This would need to go into test/system/220-healthcheck.bats. But keep in mind any timing related test will be flaky so it would need to be done in a way that it doesn't depend on timings, i.e. sleep or other waiting commands ideally.

@winterqt
Copy link
Contributor Author

Unfortunately having some trouble running the system tests on my work machine :(

Can try to write a test and run it through CI if y'all would be fine with that?

@packit-as-a-service
Copy link

Cockpit tests failed for commit 191bbff. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service
Copy link

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@mheon
Copy link
Member

mheon commented Oct 20, 2025

Sure, if you can't get the tests to run

@github-actions
Copy link

A friendly reminder that this PR had no activity for 30 days.

By default, systemd sets a limit of how many times a service can start,
which means that if you have a healthcheck that runs more often than the
limits, systemd will refuse to start it with a message like "Start request
repeated too quickly." emitted to the journal.

Signed-off-by: Winter M <[email protected]>
Signed-off-by: Paul Holzinger <[email protected]>
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 26, 2025
@Luap99
Copy link
Member

Luap99 commented Nov 26, 2025

I did amend the commit with a small fix to verify the systemd unit setting, actually running the test on a short interval in CI and writing something flake free seemed impossible.

Also your code change didn't quite work because the argument must go before the podman binary otherwise the argument was passed to the podman command not systemd.

But I did verify locally this works now.

@github-actions github-actions bot removed the stale-pr label Nov 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

5.7 approved Indicates a PR has been approved by an approver from all required OWNERS files. release-note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants