podman: add livecheck, update stable url #205162

Merged
BrewTestBot merged 2 commits into master from livecheck-podman
Jan 23, 2025

Conversation

@timsutton
Contributor

@timsutton timsutton commented Jan 22, 2025

  • Have you followed the guidelines for contributing?
  • Have you ensured that your commits follow the commit style guide?
  • Have you checked that there aren't other open pull requests for the same formula update/change?
  • Have you built your formula locally with HOMEBREW_NO_INSTALL_FROM_API=1 brew install --build-from-source <formula>, where <formula> is the name of the formula you're submitting?
  • Is your test running fine brew test <formula>, where <formula> is the name of the formula you're submitting?
  • Does your build pass brew audit --strict <formula> (after doing HOMEBREW_NO_INSTALL_FROM_API=1 brew install --build-from-source <formula>)? If this is a new formula, does it pass brew audit --new <formula>?

podman can push a tag for some time before the official release.

This livecheck aims to capture the version from the href that will link out to GH releases, like this:

[Screenshot: CleanShot 2025-01-22 at 08 48 17@2x]

My first time doing any livecheck additions, LMK if there's anything unusual I'm doing here!

@github-actions github-actions bot added the go (Go use is a significant feature of the PR or issue) and rust (Rust use is a significant feature of the PR or issue) labels Jan 22, 2025
@timsutton timsutton added the livecheck (Issues or PRs related to livecheck) and CI-syntax-only (Change only affects brew syntax, not the install. Only run syntax CI.) labels and removed the go and rust labels Jan 22, 2025
@timsutton timsutton marked this pull request as ready for review January 22, 2025 13:54
@timsutton timsutton mentioned this pull request Jan 22, 2025
1 task
@chenrui333 chenrui333 requested a review from samford January 22, 2025 14:19
Member

@samford samford left a comment

If the issue is only that there can be a notable span of time between when a version is tagged and when the corresponding GitHub release is created, we can simply use the GithubLatest strategy. I've added suggestions to bring this in line with similar checks.

Upstream created a 5.3.2 release less than an hour ago (~18 hours after the tag) but the homepage is still referring to 5.3.1. If we specifically need to match the version from the homepage instead of using the latest GitHub release, then I can add a different suggestion to bring the regex up to standard.

@timsutton
Contributor Author

timsutton commented Jan 22, 2025

If we specifically need to match the version from the homepage instead of using the latest GitHub release, then I can add a different suggestion to bring the regex up to standard.

Thanks for the review! My main motivation was to have livecheck consider the new version only based on what's on the website. It looks from containers/podman.io#364 like the website will similarly be not far behind a GH release, but that it's not totally automated.

I see autobump PRs (for various formulae) where folks will hold off and consider it a prerelease if the website hasn't yet updated.. and then need to check back on it later.

If we still want to consider the website as the best source of truth, I figured that this might be preferable over using the GithubLatest. And also that this question usually involves a tradeoff of "GH releases will be more stable to check than a webpage, but can lag behind the webpage content." What do you think is preferable in this case?

@SMillerDev
Member

I see autobump PRs (for various formulae) where folks will hold off and consider it a prerelease if the website hasn't yet updated.. and then need to check back on it later.

Yeah, we really need to make a decision what to do there. It was my understanding that we stopped doing GH latest because of rate limits. That's why I thought we should start using websites instead.

@timsutton
Contributor Author

Yeah, we really need to make a decision what to do there. It was my understanding that we stopped doing GH latest because of rate limits. That's why I thought we should start using websites instead.

That's partly where I came from as well.. our docs suggest to not use them unless we really need to, and it looks to me like in the case of podman, the website livecheck is feasible.

@ashley-cui
Contributor

ashley-cui commented Jan 22, 2025

FWIW the Podman team considers the upstream GH release as the official "release" time, regardless of when the tag is pushed or the website is updated. Although we do guarantee the website will be updated after the upstream release.

@samford
Member

samford commented Jan 22, 2025

FWIW the Podman team considers the upstream GH release as the official "release" time, regardless of when the tag is pushed or the website is updated. Although we do guarantee the website will be updated after the upstream release.

Thanks! I was going to open an upstream issue to ask for clarification of the release process, so you've saved me the trouble.


Short version: It sounds like GithubLatest is the way to go here. I updated my suggestions to clarify the situation in the comment.

Long version:

our docs suggest to not use them unless we really need to, and it looks to me like in the case of podman, the website livecheck is feasible.

We only use GithubLatest when necessary but we have a handful of formulae using a tag (or tag tarball) where we use it to address this particular issue (a gap between tag and release), so this is one of the appropriate use cases (I had it in mind while writing the docs). We just make sure to include a preceding comment to explain why it's necessary (often a variation of the suggested comment), as that allows us to quickly understand the context when making changes in the future.

Checking the upstream website is another way to address this situation and it can make sense if the website is kept up to date and they link to GitHub (ideally a specific release). If GithubLatest works, we usually only opt for the upstream website if they make it clear that a version isn't released until the website is updated or if GithubReleases is necessary and the website provides the same version information using a lighter check. I try to avoid using GithubLatest when possible (especially GithubReleases) but not in this scenario and not if the alternative may be less reliable (or require more maintenance over time).

The reason why I suggested GithubLatest here is because it may be more reliable than the regex that's required to check the homepage. The homepage HTML only provides the Podman CLI version in the inner text of the a element that you mentioned (i.e., the download links are programmatically produced from a JavaScript file with a hash filename) and the necessary regex is more fragile than I would like. Small HTML changes can cause it to break if we make it too explicit but it may match an inappropriate/unrelated version if we make it too loose. We're always doing a balancing act with regexes but I try to avoid matching loose text when possible, as it can be less reliable (i.e., most of our PageMatch checks are matching versions from URLs, filenames, etc. and that's usually less of an issue). [It may be a different story if the homepage linked to the versioned release (e.g., https://github.com/containers/podman/releases/tag/v5.3.2), as we could easily match it with something like %r{href=.*?/podman/releases/(?:tag/)?v?(\d+(?:\.\d+)+)["' >]}i.]
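
A minimal sketch of the versioned-link case from the bracketed aside above. The HTML snippet is hypothetical (per the comment, the homepage did not link to a versioned release), and `versions_from_page` is an illustrative helper rather than livecheck code. The regex is the one from the comment, written with `}` closing the `%r{` literal before the `i` flag.

```ruby
# Hypothetical markup: a homepage link pointing at a versioned GitHub release.
SAMPLE_HTML = <<~HTML
  <a class="btn" href="https://github.com/containers/podman/releases/tag/v5.3.2">Podman CLI v5.3.2</a>
HTML

# Matches the version inside the href rather than in loose link text.
RELEASE_HREF_REGEX = %r{href=.*?/podman/releases/(?:tag/)?v?(\d+(?:\.\d+)+)["' >]}i

# Illustrative helper (an assumption, not a Homebrew method): collect the
# unique captured versions from a page body.
def versions_from_page(html, regex = RELEASE_HREF_REGEX)
  html.scan(regex).map(&:first).uniq
end

puts versions_from_page(SAMPLE_HTML).inspect
```

Because the match is anchored to the URL path, a link whose text merely mentions a version (with no `/podman/releases/` in the href) yields no match, which is the reliability property the comment is after.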

I see autobump PRs (for various formulae) where folks will hold off and consider it a prerelease if the website hasn't yet updated.. and then need to check back on it later.

When maintainers leave a comment like "Not the latest release yet" for a formula that's using a GitHub tag or tag tarball and the repository uses GitHub releases, that's often because livecheck is surfacing a new version from a Git tag but the GitHub release hasn't been created yet. Even though the formula only uses a tag or tag tarball, we wait for the release because some upstream projects feel comfortable retagging a version before the release is created and we would end up with a checksum mismatch.
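
To illustrate the checksum concern, here is a small sketch using Ruby's standard `Digest` library. The byte strings are stand-ins for two archives generated from the same tag name before and after a retag, and `checksum_ok?` is a hypothetical helper, not how Homebrew verifies downloads internally.

```ruby
require "digest"

# Stand-in contents for the same tag name before and after an upstream retag.
FIRST_ARCHIVE    = "contents of v1.2.3 as first tagged"
RETAGGED_ARCHIVE = "contents of v1.2.3 after the tag was moved"

# The formula records the checksum of whatever archive existed at bump time.
RECORDED_SHA256 = Digest::SHA256.hexdigest(FIRST_ARCHIVE)

# Returns true when the downloaded bytes match the recorded checksum.
def checksum_ok?(bytes, expected_sha256)
  Digest::SHA256.hexdigest(bytes) == expected_sha256
end

puts checksum_ok?(FIRST_ARCHIVE, RECORDED_SHA256)    # same bytes as recorded
puts checksum_ok?(RETAGGED_ARCHIVE, RECORDED_SHA256) # bytes changed by the retag
```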

In a perfect world, we would have livecheck check GitHub releases for all projects that use them but that could lead to rate limit issues and break CI (the token is used for more than livecheck) or lead to issues when running lots of checks locally (e.g., brew livecheck --tap ...). Some projects also create releases for other sub-projects (where "latest" may sometimes be for different software), so we would have to check multiple releases for some projects and we really try to avoid GithubReleases because those responses are fairly heavy.


Yeah, we really need to make a decision what to do there. It was my understanding that we stopped doing GH latest because of rate limits. That's why I thought we should start using websites instead.

We've gone through a few different phases over the years but we've been pretty consistent about only using GithubLatest when it's necessary/appropriate since ~2020. There was a brief period of time in 2020 where some livecheck blocks were added that checked GitHub releases even though it wasn't necessary but we reassessed and decided to limit that approach. Back then we were dealing with rate limit issues on homebrew-livecheck CI even for github.com release pages, so we only used that approach when Git wasn't sufficient.

Eventually we introduced the GithubLatest strategy to replace related livecheck blocks (using the same logic) and we updated those to use strategy :github_latest. We still have a small number of old livecheck blocks using GithubLatest when it isn't necessary and someday I'll find/remove those (and document the ones that are necessary) but there are other things higher up my to-do list right now.

I haven't seen rate limit issues when checking github.com release pages for years but now GithubLatest/GithubReleases check the GitHub API (which has an explicit rate limit), so we've continued to minimize use of those strategies. It's something we can reassess (e.g., we may want to use GithubLatest/GithubReleases for all formulae/casks using a GitHub release asset) but it may not be worth the overhead for packages that aren't autobumped (where we want greater reliability from livecheck). It's something we would need to address if we ever want to make autobump opt-out (rather than opt-in) but it would take some work (some projects are more consistent about releases than others).

@SMillerDev
Member

We've gone through a few different phases over the years but we've been pretty consistent about only using GithubLatest when it's necessary/appropriate since ~2020.

From what I see in core most maintainers just check the releases manually because they're worried about using the latest livecheck with the way the documentation is written.

If we can add more of them, I think the documentation should reflect that. If we can't, we should find a way to work around that.

@timsutton
Contributor Author

timsutton commented Jan 23, 2025

Thanks for taking the time to explain all these details :)

I try to avoid using GithubLatest when possible (especially GithubReleases) but not in this scenario and not if the alternative may be less reliable (or require more maintenance over time).

Given that, I'm in agreement that the webpage livecheck is more brittle (and maybe more so in its current form). It sounds like GitHub releases are generally treated as canonical, that the GithubLatest check suffices here since the ordering is predictable, and that this case fits the convention used in other formulae.

I see for podman specifically there can be some -RC1 named GitHub releases for more significant versions, but I imagine those could be dealt with as followups in the livecheck easily enough, if they turned up matches.

Maybe there is a separate side discussion to have about whether we could track API ratelimit usage over time or do some spot-checks to see whether or not there is much wiggle room (IOW, whether there is sufficient headroom or if it should be more of a concern). When I get a chance tomorrow I'll fix up this PR with those changes, and from what I understand in switching the livecheck strategy we could also then set the stable url to "https://github.com/containers/podman/archive/refs/tags/v5.3.2.tar.gz" instead of a git clone.

@samford
Member

samford commented Jan 23, 2025

I see for podman specifically there can be some -RC1 named GitHub releases

Upstream marks those as pre-release, so we don't have to worry about them being the "latest" release (provided upstream remains consistent).


If we can add more of them, I think the documentation should reflect that. If we can't, we should find a way to work around that.

The situation in this PR is covered by the existing documentation (i.e., we're using GithubLatest because Git isn't sufficient) and I had this in mind when I was writing it, as we were already using this approach to address this scenario at the time. Let's hold off on updating the documentation until we change how we're using the GitHub strategies.

I agree that using GithubLatest is ideal when a project actively uses releases but we first need to ensure that it won't become a problem for CI before using it more widely. If I'm not mistaken, livecheck shares the same rate limit with the other GitHub API requests on CI, so if we hit the rate limit because of livecheck it may cause general issues for CI (or vice versa). It would be ideal if livecheck had a separate rate limit but I haven't looked into whether that's possible.

We should also make sure that livecheck, bump, and related tooling/workflows don't make API requests after exceeding the rate limit. Currently Utils::GitHub provides an error if it exceeds the rate limit but I don't think we do anything in response to it. Eventually it would be nice to have a setup where we could run requests later if the rate limit resets during execution but that would take some work (e.g., we would need some sort of queue), so adding logic to automatically skip GitHub API checks if the rate limit has been exceeded would be a good stopgap measure.
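
The stopgap described above could look roughly like the following sketch. Everything here is hypothetical: `RateLimitExceeded` stands in for whatever error the GitHub utilities raise, and `Check` and `run_checks` are illustrative names rather than livecheck internals.

```ruby
# Hypothetical stand-in for a rate limit error raised by an API client.
class RateLimitExceeded < StandardError; end

# A check has a name, a flag for whether it hits the GitHub API, and a
# callable that returns a version string.
Check = Struct.new(:name, :uses_github_api, :action)

# Runs checks in order. After the first rate limit error, every remaining
# check that needs the GitHub API is skipped without making a request.
def run_checks(checks)
  rate_limited = false
  checks.map do |check|
    next [check.name, :skipped] if rate_limited && check.uses_github_api

    begin
      [check.name, check.action.call]
    rescue RateLimitExceeded
      rate_limited = true
      [check.name, :skipped]
    end
  end.to_h
end
```

Checks that do not use the API (for example, ones that only need Git tags) still run after the limit is hit. A queue that retries skipped checks once the limit resets would be the larger follow-up mentioned in the comment.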

In general, using GithubLatest more widely would save maintainers some work in reviewing but it may require more work in maintaining related livecheck blocks, as some projects are messy/inconsistent with releases. Checking related formulae to compare the output from GithubLatest to Git (and doing this a few times over a period of time to catch inconsistencies) may give us a rough idea of the state of things. I started working on a script to do this sometime last year but I didn't finish it, so I'll have to pick it up again sometime.

I have some ideas on how we can potentially reduce the number of GitHub API requests that livecheck makes on CI but those ideas would take some notable work to implement (we have to lay some groundwork to make it possible), so it's not something that will change in the immediate-term. If we want to go down this path, I can create an issue to discuss this in more detail, as this is beyond the scope of this PR (and continuing the discussion here would make it hard to find in the future).

In the near-term, I agree with Tim that we should look into how much of the rate limit we're typically using at the moment. It would also be great to know how much is attributable to livecheck, if possible.
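
As a starting point for measuring usage, the sketch below parses a response body shaped like GitHub's `GET /rate_limit` REST endpoint and reports how much of the core limit has been used. The sample numbers are made up, `core_rate_limit_usage` is a hypothetical helper, and this only shows overall usage for a token. Attributing usage to livecheck specifically would need separate instrumentation.

```ruby
require "json"

# Hypothetical response body; the field names follow the `/rate_limit`
# endpoint, the values are invented for illustration.
SAMPLE_RATE_LIMIT_JSON = <<~JSON
  {
    "resources": {
      "core": { "limit": 5000, "used": 1250, "remaining": 3750, "reset": 1737662400 }
    }
  }
JSON

# Returns the used/remaining counts and the fraction of the core limit used.
def core_rate_limit_usage(json)
  core = JSON.parse(json).fetch("resources").fetch("core")
  limit = core.fetch("limit")
  used = core.fetch("used")
  { used: used, remaining: core.fetch("remaining"), fraction_used: used.fdiv(limit) }
end

puts core_rate_limit_usage(SAMPLE_RATE_LIMIT_JSON).inspect
```

Sampling this at intervals during CI runs would give a rough picture of headroom over time.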

@SMillerDev
Member

I think what I mostly want to know is: next time a maintainer finds that a release is a pre-release, are they free to change the livecheck to GitHub latest to avoid having to check manually in the future, yes or no?

@github-actions github-actions bot added the go and rust labels Jan 23, 2025
@samford
Member

samford commented Jan 23, 2025

I think what I mostly want to know is: next time a maintainer finds that a release is a pre-release, are they free to change the livecheck to GitHub latest to avoid having to check manually in the future, yes or no?

It depends. If a formula has a stable URL using a GitHub tag or tag archive, it's fine to use GithubLatest if the gap between a tag and release is regularly more than an hour (e.g., longer than the autobump interval). In those situations, we add a standard comment (or a variation of it) before the livecheck block to document why GithubLatest is used even though the formula isn't using a GitHub release asset:

  # There can be a notable gap between when a version is tagged and a
  # corresponding release is created, so we check the "latest" release instead
  # of the Git tags.
  livecheck do
    url :stable
    strategy :github_latest
  end

We could consider revising this approach (or making exceptions) if there are formulae where we regularly run into this issue and the gap is less than an hour (but ideally still more than 30 minutes). An hour is just a general guideline, so we don't overuse the GitHub strategies (for now) simply because there's a short gap between tag/release and the PR happens to be created shortly after the version is tagged or the current release is slower than usual.

When it comes to formulae where the stable URL is a GitHub release asset (i.e., it includes /releases/download/ in the URL, like abyss), we add a GithubLatest livecheck block when we run into this issue but we don't need an explanatory comment because checking releases is the correct thing to do for those formulae.
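
The guideline above can be summarized as a small decision helper. This is one reading of the comment, not Homebrew code: the URL patterns, the method names, and the choice to treat any observed gap as sufficient for release-asset formulae are all assumptions made for illustration.

```ruby
ONE_HOUR = 60 * 60 # seconds; described in the comment as a general guideline

# Classify a stable URL by shape (illustrative patterns only).
def stable_url_kind(url)
  case url
  when %r{\Ahttps://github\.com/[^/]+/[^/]+/releases/download/}
    :release_asset
  when %r{\Ahttps://github\.com/[^/]+/[^/]+/archive/refs/tags/},
       %r{\Ahttps://github\.com/[^/]+/[^/]+\.git\z}
    :tag
  else
    :other
  end
end

# Suggest whether a GithubLatest livecheck block fits, and whether it needs
# the explanatory comment, given the typical tag-to-release gap in seconds.
def github_latest_advice(url, typical_gap_seconds)
  case stable_url_kind(url)
  when :release_asset
    # Checking releases is already the correct thing to do; no comment needed.
    { use_github_latest: typical_gap_seconds.positive?, needs_comment: false }
  when :tag
    use = typical_gap_seconds > ONE_HOUR
    { use_github_latest: use, needs_comment: use }
  else
    { use_github_latest: false, needs_comment: false }
  end
end

podman_url = "https://github.com/containers/podman/archive/refs/tags/v5.3.2.tar.gz"
puts github_latest_advice(podman_url, 18 * ONE_HOUR).inspect
```

The podman call uses the roughly 18-hour gap mentioned earlier in the thread, which lands well above the one-hour guideline.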


This is something we've been doing since ~2020 but I haven't documented it partly because I didn't want it to be misinterpreted as an invitation to use the GitHub strategies broadly (i.e., until we're in a position where we can do so without it affecting CI, using them too little is safer than too much). I also initially received feedback about the livecheck documentation being long (and doing too much handholding) when I created it, so the brew livecheck doc includes less information than I would like. As a result, I've received feedback from both contributors and maintainers over the years about how including more information would be helpful.

Rui may be the only homebrew/core maintainer (besides myself) who creates this type of livecheck block on a regular basis (we have some others who've done so less frequently), so this admittedly may not be common knowledge among maintainers/contributors. Rui's always watching version updates, so he often sees when livecheck surfaces a new version or there's a version bump PR but there isn't a release for it yet. When there's a notable gap between tag and release, he either opens a PR to add/update a livecheck block or incorporates it into the version bump PR and requests my review.


For some stats:

  • 4,294 formulae have a github.com stable URL, 426 of which use GithubLatest and 19 use GithubReleases
  • 3,393 of the github.com formulae use a tag tarball, 267 of which use GithubLatest and 18 use GithubReleases
  • 639 use a GitHub release asset, 120 of which use GithubLatest and 5 use GithubReleases
  • 253 use a Git repository with a tag/revision, 39 of which use GithubLatest and 4 use GithubReleases
  • 4 use a commit tarball (e.g., luajit)
  • 4 use a Git repository revision but no tag (e.g., colortail). [I'm working on PRs to migrate these to use a commit tarball like luajit does]
  • 1 uses a tarball from within a GitHub repository's files (sshtrix)
  • We also have 7 formulae that don't have a github.com stable URL but use a GitHub strategy as a workaround and most of those have an explanatory comment before the livecheck block (I'll be adding explanations to those that don't)

If we look into our rate limit usage and find that we always have more than enough to spare, we could consider adding GithubLatest (or GithubReleases) livecheck blocks to the 514 formulae that use a GitHub release asset but aren't using a GitHub strategy yet. Those are the first ones that I want to address, as we don't want livecheck to surface a version until the release is available. That has always been true but we've only added livecheck blocks when Git hasn't been sufficient or there's a delay in the release (so far that's only 125).

When we go down that path, I could either handle those all at once (making it easier to revert if we run into rate limits) or roll them out in smaller batches (e.g., 100 at a time) with a day or more between PRs, so we can watch how the changes affect the rate limit before proceeding to the next batch. If we run into rate limit issues, we'll have to defer it until after we've made changes to improve the rate limit situation on CI. This would more than double the number of related checks, so a batched approach is likely better.
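
The figures in the last two paragraphs follow from the stats list. A quick worked check, using only the counts quoted above (it ignores casks and the 7 formulae without a github.com stable URL, so treat the totals as rough):

```ruby
# Counts quoted in the stats list.
RELEASE_ASSET_FORMULAE = 639
ASSET_GITHUB_LATEST    = 120
ASSET_GITHUB_RELEASES  = 5
ALL_GITHUB_LATEST      = 426
ALL_GITHUB_RELEASES    = 19
BATCH_SIZE             = 100 # example batch size from the comment

# Release-asset formulae that already use a GitHub strategy.
ASSET_ALREADY_COVERED = ASSET_GITHUB_LATEST + ASSET_GITHUB_RELEASES
# Release-asset formulae that would gain a livecheck block.
ASSET_REMAINING = RELEASE_ASSET_FORMULAE - ASSET_ALREADY_COVERED

# GitHub-strategy checks today versus after the rollout.
CURRENT_CHECKS   = ALL_GITHUB_LATEST + ALL_GITHUB_RELEASES
PROJECTED_CHECKS = CURRENT_CHECKS + ASSET_REMAINING

# Number of PRs needed at the example batch size.
BATCHES = (ASSET_REMAINING.to_f / BATCH_SIZE).ceil

puts ASSET_ALREADY_COVERED # 125
puts ASSET_REMAINING       # 514
puts CURRENT_CHECKS        # 445
puts PROJECTED_CHECKS      # 959
puts BATCHES               # 6
```

Going from 445 to 959 checks is a bit more than a twofold increase, consistent with "more than double".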

To be clear, some of those require more than just strategy :github_latest. I worked on this last year and I have a local branch with all of those changes, so I've rebased it and will make sure it's up to date.

@samford samford removed the CI-syntax-only label Jan 23, 2025
@SMillerDev
Member

In those situations, we add a standard comment (or a variation of it) before the livecheck block to document why GithubLatest is used even though the formula isn't using a GitHub release asset

Except we don't, see for example: #205107 and I know there are a bunch more like this.

I'm also skipping over parts of your answer because I simply don't have time to read everything right now.

@samford
Member

samford commented Jan 23, 2025

Except we don't, see for example: #205107 and I know there are a bunch more like this.

I saw tailwindcss when I was digging up outliers and I have a commit lined up to add an explanatory comment but point taken. I didn't intend to suggest that all livecheck blocks actively adhere to those guidelines because that's not the case and never has been. Of the formulae with a stable URL using a GitHub tag or tag archive and a GitHub strategy, only 125 out of 323 have a preceding comment. Some of those are old livecheck blocks from ~2020 (and before) where we don't need to be using GithubLatest but some of those were added later and not documented.

Though we have some maintainers who are vaguely aware of those guidelines, they're not documented and only loosely followed or enforced. As always, we can't expect adherence unless there's documentation and Rubocops. I'm working on expanding livecheck Rubocops and I should be able to add one to ensure there's an explanatory comment before the livecheck block if stable is a github.com tag or tag archive and the livecheck block uses GithubLatest or GithubReleases. I have some in-progress work that I need to finish first (sharing livecheck Rubocops between formulae/resources and casks) but I'll add it to the to-do list for after.

@samford samford changed the title from "podman: add livecheck" to "podman: add livecheck, update stable url" Jan 23, 2025
@github-actions
Contributor

🤖 An automated task has requested bottles to be published to this PR.

@github-actions github-actions bot added the CI-published-bottle-commits (The commits for the built bottles have been pushed to the PR branch.) label Jan 23, 2025
@BrewTestBot BrewTestBot enabled auto-merge January 23, 2025 19:31
@BrewTestBot BrewTestBot added this pull request to the merge queue Jan 23, 2025
Merged via the queue into master with commit 7b52926 Jan 23, 2025
15 checks passed
@BrewTestBot BrewTestBot deleted the livecheck-podman branch January 23, 2025 19:39
Labels

  • CI-published-bottle-commits: The commits for the built bottles have been pushed to the PR branch.
  • go: Go use is a significant feature of the PR or issue
  • livecheck: Issues or PRs related to livecheck
  • rust: Rust use is a significant feature of the PR or issue

5 participants