
Create cran-status-check.yml#7020

Closed
TysonStanley wants to merge 5 commits into master from cran-archive-workflow

Conversation

@TysonStanley
Member

Closes #7008

Runs a check of data.table CRAN status Mon/Wed/Fri at 6am every week.

@codecov

codecov bot commented May 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.69%. Comparing base (8f4d89c) to head (41f05c8).
Report is 137 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7020   +/-   ##
=======================================
  Coverage   98.69%   98.69%           
=======================================
  Files          79       79           
  Lines       14677    14678    +1     
=======================================
+ Hits        14486    14487    +1     
  Misses        191      191           


@tdhock
Member

tdhock commented May 26, 2025

cool, thanks.
Hugo @Bisaloo can you please review?

Contributor

@Bisaloo Bisaloo left a comment

Thanks for the ping!

This is slightly different from my approach, described in my blog post about package co-maintenance. Here is how it compares in my opinion:

Pros of the approach proposed here

  • It is written as a proper reusable action rather than a custom workflow, which means updates upstream can be propagated or at least proposed automatically in this repo.
  • It uses a docker container so it might be slightly faster than my approach of using r-lib/actions/setup-r and r-lib/actions/setup-r-dependencies

Cons of the approach proposed here

  • In my experience, CRAN checks can sometimes have intermittent failures that would potentially cause a lot of noise in the repo.

    I would recommend focusing on the archival deadline rather than on failed checks. CRAN triggers these deadlines manually, so you have some guarantee against false positives.

  • I might have missed it but I don't see any system to prevent the workflow from opening a new issue each time it runs in case of persistent failures. I don't have a great solution for this. I have opted to automatically disable the workflow once it has opened an issue and the maintainers have to manually re-enable it when they close the issue.


The best way forward might be for me to contribute to the dieghernan/cran-status-check action to add support for a new "deadline" possible value for the statuses argument.

For now, if you want to go with this approach rather than mine, I have left some minor comments inline.

@Bisaloo
Contributor

Bisaloo commented May 26, 2025

As an addendum, I have tested this workflow in a fork of the medrxivr repo (archival deadline on 2025-05-27), but for some reason it sometimes fails with HTTP 429 errors. This looks like rate limiting, but I'm not clear why it is happening.

https://github.com/Bisaloo/medrxivr/actions/runs/15250264104/job/42885349671
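
A common way to cope with intermittent HTTP 429 responses is to retry with an increasing delay. The sketch below is not part of the workflow under discussion: `retry_with_backoff` and `CRAN_RESULTS_URL` are hypothetical names, and whether retrying would fix the failure in the linked run is an assumption, since the cause of the rate limiting has not been established.

```shell
# Hypothetical helper: retry a command with exponential backoff.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry_with_backoff() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    # Run the command; stop at the first success.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Example (not executed here): curl's --fail flag turns HTTP errors such as
# 429 into a non-zero exit status, which is what triggers the retry.
# CRAN_RESULTS_URL is a placeholder for whatever page the check downloads.
# retry_with_backoff 4 5 curl --fail --silent --show-error -o results.html "$CRAN_RESULTS_URL"
```

Doubling the delay keeps the number of requests low while the remote side is throttling, at the cost of a longer job when failures persist.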

@tdhock
Member

tdhock commented May 26, 2025

great thanks Hugo.
I appreciate your insights about the pros and cons, that is super helpful.
I like your approach about avoiding false positives, by focusing on deadline for archival rather than failed checks.

prevent the workflow from opening a new issue each time it runs in case of persistent failures. I don't have a great solution for this. I have opted to automatically disable the workflow once it has opened an issue and the maintainers have to manually re-enable it when they close the issue

I'm not sure I understand how to automatically disable and then manually re-enable the workflow? Can you please clarify?

@Bisaloo
Contributor

Bisaloo commented May 26, 2025

Automatically disabling the workflow can be done via the GitHub API so we can make it part of the workflow that opens the issue.

This is how I do it in R:

https://github.com/Bisaloo/bisaloo.github.io/blob/fec110ab1d4be89aa7b6a7731c69cd7e28d6f34b/posts/R-package-comaintenance/actions/check-pkg-deadline.yaml#L41-L46

but you could also do it from the command line with the gh CLI utility (pre-installed on GHA default runners):

gh workflow disable workflow_name

But the question is then: when is it safe to re-enable it? When should it start creating new issues?

Because I couldn't figure out a criterion to implement, I didn't. Note that the code part is definitely doable. We just need to find a good criterion.

This is why for now, you have to manually re-enable the workflow.

Does this answer your question?
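
The disable-after-reporting idea described above could be sketched with the `gh` CLI as follows. This is a minimal sketch, not the linked R implementation: `report_and_disable` is a hypothetical helper, the issue title and body are placeholders, and it assumes that `gh workflow disable` accepts the workflow file name and that the job's token is allowed to create issues and modify workflows.

```shell
# Sketch: open one issue, then disable the workflow so that later scheduled
# runs cannot open duplicates. WORKFLOW_FILE defaults to the file this PR adds.
WORKFLOW_FILE="${WORKFLOW_FILE:-cran-status-check.yml}"

report_and_disable() {
  title="$1"
  body="$2"
  # Only disable the workflow if the issue was actually created.
  gh issue create --title "$title" --body "$body" &&
    gh workflow disable "$WORKFLOW_FILE"
}

# Example (not executed here; gh needs a token in the environment):
# report_and_disable "CRAN archival deadline" "See the CRAN check results page."
#
# Once the CRAN problem is fixed, a maintainer re-enables the workflow by hand:
# gh workflow enable cran-status-check.yml
```

Chaining the two commands with `&&` means a failed issue creation leaves the workflow enabled, so the next scheduled run gets another chance to report.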

@tdhock
Member

tdhock commented May 26, 2025

Thanks for the explanation! You use the gh API to disable the workflow, so that it only posts one github issue per CRAN issue. And then you follow the instructions in the github docs to re-enable the workflow, when the issue has been resolved. That makes sense, thanks for linking the github docs, I did not know about the concept of "enabled" vs "disabled" workflows before this discussion.

In the atime performance test action (https://github.com/Anirban166/Autocomment-atime-results/blob/main/action.yml), we run it in the PR: the first run creates a comment in that PR, and subsequent runs update that comment instead of creating a redundant new one. This seems similar to what we want to do with CRAN checks (only one comment per CRAN issue, maybe using the archival deadline date as the ID?). Please ask @Anirban166 if you have questions about how he implemented the comment updating.

The difference is that we want to run the CRAN check action every day (not every push to PR), so we don't have an associated PR/issue to comment on. Perhaps we can use a standard name for the issue, for example "CRAN check issues must be resolved by 2025-05-26"? Then, when you re-run the action, you search for an issue with the given name. If there is no issue with the given name/deadline date, you can post a new issue.
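
A title-based lookup along these lines might look like the sketch below. The helper names are hypothetical, the title format is the example from the comment above, and searching with `in:title` across open and closed issues is an assumption about how "an issue with the given name" would be found.

```shell
# Sketch: one issue per deadline, identified by a standard title.
deadline_issue_title() {
  echo "CRAN check issues must be resolved by $1"
}

# Prints the number of issues (open or closed) whose title contains the
# standard title. `in:title` is GitHub search syntax.
count_issues_for_deadline() {
  gh issue list --state all \
    --search "\"$(deadline_issue_title "$1")\" in:title" \
    --json number --jq 'length'
}

# Opens a new issue only when none exists yet for this deadline.
open_issue_once() {
  deadline="$1"
  body="$2"
  if [ "$(count_issues_for_deadline "$deadline")" -eq 0 ]; then
    gh issue create --title "$(deadline_issue_title "$deadline")" --body "$body"
  else
    echo "issue for $deadline already exists; nothing to do"
  fi
}

# Example (not executed here):
# open_issue_once 2025-05-26 "See the CRAN check results page."
```

Including closed issues in the search means that an issue closed before the deadline passes is not reopened or duplicated by the next run; whether that is the desired behavior is a design choice left open in the discussion.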

@Bisaloo
Contributor

Bisaloo commented May 26, 2025

One other solution would be to always re-use the same issue and the workflow only re-opens this issue as needed. This would be similar to what I do in my project to track CRAN deadlines across an organization. See for example epiverse-trace/etdashboard#12.

If we stay with the idea of having a different issue for each new deadline, a slightly more performant solution than matching on the title might be to have a dedicated label for this type of issue. Then, we could say that only one issue with this label can be open at a given time. The problem with matching on the title is that we have to get all issues and then match on the title, while the GitHub API allows server-side filtering by label: https://docs.github.com/en/rest/issues/issues?apiVersion=2022-11-28#list-repository-issues. This could also be done early in the workflow to skip the CRAN status check entirely if an issue is already open.

What do you think?
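
The label-based early exit could be sketched as below. This is an illustration under assumptions: the label name `cran-deadline` is the one used later in this thread, while the helper names and the `skip` output name are hypothetical.

```shell
# Sketch: ask GitHub for open issues carrying the label, then record whether
# the CRAN check should be skipped.
LABEL="${LABEL:-cran-deadline}"

count_open_labeled_issues() {
  # The label and state filters are sent to GitHub with the request.
  gh issue list --label "$1" --state open --json number --jq 'length'
}

write_skip_output() {
  # GITHUB_OUTPUT is the file GitHub Actions reads step outputs from;
  # fall back to stdout when running outside a workflow.
  out="${GITHUB_OUTPUT:-/dev/stdout}"
  if [ "$(count_open_labeled_issues "$LABEL")" -gt 0 ]; then
    echo "skip=true" >> "$out"
  else
    echo "skip=false" >> "$out"
  fi
}

# Example (not executed here; gh needs a token in the environment):
# write_skip_output
```

If the step running this has `id: check-issues`, later steps can be guarded with a condition such as `if: steps.check-issues.outputs.skip != 'true'`.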

@tdhock
Member

tdhock commented May 26, 2025

thanks! server-side filtering by label sounds like an efficient solution.

TysonStanley and others added 3 commits May 26, 2025 18:41
Co-authored-by: Hugo Gruson <10783929+Bisaloo@users.noreply.github.com>
Co-authored-by: Hugo Gruson <10783929+Bisaloo@users.noreply.github.com>
@TysonStanley
Member Author

@Bisaloo I added a draft of a step that would check for "cran-deadline" label in open issues but I'm fairly new to more complex GH actions so feel free to change the approach. Wasn't sure about adjustments to the actual CRAN check though.

@tdhock
Member

tdhock commented May 27, 2025

Hugo I invited you to be a project member, so you now have permissions to write and push branches like this one -- please edit as you think appropriate.
If you accept, please submit any future PRs from this repo (rather than your fork).
Thanks!!

@MichaelChirico
Member

Is the current version worth merging? We can always iterate later with follow-up issues.

@TysonStanley
Member Author

If @Bisaloo thinks it should be good to go, I think we can merge. I think we can try it and see if it is helpful at all.

@Bisaloo
Contributor

Bisaloo commented Jul 1, 2025

My apologies for the delay. I just want to test it on a test repo. I'll do this by the end of the week.

run: |
  # Count open issues with CRAN-related labels
  ISSUE_COUNT=$(gh issue list --label "cran-deadline" --state open --json number | jq length)
  if [ $ISSUE_COUNT -eq 0 ] || [ $ARCHIVE_ISSUES -gt 0 ]; then
Contributor

This $ARCHIVE_ISSUES var doesn't exist
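
One way to write the condition without the undefined variable is sketched below. With `$ARCHIVE_ISSUES` unset and unquoted, the second test expands to `[ -gt 0 ]`, which `[` rejects as malformed. `should_run_check` is a hypothetical helper, and basing the decision only on the open-issue count is an assumption about the intended logic.

```shell
# Hypothetical helper: decide whether the CRAN check should run, based only
# on the number of open labelled issues.
should_run_check() {
  issue_count="${1:-}"
  # Reject empty or non-numeric input instead of letting `[` fail on it.
  case "$issue_count" in
    '' | *[!0-9]*)
      echo "issue count must be a non-negative integer, got '$issue_count'" >&2
      return 2
      ;;
  esac
  if [ "$issue_count" -eq 0 ]; then
    echo "true"
  else
    echo "false"
  fi
}

# Example (not executed here), reusing the command from the reviewed hunk:
# ISSUE_COUNT=$(gh issue list --label "cran-deadline" --state open --json number | jq length)
# should_run_check "$ISSUE_COUNT"
```

Starting a `run:` script with `set -u` is another guard: the shell then treats any reference to an unset variable as an error instead of expanding it to nothing.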

uses: actions/checkout@v4

- name: Check for existing CRAN issues
  id: check-issues
Contributor

The GITHUB_TOKEN env var needs to be set to use the gh cli utility

Suggested change
- id: check-issues
+ env:
+   GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+ id: check-issues

uses: dieghernan/cran-status-check@v2
with:
  create-issue: "true"
  labels: "cran-deadline"
Contributor

This argument doesn't exist in dieghernan/cran-status-check. It does exist in peter-evans/create-issue-from-file, which is used under the hood, but arguments given to dieghernan/cran-status-check are not passed through, except assignees.

https://github.com/dieghernan/cran-status-check/blob/8891728fe5bd8899252718ba9c7ba73a93ae37da/action.yml#L117-L123

A PR to dieghernan/cran-status-check would be required, or a different strategy altogether (e.g., automatically disabling the workflow when it creates an issue and requiring a manual re-enable).

Bisaloo added a commit that referenced this pull request Jul 15, 2025
MichaelChirico added a commit that referenced this pull request Jul 15, 2025
* Add cran-status-check workflow

Fix #7008

Supersedes #7020

* Add NOTE in NEWS for this change

* Exit early to reduce nesting

* Add link to workflow run in issue body

* nit: consistent spacing around status emoji

* simplify (IMO)

* use subset() to avoid temp crandb

---------

Co-authored-by: Michael Chirico <chiricom@google.com>
@MichaelChirico
Member

Shall we close this and start afresh with further improvement, or just mark as draft until the upstream repo has the required feature?

@MichaelChirico
Member

Closing for now

@MichaelChirico MichaelChirico deleted the cran-archive-workflow branch July 22, 2025 23:13

Development

Successfully merging this pull request may close these issues.

github actions for CRAN archival

4 participants