Runs a check of data.table CRAN status Mon/Wed/Fri at 6am every week.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7020   +/-   ##
=======================================
  Coverage   98.69%   98.69%
=======================================
  Files          79       79
  Lines       14677    14678    +1
=======================================
+ Hits        14486    14487    +1
  Misses        191      191
|
|
cool, thanks. |
Bisaloo
left a comment
Thanks for the ping!
This is slightly different from my approach, described in my blog post about package co-maintenance. Here is how it compares in my opinion:
Pros of the approach proposed here
- It is written as a proper reusable action rather than a custom workflow, which means updates upstream can be propagated or at least proposed automatically in this repo.
- It uses a Docker container, so it might be slightly faster than my approach of using r-lib/actions/setup-r and r-lib/actions/setup-r-dependencies.
Cons of the approach proposed here
- In my experience, CRAN checks can sometimes have intermittent failures that would potentially cause a lot of noise in the repo. I would recommend focusing on the deadline for archival rather than on failed checks. CRAN triggers these manually, so you have some guarantee against false positives.
- I might have missed it, but I don't see any system to prevent the workflow from opening a new issue each time it runs in case of persistent failures. I don't have a great solution for this. I have opted to automatically disable the workflow once it has opened an issue, and the maintainers have to manually re-enable it when they close the issue.
The best way forward might be for me to contribute to the dieghernan/cran-status-check action to add support for a new "deadline" possible value for the statuses argument.
For now, if you want to go with this approach rather than mine, I have left some minor comments inline.
|
As an addendum, I have tested this workflow in a fork of the medrxivr repo (archival deadline on 2025-05-27), but for some reason it sometimes fails with HTTP 429 errors. This points to rate limiting, but I'm not clear why it is happening. https://github.com/Bisaloo/medrxivr/actions/runs/15250264104/job/42885349671 |
|
great thanks Hugo.
I'm not sure I understand how to automatically disable and then manually re-enable the workflow? Can you please clarify? |
|
Automatically disabling the workflow can be done via the GitHub API, so we can make it part of the workflow that opens the issue. I do it in R, but you could also do it from the command line with gh workflow disable workflow_name. But the question is then: when is it safe to re-enable it? When should it start creating new issues? Because I couldn't figure out a criterion to implement, I didn't. Note that the code part is definitely doable. We just need to find a good criterion. This is why, for now, you have to manually re-enable the workflow. Does this answer your question? |
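A minimal sketch of the command-line route, assuming the gh CLI is installed and authenticated and that the token is allowed to manage workflows. The function name disable_workflow, the GH_BIN override (used only for dry runs), and the workflow name in the usage line are hypothetical and not part of this PR.

```shell
#!/bin/sh
# Sketch: disable a workflow once it has opened an issue.
# GH_BIN is a hypothetical override for dry runs; it defaults to the real gh CLI.
disable_workflow() {
  workflow="$1"
  if [ -z "$workflow" ]; then
    echo "usage: disable_workflow <workflow name or id>" >&2
    return 2
  fi
  # "gh workflow disable" stops scheduled runs until the workflow is re-enabled.
  "${GH_BIN:-gh}" workflow disable "$workflow"
}
```

Calling disable_workflow "cran-status-check" as the last step of the job that opens the issue would stop further scheduled runs; a maintainer can later run gh workflow enable with the same name once the issue is closed.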
|
Thanks for the explanation! You use the gh API to disable the workflow, so that it only posts one GitHub issue per CRAN issue. And then you follow the instructions in the GitHub docs to re-enable the workflow when the issue has been resolved. That makes sense. Thanks for linking the GitHub docs; I did not know about the concept of "enabled" vs "disabled" workflows before this discussion. For comparison, see the atime performance test action: https://github.com/Anirban166/Autocomment-atime-results/blob/main/action.yml The difference is that we want to run the CRAN check action every day (not on every push to a PR), so we don't have an associated PR/issue to comment on. Perhaps we can use a standard name for the issue? For example, "CRAN check issues must be resolved by 2025-05-26", and then when you re-run the action, you search for an issue with the given name. If there is no issue with the given name/deadline date, then you can post a new issue. |
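The title-matching idea could be sketched roughly as follows. The title format is the one from the example above; the function names, the GH_BIN override, and the limit of 200 issues are assumptions made for illustration.

```shell
#!/bin/sh
# Build the standard issue title for a given archival deadline.
deadline_issue_title() {
  printf 'CRAN check issues must be resolved by %s\n' "$1"
}

# Succeeds if stdin (one issue title per line) contains an exact match.
title_already_posted() {
  grep -Fxq -- "$1"
}

# Succeeds (exit 0) when no open issue carries the title for this deadline.
# GH_BIN is a hypothetical override for dry runs; it defaults to the real gh CLI.
should_post_issue() {
  title=$(deadline_issue_title "$1")
  if "${GH_BIN:-gh}" issue list --state open --limit 200 --json title --jq '.[].title' \
      | title_already_posted "$title"; then
    return 1
  fi
  return 0
}
```

Note that this fetches the open issue titles and matches them on the client side, so it only sees as many issues as the chosen limit allows.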
|
One other solution would be to always re-use the same issue and the workflow only re-opens this issue as needed. This would be similar to what I do in my project to track CRAN deadlines across an organization. See for example epiverse-trace/etdashboard#12. If we stay with the idea of having a different issue for each new deadline, a slightly more performant solution than matching on the title might be to have a dedicated label for this type of issue. Then, we could say that only one issue with this label can be opened at a given time. The issue with matching the title is that we have to get all issues and then match on the title, while GitHub API allows server-side filtering by label: https://docs.github.com/en/rest/issues/issues?apiVersion=2022-11-28#list-repository-issues. This could also be done early in the workflow to even skip the CRAN status checking if an issue is already open. What do you think? |
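As a rough sketch of the label-based variant, the step below asks GitHub for open issues with a given label and only lets the CRAN check proceed when there are none. The label name "cran-deadline" is the one used later in this PR; the function names and the GH_BIN override are hypothetical.

```shell
#!/bin/sh
# Count open issues carrying a label; the label filter is applied by GitHub.
# GH_BIN is a hypothetical override for dry runs; it defaults to the real gh CLI.
count_open_issues_with_label() {
  "${GH_BIN:-gh}" issue list --label "$1" --state open --json number --jq 'length'
}

# Succeeds (exit 0) when the CRAN check should run, i.e. no open issue has the label.
cran_check_needed() {
  count=$(count_open_issues_with_label "${1:-cran-deadline}") || return 2
  [ "$count" -eq 0 ]
}
```

Running cran_check_needed early in the job would allow the remaining steps to be skipped whenever an issue with the label is already open.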
|
thanks! server-side filtering by label sounds like an efficient solution. |
Co-authored-by: Hugo Gruson <10783929+Bisaloo@users.noreply.github.com>
|
@Bisaloo I added a draft of a step that would check for "cran-deadline" label in open issues but I'm fairly new to more complex GH actions so feel free to change the approach. Wasn't sure about adjustments to the actual CRAN check though. |
|
Hugo, I invited you to be a project member, so you now have permissions to write and push branches like this one -- please edit as you think appropriate. |
|
Is the current version worth merging? We can always iterate later with follow-up issues. |
|
If @Bisaloo thinks it should be good to go, I think we can merge. I think we can try it and see if it is helpful at all. |
|
My apologies for the delay. I just want to test it on a test repo. I'll do this by the end of the week. |
    run: |
      # Count open issues with CRAN-related labels
      ISSUE_COUNT=$(gh issue list --label "cran-deadline" --state open --json number | jq length)
      if [ $ISSUE_COUNT -eq 0 ] || [ $ARCHIVE_ISSUES -gt 0 ]; then
This $ARCHIVE_ISSUES var doesn't exist
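One possible fix, sketched under the assumption that the undefined ARCHIVE_ISSUES test can simply be dropped and that the step is meant to expose its decision as a step output. The output name should-check and the function name are hypothetical.

```shell
#!/bin/sh
# Sketch: decide whether to run the CRAN check from the open-issue count alone.
# The argument is expected to be the number of open "cran-deadline" issues.
decide_should_check() {
  issue_count="${1:-0}"
  # Quoting the variable avoids a test syntax error if the count is empty.
  if [ "$issue_count" -eq 0 ]; then
    echo "should-check=true"
  else
    echo "should-check=false"
  fi
}

# In a workflow step, the result would be appended to the step outputs file:
#   decide_should_check "$ISSUE_COUNT" >> "$GITHUB_OUTPUT"
```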
      uses: actions/checkout@v4

    - name: Check for existing CRAN issues
      id: check-issues
The GITHUB_TOKEN env var needs to be set to use the gh cli utility
Suggested change
-       id: check-issues
+       env:
+         GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+       id: check-issues
      uses: dieghernan/cran-status-check@v2
      with:
        create-issue: "true"
        labels: "cran-deadline"
This argument doesn't exist in dieghernan/cran-status-check. It does exist in peter-evans/create-issue-from-file, which is used under the hood, but arguments from dieghernan/cran-status-check are not passed through, except assignees.
Either a PR to dieghernan/cran-status-check would be required, or a different strategy should be used altogether (e.g., automatically disabling the workflow when it creates an issue and requiring a manual re-enable).
* Add cran-status-check workflow (Fix #7008, Supersedes #7020)
* Add NOTE in NEWS for this change
* Exit early to reduce nesting
* Add link to workflow run in issue body
* nit: consistent spacing around status emoji
* simplify (IMO)
* use subset() to avoid temp crandb
---------
Co-authored-by: Michael Chirico <chiricom@google.com>
|
Shall we close this and start afresh with further improvement, or just mark as draft until the upstream repo has the required feature? |
|
Closing for now |
Closes #7008