.github/workflows/lychee-cron.yml
Outdated
on:
  pull_request:
Having this trigger here lets the action run on this PR. Once/if this PR is approved, I'm going to remove it so that the only triggers are a manual dispatch and the scheduled monthly cron run on the 1st of the month.
As of writing this comment, the pull request run would need to be approved before it can execute on the main ethereum-org repo.
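For reference, the trigger block after that removal might look like the sketch below. The manual trigger and the 1st-of-the-month schedule come from the comment above; the midnight UTC time of day is an assumption, since the comment only states the day.

```yaml
on:
  # Manual trigger from the Actions tab
  workflow_dispatch:
  # Scheduled run: "0 0 1 * *" means 00:00 UTC on the 1st of every month.
  # The time of day is an assumption; only the day is stated in this PR.
  schedule:
    - cron: "0 0 1 * *"
```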
.github/workflows/lychee-cron.yml
Outdated
args: |
  public/
  --quiet
  --max-retries 1
  --accept 200,429,403
  --exclude-all-private
  --exclude '^file://'
  --exclude "ethereum\.org"
  --include '^https?://'
  --format detailed
  './**/*.md'
  './**/*.html'
This is the main work of the PR. Let me break down what these args do:
- --max-retries 1: tries to cut down on rate-limiting errors (429) for links across the site that share a domain (the default is 3).
- --accept 200,429,403: treats 429 and 403 responses as successful. If you want to compare the difference this makes, here is the report without this change vs the report with this change, i.e. 14282 errors vs 5437 errors.
  - 429 (Too Many Requests): rate limits understandably get hit when every link on the site is tested in one job, and accepting them effectively ignores them. From quick tests, the majority of the rate-limited links are GitHub and Discord links. Most of the GitHub links actually work, and the Discord ones are hit or miss. Still, ignoring them for the purposes of this cron job seems reasonable to reduce the noise.
  - 403 (Forbidden): if lychee gets a 403, the page is most likely behind a login or blocked by bot rules. Most of the 403s that show up in the report work anyway when spot-checked, so accepting them seems reasonable and reduces the noise.
- --exclude-all-private: skips links to private, link-local, and loopback IP addresses (relative ethereum.org links are handled by the exclude and include patterns below).
- --exclude '^file://': lychee cannot access these files, so they are ignored.
- --exclude "ethereum\.org": skips ethereum.org links; self-explanatory.
- --include '^https?://': I think I was having trouble with certain relative links falsely showing up in the report, so adding this hopefully makes sure that only external links are checked.
- --format detailed: gives you the summary report in the action summary.
- './**/*.md' and './**/*.html': only search Markdown and HTML files, just being explicit here.
This issue is stale because it has been open 30 days with no activity.
monthly cron job, manually dispatchable
wackerow
left a comment
Looks good @brossetti1! Thanks a lot! Sorry for the delay here, but it looks like this is outputting just fine in the action logs.
I removed the on: pull_request trigger in preparation to merge, and also removed the .html match, since we don't really have any HTML files and shouldn't bother potentially matching them. We could also consider scanning through the src/intl .json files... that may need another approach since they're not served from public, but maybe not. Either way, we can address that separately.
Bringing in!
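With those two changes applied (no pull_request trigger, no .html glob), the link-check job could look roughly like the sketch below. The args are taken from the diff earlier in this PR and the job name from the PR description; the job id, the runner, and the action versions (actions/checkout@v4, lycheeverse/lychee-action@v2) are assumptions, not confirmed by the PR.

```yaml
jobs:
  check-links: # job id is hypothetical
    name: Check Links in Public Directory
    runs-on: ubuntu-latest # assumed runner
    steps:
      # Check out the repo so lychee can read the files under public/
      - uses: actions/checkout@v4 # version is an assumption

      - name: Run lychee
        uses: lycheeverse/lychee-action@v2 # version is an assumption
        with:
          # Same args as the reviewed diff, minus the './**/*.html' glob
          args: |
            public/
            --quiet
            --max-retries 1
            --accept 200,429,403
            --exclude-all-private
            --exclude '^file://'
            --exclude "ethereum\.org"
            --include '^https?://'
            --format detailed
            './**/*.md'
```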
@all-contributors please add @brossetti1 for code
I've put up a pull request to add @brossetti1! 🎉

Add Lychee monthly report cron job
Description
Setting up a basic monthly run of the lychee link checker to produce a summary report of the broken links found in the public folder. Currently, the report can be seen in the Actions Step Summary under the job that runs (example on my fork of ethereum-org).
Related Issue
#14823 (comment)
Details
Setting up the minimal implementation of this report in CI to run monthly, so it uses hardly any CI resources. My thinking is that I could add a short page to the docs folder for broken-links detailing the workflow for fixing broken links across the site. The page would link to the Actions tab of the GitHub repo, specifically to the Check Links in Public Directory task, with instructions to click into the latest run report to see the broken links that need checking, fixing, and discernment about the proper replacement. This way anyone could hopefully contribute to fixing the broken links across the site and open smaller PRs in a reasonable way. By linking to the lychee action, no more work would need to go into this task other than adding the appropriate docs with instructions.
Otherwise, let me know if you want to write this report to the repo somewhere. lychee has helpful docs on how to do this, and I could adjust the workflow to do whatever you want with the report, but this feels like overkill. Another option could be an ongoing issue that the CI task posts to. lychee has some instructions on posting back to issues, and this could be worked in whichever way you like.
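As a rough illustration of the "post back to an issue" option, the steps below follow the pattern shown in the lychee-action README: run lychee without failing the job, then open an issue from the generated report file. This is only a sketch. The action versions, the issue title, the labels, and the use of peter-evans/create-issue-from-file are assumptions and are not part of this PR.

```yaml
permissions:
  issues: write # needed so the workflow can open an issue

jobs:
  check-links: # job id is hypothetical
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run lychee
        id: lychee
        uses: lycheeverse/lychee-action@v2
        with:
          fail: false # keep the job green; the report is the output
          output: ./lychee/out.md # Markdown report consumed by the next step
          args: public/ --quiet './**/*.md'

      # Only open an issue when lychee reported problems
      - name: Create issue from report
        if: steps.lychee.outputs.exit_code != 0
        uses: peter-evans/create-issue-from-file@v5
        with:
          title: Link Checker Report # hypothetical title
          content-filepath: ./lychee/out.md
          labels: report, automated issue # hypothetical labels
```

Compared with linking to the Step Summary, this leaves a durable record contributors can comment on, at the cost of one extra action and an extra permission.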