fix(build): fix mkdocs htmlproofer link validation flakiness #2095
khushiiagrawal wants to merge 10 commits into oscal-compass:develop
Signed-off-by: khushiiagrawal <khushisaritaagrawal@gmail.com>
degenaro
left a comment
@khushiiagrawal Thanks for this PR. It seems that 503 and 504 will now be ignored. Is there a warning about such URLs so that the PR reviewer can judge whether this is of concern or not? Are there any retries before the ignore/warning happens?
… for MkDocs link validation Signed-off-by: khushiiagrawal <khushisaritaagrawal@gmail.com>
@degenaro thanks for the review, these are valid points and worth addressing. i’ve looked into both.

retries: i’ve updated mkdocs.yml to add retry_max_times: 3 for the htmlproofer plugin. this allows urls returning transient errors like 503/504 to be retried up to three times before being treated as broken or ignored.

warnings: i also tried enabling warnings for ignored urls using warn_on_ignored_urls: true so they would appear in the ci logs. however, since the pipeline runs mkdocs build -s in strict mode, any warning is treated as an error and fails the build. because of that, emitting warnings for these urls would cause the job to fail, which defeats the purpose of ignoring them. given the current strict setup, we can’t surface them as non-fatal warnings in ci. thanks.
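As a rough illustration of the retry behaviour described in this comment, here is a minimal Python sketch. It is not the plugin's implementation: the function name `check_with_retries`, the injected `fetch` callable, and the `TRANSIENT` set are all hypothetical, chosen only to show "retry transient codes up to three times, then give up".

```python
from typing import Callable

# Assumption: the codes this comment calls transient.
TRANSIENT = {503, 504}


def check_with_retries(url: str, fetch: Callable[[str], int], max_retries: int = 3) -> int:
    """Return the final HTTP status for url.

    fetch is a hypothetical callable that performs one request and returns
    the status code. Transient codes are retried up to max_retries times;
    any other code is returned immediately.
    """
    status = fetch(url)
    retries = 0
    while status in TRANSIENT and retries < max_retries:
        retries += 1
        status = fetch(url)
    return status
```

With `max_retries=3`, a URL that keeps returning 503 is requested four times in total (one initial attempt plus three retries) before the 503 is handed back to the caller, who then decides whether to ignore or report it.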
This will cause 503/504 to always be ignored in the GitHub pipeline. I don't suppose there's a way, when running the Makefile locally on my laptop, to not ignore 503/504 unless the yml file is hacked? Do we know if the URLs currently in the ignore list still fail, and if so, is it because of 503/504? Funny, I just enabled the pipelines to run and there was an htmlproofer failure with 502.
I think we can do environment variable overrides in the mkdocs.yml. You can see an example of those changes here:

    - htmlproofer:
        enabled: !ENV [ENABLED_HTMLPROOFER, False]
        validate_rendered_template: False
        validate_external_urls: False
        raise_error_after_finish: False
        raise_error: False

We'd need to chain that into the Makefile.
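To make the override idea concrete, the following Python sketch mimics what a lookup like `!ENV [ENABLED_HTMLPROOFER, False]` amounts to: use the environment variable if it is set, otherwise fall back to the default. This is only an analogue, not how MkDocs parses the tag; the helper name `env_flag` and the set of strings accepted as true are assumptions.

```python
import os
from typing import Mapping, Optional


def env_flag(name: str, default: bool, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Resolve a boolean setting from the environment, falling back to default.

    environ defaults to os.environ; passing a plain dict makes the helper
    easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    # Assumption: these spellings count as "true"; anything else is false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}
```

A Makefile target or CI job would then only need to export the variable before invoking the build, leaving the default in place for everyone else.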
develop branch has been updated. @khushiiagrawal Please resolve conflicts.
Signed-off-by: khushiiagrawal <khushisaritaagrawal@gmail.com>
thanks @degenaro and @butler54 for the review. i've reworked the PR based on both your suggestions. the htmlproofer config now uses
defaults are still true, so without setting anything you get full validation locally. also added 502 to the exclusions since that was failing too, and kept retry_max_times: 3 so transient errors get retried before being excluded. reverted the unrelated pyproject.toml and workflow changes to keep the PR focused.
@khushiiagrawal Seems to still be flaky. Actually worse than flaky today. Cannot get lint pipeline to pass on other PRs either. Network must be particularly bad for some reason.
@degenaro Yeah, this is a broader network issue, I'm seeing it affect other PRs too. I'll switch the CI to HTMLPROOFER_VALIDATE_EXTERNAL_URLS=false so the lint job only validates internal links. External URL checks will still work locally by default for anyone who wants them. Please let me know if that works.
@khushiiagrawal In the end, the website should not have broken links. There may be times when these sites are not available for a variety of reasons, and not much can be done about that. We should not err on the side of accepting broken links (external or internal) for the sake of expediency. The whole idea of checking is to confirm that we have not introduced a new link that is broken (or come across an existing link that has since broken).
@khushiiagrawal @butler54 @vikas-agarwal76 Here is what I was thinking. The htmlproofer would never block a PR. What would happen is:
Before any coding is done, does this seem like a reasonable solution?
The discussion at the community meeting resulted in favoring the "reasonable solution" approach above.
…tings Signed-off-by: khushiiagrawal <khushisaritaagrawal@gmail.com>
thanks @degenaro for confirming the approach at the community meeting. implemented the agreed solution: htmlproofer now runs in a separate pipeline that never blocks merging. failing URLs get posted as a PR comment (reruns update the same comment, no duplicates). the main lint job no longer runs htmlproofer, so it stays stable. author and reviewers can decide if any failures are worth blocking on.
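The "reruns update the same comment" behaviour can be sketched as an upsert keyed on a hidden marker. The Python below works on an in-memory list of comments rather than the GitHub API, and the `MARKER` string, the `upsert_report` name, and the comment shape (`{"body": ...}`) are all hypothetical; the PR's actual workflow may do this differently.

```python
from __future__ import annotations

# Hypothetical hidden marker that identifies the report comment on later runs.
MARKER = "<!-- htmlproofer-report -->"


def upsert_report(comments: list[dict], failing_urls: list[str]) -> list[dict]:
    """Create the report comment, or overwrite it if one already exists.

    comments stands in for the PR's existing comments; each is a dict
    with a "body" key. The list is modified in place and returned.
    """
    body = MARKER + "\n" + "\n".join(f"- {url}" for url in failing_urls)
    for comment in comments:
        if MARKER in comment["body"]:
            comment["body"] = body  # rerun: replace the previous report
            return comments
    comments.append({"body": body})  # first run: add a new comment
    return comments
```

Because the marker travels inside the comment body, a rerun needs no state beyond the PR's own comment list to find and replace the earlier report.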
Types of changes
Quality assurance (all should be covered).
Summary
Resolves #2032
The docs validate step in the CI pipeline is intermittently failing due to 503 Service Unavailable responses from valid external links.
This PR fixes the flakiness by configuring the mkdocs-htmlproofer-plugin to ignore 503 (Service Unavailable) and 504 (Gateway Timeout) HTTP status codes natively. Since these are temporary server-side issues rather than broken links in our documentation, explicitly excluding them from the failure checks stabilizes the CI without losing the benefits of validating 404s.
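The effect of excluding 503 and 504 from failure checks can be pictured as a simple partition of link-check results. The Python sketch below is illustrative only and is not the plugin's code; the `EXCLUDED` set mirrors the two codes named in this summary, and `partition_failures` and its input shape (URL mapped to status code) are assumptions.

```python
from __future__ import annotations

# Codes the summary proposes to exclude from failure checks.
EXCLUDED = {503, 504}


def partition_failures(results: dict[str, int]) -> tuple[dict[str, int], dict[str, int]]:
    """Split link-check results into (broken, ignored).

    broken holds URLs whose status is an error (>= 400) and not excluded;
    ignored holds URLs whose status is one of the excluded codes.
    """
    broken = {u: s for u, s in results.items() if s >= 400 and s not in EXCLUDED}
    ignored = {u: s for u, s in results.items() if s in EXCLUDED}
    return broken, ignored
```

Under this split, a 404 still fails the check, while a 503 from an otherwise valid site lands in the ignored bucket.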