Add script to detect external self-references in docs #765



Description
What is this PR
Why is this PR needed?
This PR addresses issue #645. Currently, when the movement documentation site (https://movement.neuroinformatics.dev/) is offline and needs to be redeployed, the Sphinx linkcheck tool fails because some documentation files contain external URLs referencing the movement site itself. This creates a circular dependency that prevents the docs from being rebuilt.
What does this PR do?
This PR adds automated detection of external self-references in documentation files. It creates:
A script (docs/check_self_references.py) that scans documentation files for external movement URLs and suggests using internal MyST cross-references instead
The script allowlists legitimate external URLs (like the version switcher JSON) while flagging URLs that should use internal cross-references (e.g., target-installation instead of https://movement.neuroinformatics.dev/latest/user_guide/installation.html).
References
Closes #645
Builds upon the work done in PR #642 which replaced some external links with internal targets.
How has this PR been tested?
Manual Testing:
python docs/check_self_references.py on current documentation - no violations found
conf.py and workflow files
cd docs && make selfcheck
Unit Tests:
Created comprehensive unit tests in tests/test_unit/test_docs/test_check_self_references.py:
TestFindSelfReferences - Detection of URLs in files, multiple URLs, anchors, line numbers
TestIsAllowed - Allowlist patterns for switcher JSON and regular URLs
TestSuggestTarget - URL-to-target mapping for known and unknown URLs
TestIntegration - Clean files and mixed content scenarios
Note: Unit tests were validated manually due to environment constraints. They will run properly in CI where project dependencies are installed.
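As a rough illustration of the URL-to-target mapping that TestSuggestTarget exercises, here is a minimal sketch. The function name suggest_target, its signature, and the KNOWN_TARGETS table are assumptions inferred from the test class name, not the script's actual code; only the installation entry comes from this description. The sketch also assumes the first path segment of a docs URL is a version such as latest.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup table: page path on the docs site -> MyST target.
# Only the installation entry is taken from the PR description.
KNOWN_TARGETS = {
    "user_guide/installation.html": "target-installation",
}


def suggest_target(url: str) -> Optional[str]:
    """Return a MyST target for a known docs URL, or None if unknown."""
    # urlparse separates any "#anchor" fragment from the path.
    path = urlparse(url).path.lstrip("/")
    # Assumption: the first path segment is a version (e.g. "latest").
    _version, _sep, page = path.partition("/")
    return KNOWN_TARGETS.get(page)
```

A known URL maps to its target and an unknown one yields None, which matches the "known and unknown URLs" cases listed above.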
Pre-commit Integration:
.pre-commit-config.yaml
.md and .rst files in docs/source/
Is this a breaking change?
No, this PR does not break any existing functionality. It only adds a new validation tool that can be run optionally via:
python docs/check_self_references.py (direct execution)
make selfcheck (via Makefile)
pre-commit run check-self-references (via pre-commit)
The current documentation already passes all checks.
Does this PR require an update to the documentation?
No documentation update is required because:
make_api.py)
However, developers can discover the tool via:
selfcheck target in docs/Makefile
pre-commit install)
Checklist:
Additional Notes
Design Decisions:
Script location: Placed in docs/ to follow the pattern of existing documentation tools (make_api.py, convert_admonitions.py)
README.md exclusion: The README is intentionally not checked because it's displayed on GitHub where MyST cross-references don't work - external URLs are necessary there
Allowlist approach: Version switcher URLs in conf.py and workflow files are legitimate and should not trigger violations
Direct file scanning vs. linkcheck parsing: Direct regex scanning is simpler, faster, and doesn't require linkcheck to have run first
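The direct-scanning and allowlist decisions above could look roughly like the sketch below. It is not the script's actual code: the function names mirror the unit test class names, and the regex, the switcher.json allowlist pattern, and the scan_docs helper are assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches external URLs pointing at the movement docs site.
SELF_URL = re.compile(r"https?://movement\.neuroinformatics\.dev[^\s)>\]]*")

# Assumed allowlist: the version switcher JSON is a legitimate external URL.
ALLOWED = (re.compile(r"switcher\.json$"),)


def is_allowed(url):
    """Return True if the URL matches any allowlist pattern."""
    return any(pattern.search(url) for pattern in ALLOWED)


def find_self_references(text):
    """Return (line_number, url) pairs for non-allowlisted self-references."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SELF_URL.finditer(line):
            # Drop sentence punctuation that may trail a bare URL.
            url = match.group(0).rstrip(".,;")
            if not is_allowed(url):
                hits.append((lineno, url))
    return hits


def scan_docs(root):
    """Scan .md and .rst files under root (hypothetical helper)."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.suffix in {".md", ".rst"}:
            hits = find_self_references(path.read_text(encoding="utf-8"))
            if hits:
                results[str(path)] = hits
    return results
```

Because this works on file contents directly, it can run before any Sphinx build, for example as scan_docs("docs/source"), and an empty result would mean no violations.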
Future Enhancements:
This tool could potentially be generalized and moved to the neuroinformatics-unit actions repository for use across multiple NIU projects (as mentioned in the original issue).