@Edu92337

Description

What is this PR?

  • Addition of a new feature

Why is this PR needed?

This PR addresses issue #645. Currently, when the movement documentation site (https://movement.neuroinformatics.dev/) is offline and needs to be redeployed, the Sphinx linkcheck tool fails because some documentation files contain external URLs referencing the movement site itself. This creates a circular dependency that prevents the docs from being rebuilt.

What does this PR do?

This PR adds automated detection of external self-references in documentation files. It creates:

  1. A Python script (docs/check_self_references.py) that scans documentation files for external movement URLs and suggests using internal MyST cross-references instead
  2. Integration with the project's Makefile and pre-commit hooks to catch violations early
  3. Comprehensive unit tests to ensure the detection logic works correctly

The script allowlists legitimate external URLs (such as the version switcher JSON) and flags URLs that should be internal cross-references instead (e.g., target-installation rather than https://movement.neuroinformatics.dev/latest/user_guide/installation.html).
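
For readers who want a feel for the detection logic, here is a minimal sketch of the three pieces described above (finding URLs, allowlisting, suggesting a target). It is not the code in docs/check_self_references.py. The function names are inferred from the test class names in this PR, and the regex, the allowlist pattern, and the single mapping entry are assumptions made for illustration.

```python
import re

# Assumption: any http(s) URL on the movement docs domain counts as a
# self-reference. The character class stops at whitespace and common
# closing delimiters so Markdown link syntax is not swallowed.
SELF_URL = re.compile(r"https?://movement\.neuroinformatics\.dev[^\s)>\"'\]]*")

# Assumption: the version switcher JSON is identified by its file name.
ALLOWED = [re.compile(r"switcher\.json$")]

# Assumption: a hand-maintained path -> MyST target mapping. Only this one
# pair is given in the PR description.
TARGETS = {
    "/latest/user_guide/installation.html": "target-installation",
}


def is_allowed(url):
    """Return True if the URL matches an allowlist pattern."""
    return any(pattern.search(url) for pattern in ALLOWED)


def suggest_target(url):
    """Return the MyST target for a known URL, or None if unmapped."""
    path = re.sub(r"^https?://movement\.neuroinformatics\.dev", "", url)
    path = path.split("#", 1)[0]  # ignore any anchor when looking up
    return TARGETS.get(path)


def find_self_references(text):
    """Return (line_number, url, suggested_target) for each violation."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SELF_URL.finditer(line):
            url = match.group(0)
            if not is_allowed(url):
                hits.append((lineno, url, suggest_target(url)))
    return hits
```

Calling `find_self_references` on a page's text yields one tuple per flagged URL, with `None` as the suggestion when no mapping is known.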

References

Closes #645

Builds upon the work done in PR #642 which replaced some external links with internal targets.

How has this PR been tested?

Manual Testing:

  • Ran python docs/check_self_references.py on current documentation - no violations found
  • Verified allowlist correctly excludes version switcher URLs in conf.py and workflow files
  • Confirmed target suggestions work for all mapped URLs
  • Tested that README.md is correctly excluded (it intentionally uses external URLs for GitHub display)
  • Created test files with violations to verify detection works
  • Verified Makefile integration: cd docs && make selfcheck

Unit Tests:

Created comprehensive unit tests in tests/test_unit/test_docs/test_check_self_references.py:

  • TestFindSelfReferences - Detection of URLs in files, multiple URLs, anchors, line numbers
  • TestIsAllowed - Allowlist patterns for switcher JSON and regular URLs
  • TestSuggestTarget - URL-to-target mapping for known and unknown URLs
  • TestIntegration - Clean files and mixed content scenarios
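
As a rough illustration of the kind of cases listed above (line numbers, anchors, multiple URLs, clean files), the sketch below uses a stand-in `find_self_references` that takes a file path. The stand-in, its signature, and the example URLs are assumptions and are not copied from the test module in this PR.

```python
import re
import tempfile
from pathlib import Path

# Stand-in for the function under test. The real one lives in
# docs/check_self_references.py and may have a different signature.
URL = re.compile(r"https?://movement\.neuroinformatics\.dev\S*")


def find_self_references(path):
    """Return (line_number, url) pairs for every movement URL in a file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [
        (lineno, match.group(0))
        for lineno, line in enumerate(lines, start=1)
        for match in URL.finditer(line)
    ]


def _write(directory, name, text):
    """Write a small fixture file and return its path."""
    path = Path(directory) / name
    path.write_text(text, encoding="utf-8")
    return path


def test_reports_line_number_and_keeps_anchor():
    url = "https://movement.neuroinformatics.dev/latest/index.html#overview"
    with tempfile.TemporaryDirectory() as tmp:
        doc = _write(tmp, "page.md", "# Title\n\nSee " + url + "\n")
        assert find_self_references(doc) == [(3, url)]


def test_multiple_urls_on_one_line():
    line = (
        "https://movement.neuroinformatics.dev/a "
        "https://movement.neuroinformatics.dev/b\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        doc = _write(tmp, "page.rst", line)
        assert [lineno for lineno, _ in find_self_references(doc)] == [1, 1]


def test_clean_file_has_no_hits():
    with tempfile.TemporaryDirectory() as tmp:
        doc = _write(tmp, "clean.md", "Only an internal {ref} link here.\n")
        assert find_self_references(doc) == []
```

The functions follow the `test_*` naming convention that pytest collects, but they use only the standard library and can also be called directly.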

Note: Unit tests were validated manually due to environment constraints. They will run properly in CI where project dependencies are installed.

Pre-commit Integration:

  • Added local hook configuration in .pre-commit-config.yaml
  • Hook triggers on .md and .rst files in docs/source/
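
pre-commit selects files for a hook by searching each path with a Python regular expression, so the trigger rule above can be previewed in plain Python. The pattern below is an assumption: this description does not show the actual `files` value in .pre-commit-config.yaml, and the example paths are hypothetical.

```python
import re

# Assumed pattern for "Markdown and reST files under docs/source/".
FILES_PATTERN = re.compile(r"^docs/source/.*\.(md|rst)$")


def hook_would_run(path):
    """Return True if a repo-relative path matches the assumed pattern."""
    return FILES_PATTERN.search(path) is not None


# Hypothetical repo-relative paths, for illustration only.
for candidate in [
    "docs/source/index.rst",
    "docs/source/user_guide/installation.md",
    "docs/source/conf.py",
    "README.md",
]:
    print(candidate, hook_would_run(candidate))
```

Under this pattern the top-level README.md never reaches the hook, which is consistent with the README exclusion noted under Manual Testing.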

Is this a breaking change?

No, this PR does not break any existing functionality. It only adds a new validation tool that can be run optionally via:

  • python docs/check_self_references.py (direct execution)
  • make selfcheck (via Makefile)
  • pre-commit run check-self-references (via pre-commit)

The current documentation already passes all checks.

Does this PR require an update to the documentation?

No documentation update is required because:

  1. This is a developer-facing tool, not a user-facing feature
  2. The tool is self-documenting with clear error messages
  3. Usage is straightforward and follows existing project patterns (similar to make_api.py)

However, developers can discover the tool via:

  • The selfcheck target in docs/Makefile
  • The pre-commit hook (when they run pre-commit install)
  • This PR description and the code comments

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes (N/A - tool is self-documenting)
  • The code has been formatted with pre-commit

Additional Notes

Design Decisions:

  1. Script location: Placed in docs/ to follow the pattern of existing documentation tools (make_api.py, convert_admonitions.py)

  2. README.md exclusion: The README is intentionally not checked because it is rendered on GitHub, where MyST cross-references do not resolve, so external URLs are necessary there

  3. Allowlist approach: Version switcher URLs in conf.py and workflow files are legitimate and should not trigger violations

  4. Direct file scanning vs. linkcheck parsing: Direct regex scanning is simpler, faster, and doesn't require linkcheck to have run first
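
To make decisions 2 and 4 concrete, the following sketch walks a directory tree, skips README.md, scans the remaining .md and .rst files with a regex, and returns a non-zero status when anything is found. The function names, the suffix set, and the output format are assumptions, and allowlisting is left out for brevity. It does not reproduce the script in this PR.

```python
import re
from pathlib import Path

URL = re.compile(r"https?://movement\.neuroinformatics\.dev\S*")
SUFFIXES = {".md", ".rst"}       # assumed set of scanned file types
EXCLUDED_NAMES = {"README.md"}   # rendered on GitHub, so external URLs stay


def scan_tree(root):
    """Return (relative_path, line_number, url) for each URL found."""
    violations = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix not in SUFFIXES or path.name in EXCLUDED_NAMES:
            continue
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in URL.finditer(line):
                rel = path.relative_to(root).as_posix()
                violations.append((rel, lineno, match.group(0)))
    return violations


def main(root):
    """Print each violation and return an exit status (0 means clean)."""
    violations = scan_tree(root)
    for rel, lineno, url in violations:
        print(f"{rel}:{lineno}: external self-reference {url}")
    return 1 if violations else 0
```

Because the scan reads source files directly, it needs no prior Sphinx build or linkcheck output. A wrapper could pass the return value of `main` to `sys.exit` so that a Makefile target or pre-commit hook fails on violations.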

Future Enhancements:

This tool could potentially be generalized and moved to the neuroinformatics-unit actions repository for use across multiple NIU projects (as mentioned in the original issue).

@niksirbi niksirbi mentioned this pull request Jan 20, 2026
7 tasks
codecov bot commented Jan 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (665412f) to head (c5e6932).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #765   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           34        34           
  Lines         2111      2111           
=========================================
  Hits          2111      2111           

Development

Successfully merging this pull request may close these issues.

Detect references to the movement external URL in docs
