Skip to content

Migrate docs from monarch-ingest to monarch-app#1258

Draft
kevinschaper wants to merge 4 commits intomainfrom
consolidate-ingest-docs
Draft

Migrate docs from monarch-ingest to monarch-app#1258
kevinschaper wants to merge 4 commits intomainfrom
consolidate-ingest-docs

Conversation

@kevinschaper
Copy link
Member

  • added fetch-docs to makefile
  • doc migration from monarch-ingest

@netlify
Copy link

netlify bot commented Feb 13, 2026

Deploy Preview for monarch-app ready!

Name Link
🔨 Latest commit e9c95dd
🔍 Latest deploy log https://app.netlify.com/projects/monarch-app/deploys/6997a299adeb0800080423a0
😎 Deploy Preview https://deploy-preview-1258--monarch-app.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@claude
Copy link

claude bot commented Feb 13, 2026

Pull Request Review: Migrate docs from monarch-ingest to monarch-app

Summary

This PR migrates documentation from monarch-ingest to monarch-app, adding KG build process documentation, modeling principles, and a framework for automatically fetching source documentation from ingest repositories.

Code Quality & Best Practices

Strengths:

  • Well-structured documentation with clear organization
  • Good use of MkDocs navigation structure in mkdocs.yaml
  • Appropriate .gitignore entries for generated files
  • YAML manifest (sources-manifest.yaml) follows good declarative configuration practices

Issues Found:

1. CRITICAL: Missing Python Script 🔴

The Makefile references scripts/fetch-source-docs.py at line 93:

fetch-docs:
	$(RUN) python $(ROOTDIR)/scripts/fetch-source-docs.py

However, this script does not exist in the PR. This will cause the make fetch-docs and make docs targets to fail.

Recommendation: Add the scripts/fetch-source-docs.py script that reads docs/sources-manifest.yaml and fetches README files from the specified repositories.

2. Spelling Error 📝

In docs/Principles/modeling-principles.md:11-12:

  • "Authoratative" should be "Authoritative" (appears twice)
  • "seperate" should be "separate" (line 13)

3. Spelling Error in KG Build Process 📝

In docs/KG-Build-Process/kg-build-process.md:6:

  • "indepent" should be "independent"
  • "then" should be "them"

Line 67: "danging" should be "dangling"

4. Inconsistent Indentation in mkdocs.yaml

The mkdocs.yaml file mixes spaces and maintains consistent 2-space indentation, which is good. However, verify that all navigation entries are properly aligned.

Potential Bugs & Issues

  1. Broken Build Process: The docs target depends on fetch-docs, but without the Python script, documentation builds will fail in CI/CD pipelines.

  2. Generated Files Not Ignored Properly: The .gitignore pattern docs/Sources/*.md with exception !docs/Sources/index.md is correct, but the index.md file is included in the PR, which is good. Ensure the fetch script respects this pattern.

  3. Dead Links Risk: The documentation references multiple external GitHub repositories and URLs. While most appear valid, the following should be verified:

    • Architecture image path in kg-build-process.md (line 3: architecture.png)
    • All external repository links in sources-manifest.yaml

Performance Considerations

  1. Network I/O: The fetch-docs script (when implemented) will make 25+ GitHub API calls. Consider:

    • Adding rate limiting/retry logic
    • Implementing caching to avoid unnecessary fetches
    • Using GitHub API tokens to increase rate limits
    • Making requests in parallel where possible
  2. Build Time: The docs target now depends on fetch-docs, which will add network latency to the build. Consider:

    • Making fetch-docs optional or separate from local doc builds
    • Caching fetched documentation with timestamps

Security Concerns

  1. Untrusted Content: The fetch script will pull README content from external repositories. Recommendations:

    • Validate that fetched content is actually markdown
    • Sanitize any HTML in the markdown if processed
    • Verify repository ownership before fetching (all should be monarch-initiative org)
    • Consider pinning to specific commits rather than branch heads for reproducibility
  2. Credentials: If the script uses GitHub API, ensure tokens are:

    • Loaded from environment variables, not hardcoded
    • Have minimal required permissions (read-only public repos)
    • Properly documented in setup instructions

Test Coverage

Missing:

  • No tests for the fetch-source-docs.py script (which doesn't exist yet)
  • No validation that the YAML manifest is well-formed
  • No checks that generated markdown files are valid

Recommendations:

  1. Add unit tests for the fetch script when implemented
  2. Add a test that validates sources-manifest.yaml schema
  3. Consider adding a CI check that runs make fetch-docs and verifies output
  4. Add tests to ensure all links in migrated docs are valid

Documentation Quality

Positive:

  • Clear explanation of the KG build process
  • Good architectural diagram inclusion
  • Well-organized sources manifest
  • Helpful inline comments in YAML files

Suggestions:

  1. Add README or comment explaining how to add new sources to the manifest
  2. Document the fetch-docs process in a contributing guide
  3. Add troubleshooting section for common fetch failures

Additional Recommendations

  1. Implementation Priority: Before merging, implement the missing scripts/fetch-source-docs.py. A basic implementation should:

    # Read sources-manifest.yaml
    # For each source:
    #   - Fetch README from repo/branch
    #   - Save to docs/Sources/{filename}
    # Handle errors gracefully
  2. CI/CD: Add a GitHub Actions workflow to:

    • Run make fetch-docs periodically to keep docs fresh
    • Verify the fetch script works on PR builds (or mock it)
  3. Consistency: Ensure the new doc structure aligns with existing monarch-app documentation conventions

  4. Changelog: Consider adding a CHANGELOG entry for this documentation migration

Verdict

Do Not Merge until the missing scripts/fetch-source-docs.py is added. The PR is otherwise well-structured with good documentation content, but the critical missing script will break the build process.

After adding the script and fixing the spelling errors, this will be a solid addition to the documentation infrastructure.


Files to Fix:

  • Add scripts/fetch-source-docs.py
  • Fix spelling: "Authoratative" → "Authoritative" in modeling-principles.md
  • Fix spelling: "seperate" → "separate" in modeling-principles.md
  • Fix spelling: "indepent" → "independent" in kg-build-process.md
  • Fix spelling: "then" → "them" in kg-build-process.md
  • Fix spelling: "danging" → "dangling" in kg-build-process.md
  • Add tests for fetch script
  • Consider adding CI workflow for periodic doc updates

@codecov
Copy link

codecov bot commented Feb 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.51%. Comparing base (3453d6a) to head (e9c95dd).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1258   +/-   ##
=======================================
  Coverage   73.51%   73.51%           
=======================================
  Files          95       95           
  Lines        3417     3417           
=======================================
  Hits         2512     2512           
  Misses        905      905           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@claude
Copy link

claude bot commented Feb 17, 2026

Follow-up Review (after latest commit: rework mkdocs.yaml top level order)

The mkdocs.yaml navigation reordering looks reasonable — putting KG Data Sources and KG Build sections first makes sense given this PR's focus.

Remaining Blocker

The scripts/fetch-source-docs.py script is still missing from the PR. The Makefile references it in the fetch-docs target. Without this script, both make fetch-docs and make docs will fail. This needs to be added before merging.

Minor Typos (still unaddressed)

  • docs/Principles/modeling-principles.md: 'Authoratative' should be 'Authoritative', 'seperate' should be 'separate'
  • docs/KG-Build-Process/kg-build-process.md: 'indepent' should be 'independent', 'store then' should be 'store them', 'danging edges' should be 'dangling edges' (line 67)

mkdocs.yaml Navigation Note

The new ordering places FastAPI before monarch-py. If monarch-py is the primary library documentation, keeping it higher in the nav may be worth considering — though this is a judgment call for the team.

The documentation content itself (KG build process, modeling principles, sources manifest) is well-written and a useful addition. Once the missing script is added, this PR will be in good shape to merge.

@claude
Copy link

claude bot commented Feb 19, 2026

PR Review: Migrate docs from monarch-ingest to monarch-app

This PR is a solid step toward consolidating documentation. The overall structure is well-thought-out, and the manifest-driven approach for fetching ingest READMEs is clean. A few issues to address before merging:


Critical: Missing scripts/fetch-source-docs.py

The PR adds a make fetch-docs target in the Makefile that calls the script at scripts/fetch-source-docs.py, but this script is not included in the PR. The diff contains 9 changed files and the script is not among them. The local scripts/ directory only has generate_category_enums.py, generate_fixtures.py, and get_publications.py.

This means both make fetch-docs and make docs will fail immediately, since docs now depends on fetch-docs:

docs: install-backend docs/Data-Model fetch-docs  # fetch-docs calls the missing script

The script needs to be added to the PR. Based on the manifest structure (docs/sources-manifest.yaml), it should read each entry and fetch the corresponding README.md from GitHub raw URLs, writing them into docs/Sources/.


Bug: MMRRC missing from mkdocs.yaml nav

The manifest (docs/sources-manifest.yaml) includes mmrrc-ingest -> mmrrc.md, but mmrrc.md is not listed in the mkdocs.yaml nav. The file will be fetched and written to docs/Sources/mmrrc.md but won't be reachable through the documentation site navigation.


make docs now requires network access

Adding fetch-docs as a dependency of docs means building documentation now makes network calls to GitHub. This could be problematic in offline environments, CI pipelines subject to GitHub API rate limits, or restricted build environments.

Consider whether fetch-docs should remain a separate, manually-run step rather than an automatic dependency of docs. The .gitignore entries already treat the fetched files as optional generated artifacts, which suggests this intent. Removing the fetch-docs dependency from docs would restore the original offline-capable build while still allowing make fetch-docs to be run explicitly.


Typos in migrated content

A few typos carried over from the source docs worth fixing during migration:

docs/KG-Build-Process/kg-build-process.md:

  • "A weekly job indepent" -> "independent"
  • "store then on a cloud bucket" -> "store them on a cloud bucket"
  • "danging and denormalized edges" -> "dangling and denormalized edges"

docs/Principles/modeling-principles.md:

  • "Authoratative Source" (heading and body) -> "Authoritative Source"
  • "seperate from edge ingests" -> "separate from edge ingests"

Suggestion: Reduce manifest/nav sync burden

Currently, adding a new source requires updating two files: docs/sources-manifest.yaml (for fetching) and mkdocs.yaml (for navigation). This is already evident from the MMRRC omission above. Consider having the fetch script auto-generate the nav section for KG Data Sources in mkdocs.yaml, or at least add a comment to the manifest reminding contributors to also update the nav.


Minor: docs/sources-manifest.yaml location

This file is config for the fetch script rather than documentation content. scripts/sources-manifest.yaml or a top-level location might be clearer, though this is a minor preference.


Overall the approach is good and the new content is a welcome addition. The missing script is a blocker — once that is added and the MMRRC nav entry is included, this should be in good shape.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments