Add schema-docs auto-generation with pre-commit and CI #2949

koxudaxi · 2026-01-07T00:43:24Z

Summary by CodeRabbit

Chores
- Automated workflow added to build and optionally commit updated schema documentation; pre-commit hook runs the same generation locally.
Tests
- New test/task environment to build/validate schema documentation, including a check-only mode for CI validation.
Documentation
- Inserted an auto-generated "Supported Features" section summarizing JSON Schema and OpenAPI feature support; doc generation now supports marker-based updates.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-07T00:43:34Z

📝 Walkthrough

Walkthrough

Adds automated schema documentation generation: a GitHub Actions workflow, pre-commit hook, tox env, and a script change that atomically updates or validates an auto-generated section in docs/supported_formats.md.

Changes

Cohort / File(s)	Summary
CI Workflow `\`.github/workflows/schema-docs.yaml``	New workflow "Update Schema Docs" with triggers (push, pull_request, pull_request_target), detailed execution conditions, fork vs same-repo checkout handling, environment setup, schema docs build, and conditional commit/push of `docs/supported_formats.md`.
Pre-commit & Lint Integration `\`.pre-commit-config.yaml``,` `.github/workflows/lint.yaml``	Added local pre-commit hook `schema-docs` to run `scripts/build_schema_docs.py` (preferring `.tox/schema-docs` Python); extended lint workflow SKIP list to include `schema-docs`.
Test Matrix `\`tox.ini``	Added `[testenv:schema-docs]` to run `python scripts/build_schema_docs.py {posargs}` with a `--check` validation mode.
Docs File `\`docs/supported_formats.md``	Inserted an auto-generated "SUPPORTED FEATURES" section bounded by BEGIN/END markers containing JSON Schema and OpenAPI feature tables.
Docs Build Script `\`scripts/build_schema_docs.py``	Refactored to add `BEGIN_MARKER`/`END_MARKER`, `generate_supported_features_content()`, `generate_supported_features_table()`, `update_docs_file(check: bool)`, added `re` import, and switched main flow to update `docs/supported_formats.md` atomically with a check mode.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Dev as Developer (local)
    participant PreCommit as pre-commit
    participant Tox as tox/.tox/schema-docs
    participant Script as build_schema_docs.py
    participant Repo as Git Repository (files)
    participant Actions as GitHub Actions
    participant Git as Git (commit/push)

    rect rgb(240,248,255)
      Dev->>PreCommit: commit / run pre-commit
      PreCommit->>Tox: invoke schema-docs env or system python
      Tox->>Script: run build_schema_docs.py [--check]
      Script->>Repo: read `docs/supported_formats.md`
      alt check_mode
        Script-->>PreCommit: exit non-zero if outdated
      else update_mode
        Script->>Repo: write updated `docs/supported_formats.md`
      end
    end

    rect rgb(245,255,240)
      Actions->>Repo: triggered (push / PR / labeled PR)
      Actions->>Tox: setup env & run schema-docs env
      Tox->>Script: run build_schema_docs.py
      Script->>Repo: read/compare file contents
      alt changes_detected && actor_allowed
        Script->>Git: stage file
        Actions->>Git: commit (actions bot) & push changes
      else no_changes_or_blocked
        Script-->>Actions: exit (no-op)
      end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

build: Replace pre-commit with prek #2702 — Related CI/pre-commit tooling changes touching .pre-commit-config.yaml, tox.ini, and workflows; likely overlaps in schema-docs / workflow adjustments.

Poem

🐰
I zipped through markers, tidy and neat,
Hop-scoped the docs where schemas meet,
Hooks hum softly, workflows hum too,
Fresh supported features — carrots for you! 🥕

Pre-merge checks

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely describes the main change: adding automated schema documentation generation integrated with pre-commit hooks and CI workflows.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Generated by GitHub Actions

github-actions · 2026-01-07T00:44:35Z

📚 Docs Preview: https://pr-2949.datamodel-code-generator.pages.dev

codspeed-hq · 2026-01-07T00:46:57Z

Merging this PR will not alter performance

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 11 untouched benchmarks
⏩ 98 skipped benchmarks¹

_{Comparing feature/schema-docs-automation (dbceca9) with main (fd5e698)}

98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

codecov · 2026-01-07T00:51:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (fd5e698) to head (dbceca9).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##              main     #2949   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           94        94           
  Lines        17696     17696           
  Branches      2037      2037           
=========================================
  Hits         17696     17696

Flag	Coverage Δ
unittests	`100.00% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In @.github/workflows/schema-docs.yaml:
- Around line 65-74: The Commit and push if changed step should explicitly
handle the PAT requirement and avoid pushing when there are no commits: add an
explicit check for the PAT secret used in the checkout (token: ${{ secrets.PAT
}}) and fail or emit a clear message if it is not set (so protected-branch
pushes don't silently fall back to GITHUB_TOKEN), and replace the one-liner "git
diff --staged --quiet || git commit -m ...; git push" with an explicit
conditional that runs git commit and git push only when git diff --staged
--quiet indicates changes (e.g., if git diff --staged --quiet; then echo "no
changes"; else git commit -m ... && git push; fi), referencing the workflow step
name "Commit and push if changed" and the use of token: ${{ secrets.PAT }} and
the git commands to locate and update the logic.

🧹 Nitpick comments (2)

.github/workflows/schema-docs.yaml (1)

40-54: Potential issue with overlapping checkout conditions.

The two checkout steps have conditions that could theoretically both evaluate to true for same-repo PRs. While the if conditions are mutually exclusive in practice, the logic is complex and could lead to confusion.

Consider simplifying by using a single checkout step with conditional parameters, or ensure the conditions are truly mutually exclusive by restructuring them.

🔎 Suggested simplification

-      # Checkout for forks (no PAT available)
-      - uses: actions/checkout@v4
-        if: github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name != github.repository
-        with:
-          fetch-depth: 0
-          ref: ${{ github.event.pull_request.head.ref }}
-          repository: ${{ github.event.pull_request.head.repo.full_name }}
-      # Checkout for same-repo PRs, pushes, and pull_request_target
-      - uses: actions/checkout@v4
-        if: github.event_name == 'push' || github.event_name == 'pull_request_target' || github.event.pull_request.head.repo.full_name == github.repository
-        with:
-          fetch-depth: 0
-          ref: ${{ github.event.pull_request.head.ref || github.ref }}
-          repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
-          token: ${{ secrets.PAT }}
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          ref: ${{ github.event.pull_request.head.ref || github.ref }}
+          repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }}
+          token: ${{ (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository) && secrets.PAT || github.token }}

scripts/build_schema_docs.py (1)

122-145: Consider using append instead of extend with tuple on line 143.

The function correctly generates the content, but line 143 uses extend() with a tuple containing two elements. While this works, it's more idiomatic to use append() for each element or extend() with a list.
🔎 Suggested refactor
     # Get only OpenAPI-specific fields
     json_field_names = {f.name for f in fields(JsonSchemaFeatures)}
     openapi_features = [
         (name, meta) for name, meta in get_feature_metadata(OpenAPISchemaFeatures) if name not in json_field_names
     ]
-    lines.extend((generate_feature_table(openapi_features), ""))
+    lines.append(generate_feature_table(openapi_features))
+    lines.append("")

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd5e698 and ce2b19d.

⛔ Files ignored due to path filters (1)

docs/llms-full.txt is excluded by none and included by none

📒 Files selected for processing (5)

.github/workflows/schema-docs.yaml
.pre-commit-config.yaml
docs/supported_formats.md
scripts/build_schema_docs.py
tox.ini

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: combine coverage

🔇 Additional comments (7)

docs/supported_formats.md (1)

93-117: LGTM! Auto-generated documentation section is well-structured.

The auto-generated section is properly enclosed with markers and contains well-formatted feature tables. The content aligns with the automation introduced in this PR.

.github/workflows/schema-docs.yaml (1)

28-37: Good security design for handling fork PRs.

The job condition properly restricts auto-commit functionality to:

Direct pushes to main

Same-repo PRs

Trusted maintainers

Labeled fork PRs (with safe-to-fix label applied by maintainers)

This prevents untrusted fork PRs from automatically committing changes, which is a good security practice.

tox.ini (1)

114-120: LGTM! Clean and minimal test environment configuration.

The schema-docs environment is appropriately configured with no dependencies (the script uses only standard library + existing project code) and supports both update and check modes via {posargs}.

.pre-commit-config.yaml (1)

46-51: LGTM! Pre-commit hook follows established patterns.

The hook configuration is consistent with the existing readme and config-types hooks in the same file:

Uses the same pattern for locating the Python interpreter in the tox virtualenv

Properly handles both Unix and Windows path conventions

Triggers only on relevant file changes

Falls back to creating the tox environment if needed

scripts/build_schema_docs.py (3)

14-14: LGTM! Proper imports and marker constants.

The re module import is necessary for the pattern matching in update_docs_file(), and the marker constants are clearly defined for the auto-generated content boundaries.

Also applies to: 30-32

153-202: Excellent implementation of atomic docs update with check mode.

The update_docs_file() function is well-designed:

Validates file existence before attempting operations

Uses regex with DOTALL flag to correctly match multi-line content

Provides clear error messages when markers are missing

Implements check mode for CI validation (returns 1 if out of date)

Only writes when content has actually changed

Returns appropriate exit codes for shell usage

The atomic read-update-write pattern ensures the file is only modified when necessary, and the check mode enables CI validation without side effects.

205-224: LGTM! Main function properly delegates to update_docs_file.

The refactored main() function correctly:

Parses command-line arguments for --check and --summary modes

Delegates to update_docs_file() with the check flag

Returns the exit code for proper shell integration

This enables both local usage (updating docs) and CI usage (validating docs are current).

.github/workflows/schema-docs.yaml

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

.github/workflows/schema-docs.yaml (1)
28-37: Review actor-based security logic for pull_request_target.

The condition allows execution when github.actor matches trusted usernames (lines 31-33), which applies to all event types. For pull_request_target, github.actor represents the PR author, not the labeler. This means:

If an attacker creates a username matching a trusted actor (e.g., creates a new account "koxudaxi" on a different domain), they could potentially bypass the label check

The label-based check (lines 34-37) already validates the sender properly using github.event.sender.login

Consider tightening the logic to scope actor checks to specific event types:
🔎 Suggested refinement
     if: |
       github.event_name == 'push' ||
-      !github.event.pull_request.head.repo.fork ||
-      github.actor == 'koxudaxi' ||
-      github.actor == 'gaborbernat' ||
-      github.actor == 'ilovelinux' ||
+      (github.event_name == 'pull_request' && !github.event.pull_request.head.repo.fork) ||
+      (github.event_name == 'pull_request' && (github.actor == 'koxudaxi' || github.actor == 'gaborbernat' || github.actor == 'ilovelinux')) ||
       (github.event_name == 'pull_request_target' && github.event.label.name == 'safe-to-fix' &&
        (github.event.sender.login == 'koxudaxi' ||
         github.event.sender.login == 'gaborbernat' ||
         github.event.sender.login == 'ilovelinux'))
This ensures actor-based bypass only applies to pull_request events, while pull_request_target strictly requires the label AND trusted sender.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50f7ae0 and dbceca9.

📒 Files selected for processing (1)

.github/workflows/schema-docs.yaml

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: py312-black24 on Ubuntu
GitHub Check: py312-pydantic1 on Ubuntu
GitHub Check: 3.10 on Windows
GitHub Check: 3.14 on Ubuntu
GitHub Check: 3.12 on Ubuntu
GitHub Check: 3.12 on Windows
GitHub Check: py312-black22 on Ubuntu
GitHub Check: 3.13 on Ubuntu
GitHub Check: 3.11 on Windows
GitHub Check: 3.14 on Windows
GitHub Check: 3.13 on Windows
GitHub Check: benchmarks
GitHub Check: Analyze (python)

🔇 Additional comments (5)

.github/workflows/schema-docs.yaml (5)
23-24: LGTM: Appropriate permissions for auto-commit workflow.

The contents: write permission is correctly set for this workflow, which needs to commit and push documentation updates.

48-54: Validate PAT availability for protected branch scenarios.

The workflow uses secrets.PAT to enable pushing to potentially protected branches. However, if the PAT secret is not configured in the repository, the checkout will silently fall back to GITHUB_TOKEN, which may lack sufficient permissions to push to protected branches.

Consider adding an explicit validation step to verify the PAT is available when needed:
🔎 Suggested validation step

Add this step after the checkout steps:
      - name: Validate PAT for protected branches
        if: github.event_name == 'push' || github.event_name == 'pull_request_target' || github.event.pull_request.head.repo.full_name == github.repository
        run: |
          if [ -z "${{ secrets.PAT }}" ]; then
            echo "::warning::PAT secret is not configured. Push to protected branches may fail."
            echo "::warning::Set the PAT secret if the target branch requires bypass permissions."
          fi
This provides clear feedback to maintainers if the secret is missing.
55-64: LGTM: Clean setup and build steps.

The installation and setup sequence is well-structured:

Uses the official uv setup action

Pins Python 3.13 for the tox tool installation

Creates an isolated tox environment for schema-docs

Executes the build script from the tox environment

This approach ensures reproducible builds with explicit dependencies.

65-76: LGTM: Proper conditional commit logic.

The commit and push step correctly:

Checks for staged changes before committing (line 71: if ! git diff --staged --quiet)

Only pushes when there are actual changes to commit (lines 72-76)

Uses the appropriate bot identity for automated commits

This addresses the previous review feedback about avoiding unnecessary git operations when there are no changes.

16-21: Path filters are still evaluated with pull_request_target and types: [labeled].

The original concern incorrectly stated that path filters may be ignored when combined with types: [labeled]. According to GitHub documentation, path filters are still evaluated against the PR's merge comparison regardless of the event type. This means the workflow will only trigger when both conditions are met: a label is added AND the specified files are changed.

If the intent is to allow maintainers to manually trigger the workflow on any PR using the "safe-to-fix" label, consider removing the path filters from the pull_request_target trigger for clarity, since the job's conditional already restricts access to specific maintainers and the label check is explicit in the conditional logic.

Likely an incorrect or invalid review comment.

github-actions · 2026-01-07T02:26:14Z

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: This PR adds schema-docs auto-generation with pre-commit hooks and CI workflows. All changes are to developer tooling (GitHub workflows, pre-commit config, tox.ini, and internal scripts) and documentation files. No changes affect the public API, CLI, code generation output, templates, default behavior, Python version support, or error handling. This is purely a contributor/developer experience improvement that has no impact on end users of the datamodel-code-generator package.

This analysis was performed by Claude Code Action

Add schema-docs auto-generation with pre-commit and CI

3232e2f

docs: update llms.txt files

ce2b19d

Generated by GitHub Actions

coderabbitai bot reviewed Jan 7, 2026

View reviewed changes

.github/workflows/schema-docs.yaml Outdated Show resolved Hide resolved

koxudaxi added 2 commits January 7, 2026 02:01

Add schema-docs to lint SKIP list

50f7ae0

Improve conditional git handling in schema-docs workflow

dbceca9

coderabbitai bot reviewed Jan 7, 2026

View reviewed changes

koxudaxi merged commit abac763 into main Jan 7, 2026
38 checks passed

koxudaxi deleted the feature/schema-docs-automation branch January 7, 2026 02:25

github-actions bot added the breaking-change-analyzed label Jan 7, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Add schema-docs auto-generation with pre-commit and CI #2949

Add schema-docs auto-generation with pre-commit and CI #2949

Uh oh!

koxudaxi commented Jan 7, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 7, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

github-actions bot commented Jan 7, 2026 •

edited

Loading

Uh oh!

codspeed-hq bot commented Jan 7, 2026 •

edited

Loading

Uh oh!

codecov bot commented Jan 7, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

github-actions bot commented Jan 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Add schema-docs auto-generation with pre-commit and CI #2949

Add schema-docs auto-generation with pre-commit and CI #2949

Uh oh!

Conversation

koxudaxi commented Jan 7, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks

Uh oh!

github-actions bot commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codspeed-hq bot commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Summary

Footnotes

Uh oh!

codecov bot commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions bot commented Jan 7, 2026

Breaking Change Analysis

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

koxudaxi commented Jan 7, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 7, 2026 •

edited

Loading

github-actions bot commented Jan 7, 2026 •

edited

Loading

codspeed-hq bot commented Jan 7, 2026 •

edited

Loading

codecov bot commented Jan 7, 2026 •

edited

Loading