
Conversation

@dgarros dgarros commented Oct 15, 2025

This PR refactors how we generate and validate the GraphQL schema by replacing the current invoke-based approach with a pure Python approach that has fewer dependencies.

Key Improvements

⚡ Faster CI Pipeline

Before: CI had to build Docker images, start a Neo4j database, and install npm packages just to generate a schema
After: CI now generates schemas directly without any database or Docker setup
Result: Significantly faster CI execution and reduced resource usage

🧹 Cleaner Dependencies

Removed: the npm package (format-graphql) that was only used for schema formatting
Removed: Database connection requirements for schema generation
Result: Simpler project setup and fewer dependencies to maintain

🎯 Better Schema Sorting

Before: Used regex patterns to sort GraphQL interfaces (unreliable)
After: Uses proper AST-based sorting for all schema elements
Result: More accurate and comprehensive sorting of interfaces, types, fields, and enum values
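
As a rough illustration of what AST-based sorting looks like with graphql-core, here is a minimal sketch; the helper name sort_sdl is illustrative only and not the exact function added in this PR.

# Minimal sketch of AST-based SDL sorting with graphql-core.
# `sort_sdl` is an illustrative name, not the helper introduced by this PR.
from graphql import DocumentNode, ObjectTypeDefinitionNode, parse, print_ast


def sort_sdl(sdl: str) -> str:
    """Return the SDL with definitions, interfaces, and fields sorted alphabetically."""
    doc = parse(sdl)

    def def_name(node) -> str:
        name = getattr(node, "name", None)
        return name.value if name else ""

    sorted_defs = []
    for definition in sorted(doc.definitions, key=def_name):
        if isinstance(definition, ObjectTypeDefinitionNode) and definition.fields:
            # Rebuild the node with its interfaces and fields sorted by name
            definition = ObjectTypeDefinitionNode(
                name=definition.name,
                description=definition.description,
                directives=definition.directives,
                interfaces=tuple(sorted(definition.interfaces or (), key=lambda i: i.name.value)),
                fields=tuple(sorted(definition.fields, key=lambda f: f.name.value)),
            )
        sorted_defs.append(definition)

    return print_ast(DocumentNode(definitions=tuple(sorted_defs)))


print(sort_sdl("type B { b: Int a: Int } type A { x: Int }"))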

🔄 Maintained Compatibility

Invoke tasks still work:
poetry run invoke schema.generate-graphqlschema and poetry run invoke schema.validate-graphqlschema
Same output: Generated schemas are identical but now properly sorted
No breaking changes: Existing workflows continue to work

What This Means for You

Faster development: CI runs faster, so you get feedback quicker
More reliable: Schema sorting is now more accurate and comprehensive
Easier maintenance: Fewer dependencies and simpler codebase
Same experience: All existing commands and workflows continue to work

Summary by CodeRabbit

  • New Features

    • GraphQL schema endpoint now supports an optional sorting parameter for alphabetically organized schema output.
    • New internal development CLI commands added for schema export, database initialization, and test data loading.
  • Chores

    • Simplified CI workflow by removing redundant setup steps.
    • Reorganized CLI database commands and development utilities.

@dgarros dgarros requested a review from a team as a code owner October 15, 2025 05:37
@dgarros dgarros force-pushed the dga-20251015-export-graphql branch from c30fce5 to 83d9c39 on October 15, 2025 05:38
@github-actions github-actions bot added the group/backend (Issue related to the backend (API Server, Git Agent)) and group/ci (Issue related to the CI pipeline) labels on Oct 15, 2025
coderabbitai bot commented Oct 15, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (3)
  • main
  • stable
  • develop

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request refactors Infrahub's schema generation and CLI infrastructure. A new dev CLI subcommand module is introduced with three asynchronous operations: exporting sorted GraphQL schemas, initializing databases, and loading test data. A new GraphQL schema AST sorting module is implemented to sort definitions, fields, arguments, and other schema elements. Development commands previously in the database CLI module are removed and consolidated into the new dev module. The GraphQL schema API endpoint is extended with an optional sort_schema parameter. The build workflow is updated to use CLI-based schema generation. Tests are added to cover schema sorting functionality and the new sorted schema endpoint parameter. Related build configuration files (schema/package.json) and lint rules are adjusted accordingly.
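
For reference, consuming the new parameter from a client might look like the following sketch; the base URL and the auth header are placeholders rather than values taken from this PR.

# Hypothetical client-side use of the new `sorted` query parameter on /schema.graphql.
# Base URL and the auth header name are placeholders, not values from this PR.
import requests
from graphql import parse

response = requests.get(
    "http://localhost:8000/schema.graphql",
    params={"sorted": "true", "branch": "main"},
    headers={"Authorization": "Token <api-token>"},  # placeholder auth scheme
    timeout=30,
)
response.raise_for_status()
parse(response.text)  # the payload is plain-text SDL, so it should parse cleanly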

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: The PR title "Clean up GraphQL schema export" is directly aligned with the primary objectives outlined in the PR description. The changeset fundamentally refactors the GraphQL schema generation and export mechanism by replacing an invoke-based approach with a pure Python CLI-based implementation, removing external dependencies (Docker, Neo4j, npm packages), and improving schema sorting through AST-based transformation. The title accurately captures this core refactoring work as the main change while remaining concise and specific. A reader scanning the commit history would clearly understand that this PR involves schema export improvements and consolidation.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 87.88%, which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot commented Oct 15, 2025

Walkthrough

This PR introduces a new Infrahub CLI “schema” subcommand with an export_graphql_schema command that generates and writes a deterministically sorted GraphQL schema. A new module sorts GraphQL AST definitions and nested elements; comprehensive unit tests validate sorting behavior and idempotence. CI workflow steps for the graphql-schema job are reduced to only run validation. Schema generation tasks now call the CLI via Poetry and compare schema/schema.graphql in Git. The schema/package.json and its format-graphql dependency are removed. A lint per-file ignore is added for a new test file.
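
As a tiny sketch of the idempotence property the new unit tests exercise (the import path matches backend/infrahub/graphql/schema_sort.py referenced elsewhere in this PR; the sample SDL is made up):

# Sorting twice should be a no-op once the document is already sorted.
from graphql import parse, print_ast

from infrahub.graphql.schema_sort import sort_schema_ast

doc = parse("type B { b: Int a: Int } type A { x: Int }")
once = sort_schema_ast(doc)
twice = sort_schema_ast(once)
assert print_ast(once) == print_ast(twice)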

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: The title succinctly reflects the primary task of cleaning up the GraphQL schema export process, aligning with the removal of external dependencies and the streamlined Python-based implementation described in the PR. It is concise and clear, avoiding unnecessary detail while indicating the main focus of the change. Team members scanning the history will immediately understand that the PR refactors schema export.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 96.30%, which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tasks/schema.py (1)

21-37: Respect the --config-file option

export_graphql_schema accepts config_file, but this task never forwards it. As written we always load infrahub.toml, so running poetry run invoke schema.generate-graphqlschema --config-file alt.toml silently ignores the user-specified configuration. Please pass the option through, e.g.:

-    config.load_and_exit()
+    config.load_and_exit(config_file_name=config_file)

and update the invoke wrapper accordingly if extra arguments are supported.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f13498c and 83d9c39.

⛔ Files ignored due to path filters (1)
  • schema/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (8)
  • .github/workflows/ci.yml (0 hunks)
  • backend/infrahub/cli/__init__.py (2 hunks)
  • backend/infrahub/cli/schema.py (1 hunks)
  • backend/infrahub/graphql/schema_sort.py (1 hunks)
  • backend/tests/unit/graphql/test_schema_sort.py (1 hunks)
  • pyproject.toml (1 hunks)
  • schema/package.json (0 hunks)
  • tasks/schema.py (2 hunks)
💤 Files with no reviewable changes (2)
  • .github/workflows/ci.yml
  • schema/package.json
🧰 Additional context used
📓 Path-based instructions (5)
backend/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run backend tests with pytest or via invoke tasks

Files:

  • backend/infrahub/cli/__init__.py
  • backend/infrahub/graphql/schema_sort.py
  • backend/infrahub/cli/schema.py
  • backend/tests/unit/graphql/test_schema_sort.py
**/*.py

📄 CodeRabbit inference engine (.github/instructions/python-docstring.instructions.md)

**/*.py: Use triple double quotes (""") for all Python docstrings
Write docstrings in Google-style format
Include a brief one-line description at the top of each docstring
Add a detailed description section when additional context is needed
Document function/method parameters under an Args/Parameters section without typing information
Include a Returns section describing the return value
Include a Raises section listing possible exceptions
Provide an Examples section demonstrating usage when helpful

Use ruff and mypy to validate and lint Python files

**/*.py: Use type hints for all Python function parameters and return values
Prefer asynchronous code in Python when feasible
Define asynchronous Python functions with async def
Use await for asynchronous calls in Python
Use Pydantic models instead of standard dataclasses
Use ruff and mypy for linting and type checking

**/*.py: Use type hints for all Python function parameters and return values
Use async/await whenever possible in Python code
Define asynchronous functions with async def
Await asynchronous calls with await
Use Pydantic models instead of standard dataclasses for data modeling
Use triple quotes (""") for all Python docstrings
Write docstrings in Google-style format
Include docstring sections when applicable: one-line summary, optional details, Args (without types), Returns, Raises, Examples
Validate and lint Python with ruff and mypy

Files:

  • backend/infrahub/cli/__init__.py
  • backend/infrahub/graphql/schema_sort.py
  • backend/infrahub/cli/schema.py
  • backend/tests/unit/graphql/test_schema_sort.py
  • tasks/schema.py
{pyproject.toml,poetry.lock}

📄 CodeRabbit inference engine (.github/instructions/tooling.instructions.md)

Use Poetry to manage the Python project and its dependencies

Files:

  • pyproject.toml
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

Use Poetry to manage the Python project and dependencies

Files:

  • pyproject.toml
backend/tests/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Place backend tests in backend/tests/

Files:

  • backend/tests/unit/graphql/test_schema_sort.py
🧬 Code graph analysis (2)
backend/infrahub/cli/schema.py (5)
backend/infrahub/core/schema/__init__.py (1)
  • SchemaRoot (47-99)
backend/infrahub/core/schema/schema_branch.py (2)
  • SchemaBranch (75-2433)
  • process (508-512)
backend/infrahub/graphql/manager.py (1)
  • GraphQLSchemaManager (121-1126)
backend/infrahub/graphql/schema_sort.py (1)
  • sort_schema_ast (97-157)
backend/infrahub/config.py (1)
  • load_and_exit (990-1007)
backend/tests/unit/graphql/test_schema_sort.py (1)
backend/infrahub/graphql/schema_sort.py (1)
  • sort_schema_ast (97-157)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: coverall-report
  • GitHub Check: documentation
  • GitHub Check: validate-generated-documentation

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/tests/unit/graphql/test_schema_sort.py (1)

318-373: Rename test or add actual metadata validation.

The test name and docstring claim to verify preservation of directives and descriptions, but the implementation only validates basic sorting behavior. Either:

  1. Rename to test_sort_schema_ast_basic_structure to match actual behavior, or
  2. Add schema elements with actual directives and descriptions, then verify they're preserved after sorting.

Example schema with metadata:

"""Test @deprecated directive"""
type User @key(fields: "id") {
  "User's email address"
  email: String!
  name: String! @deprecated(reason: "Use fullName instead")
  fullName: String!
}

Then verify the directive and description remain after sorting.
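
A hedged sketch of what that check could look like (the import path mirrors the module under test; the assertion style is only one option):

# Possible shape for option 2: feed a schema that carries real metadata and
# assert it survives sorting.
from graphql import parse, print_ast

from infrahub.graphql.schema_sort import sort_schema_ast


def test_sort_schema_ast_keeps_directives_and_descriptions():
    sdl = '''
    type User @key(fields: "id") {
        "User's email address"
        email: String!
        name: String! @deprecated(reason: "Use fullName instead")
        fullName: String!
    }
    '''
    printed = print_ast(sort_schema_ast(parse(sdl)))
    assert '@key(fields: "id")' in printed
    assert '@deprecated(reason: "Use fullName instead")' in printed
    assert "User's email address" in printed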

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f13498c and 83d9c39.

⛔ Files ignored due to path filters (1)
  • schema/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (8)
  • .github/workflows/ci.yml (0 hunks)
  • backend/infrahub/cli/__init__.py (2 hunks)
  • backend/infrahub/cli/schema.py (1 hunks)
  • backend/infrahub/graphql/schema_sort.py (1 hunks)
  • backend/tests/unit/graphql/test_schema_sort.py (1 hunks)
  • pyproject.toml (1 hunks)
  • schema/package.json (0 hunks)
  • tasks/schema.py (2 hunks)
💤 Files with no reviewable changes (2)
  • schema/package.json
  • .github/workflows/ci.yml
🧰 Additional context used
📓 Path-based instructions (5)
backend/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run backend tests with pytest or via invoke tasks

Files:

  • backend/infrahub/graphql/schema_sort.py
  • backend/tests/unit/graphql/test_schema_sort.py
  • backend/infrahub/cli/schema.py
  • backend/infrahub/cli/__init__.py
**/*.py

📄 CodeRabbit inference engine (.github/instructions/python-docstring.instructions.md)

**/*.py: Use triple double quotes (""") for all Python docstrings
Write docstrings in Google-style format
Include a brief one-line description at the top of each docstring
Add a detailed description section when additional context is needed
Document function/method parameters under an Args/Parameters section without typing information
Include a Returns section describing the return value
Include a Raises section listing possible exceptions
Provide an Examples section demonstrating usage when helpful

Use ruff and mypy to validate and lint Python files

**/*.py: Use type hints for all Python function parameters and return values
Prefer asynchronous code in Python when feasible
Define asynchronous Python functions with async def
Use await for asynchronous calls in Python
Use Pydantic models instead of standard dataclasses
Use ruff and mypy for linting and type checking

**/*.py: Use type hints for all Python function parameters and return values
Use async/await whenever possible in Python code
Define asynchronous functions with async def
Await asynchronous calls with await
Use Pydantic models instead of standard dataclasses for data modeling
Use triple quotes (""") for all Python docstrings
Write docstrings in Google-style format
Include docstring sections when applicable: one-line summary, optional details, Args (without types), Returns, Raises, Examples
Validate and lint Python with ruff and mypy

Files:

  • backend/infrahub/graphql/schema_sort.py
  • backend/tests/unit/graphql/test_schema_sort.py
  • backend/infrahub/cli/schema.py
  • tasks/schema.py
  • backend/infrahub/cli/__init__.py
backend/tests/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Place backend tests in backend/tests/

Files:

  • backend/tests/unit/graphql/test_schema_sort.py
{pyproject.toml,poetry.lock}

📄 CodeRabbit inference engine (.github/instructions/tooling.instructions.md)

Use Poetry to manage the Python project and its dependencies

Files:

  • pyproject.toml
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

Use Poetry to manage the Python project and dependencies

Files:

  • pyproject.toml
🧬 Code graph analysis (2)
backend/tests/unit/graphql/test_schema_sort.py (1)
backend/infrahub/graphql/schema_sort.py (1)
  • sort_schema_ast (97-157)
backend/infrahub/cli/schema.py (5)
backend/infrahub/core/schema/__init__.py (1)
  • SchemaRoot (47-99)
backend/infrahub/core/schema/schema_branch.py (2)
  • SchemaBranch (73-2338)
  • process (500-504)
backend/infrahub/graphql/manager.py (1)
  • GraphQLSchemaManager (121-1126)
backend/infrahub/graphql/schema_sort.py (1)
  • sort_schema_ast (97-157)
backend/infrahub/config.py (1)
  • load_and_exit (990-1007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: coverall-report
  • GitHub Check: documentation
  • GitHub Check: validate-generated-documentation
🔇 Additional comments (5)
backend/tests/unit/graphql/test_schema_sort.py (5)

1-14: LGTM!

Module docstring and imports are well-structured. The imports cover all necessary GraphQL AST types and testing utilities.


16-166: LGTM!

Test fixtures are well-structured with proper docstrings and type hints. The fixtures cover basic, complex, and edge-case scenarios effectively. The schema strings appropriately demonstrate unsorted inputs and expected sorted outputs.


169-287: LGTM!

The test suite comprehensively validates sorting behavior across different schema elements (definitions, fields, arguments, enum values). Each test is focused and has clear assertions. The use of print_ast for string comparison in the basic sorting test is appropriate.


289-315: LGTM!

Edge case tests appropriately cover empty documents and single-definition scenarios. These tests ensure robustness of the sorting function.


376-465: LGTM!

Excellent coverage of idempotence and interface sorting. The idempotence test ensures sorting is stable, and the interface sorting tests validate alphabetical ordering of implemented interfaces across different schema structures.

Copy link
Contributor

@ogenstad ogenstad left a comment


Looks like a good change to me, though there is still some work remaining. A thought: we could take the existing ./schema/schema.graphql file from develop, run it through this function, and check whether the output matches what we get from running infrahub schema export-graphql-schema.
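
A rough sketch of that comparison (file paths and the export location are assumptions; run the CLI export first):

# Rough sketch of the comparison described above; file paths are assumptions.
from pathlib import Path

from graphql import parse, print_ast

from infrahub.graphql.schema_sort import sort_schema_ast

existing = Path("schema/schema.graphql").read_text(encoding="utf-8")  # file from develop
exported = Path("schema.graphql").read_text(encoding="utf-8")         # output of the CLI export
resorted = print_ast(sort_schema_ast(parse(existing)))
print("identical" if resorted.strip() == exported.strip() else "differs")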

        include_mutation=True,
        include_subscription=True,
        include_types=True,
    )

I'd like to start phasing out these arguments to generate. Now that we can cache the schema by hash, we end up with some occasional flaky unit tests, and I don't think we ever use anything other than =True in production; we only use these arguments within the tests.

This could be mitigated if we had a unit test that runs this command, similar to how we test the CTL scripts in the SDK, and then validates that we can correctly read the output GraphQL file after writing it to a temp folder.
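
A hedged sketch of that kind of test; the Typer app import and the command name are assumptions rather than the exact contents of this PR (only the --out option is visible in the diff):

# Hedged sketch of the unit test described above. The app import path and the
# command name are assumptions.
from pathlib import Path

from graphql import build_schema
from typer.testing import CliRunner

from infrahub.cli import app  # hypothetical import of the top-level Typer application


def test_export_graphql_schema_writes_parsable_sdl(tmp_path: Path) -> None:
    out_file = tmp_path / "schema.graphql"
    result = CliRunner().invoke(app, ["schema", "export-graphql-schema", "--out", str(out_file)])
    assert result.exit_code == 0
    # The exported file should read back as a valid GraphQL schema
    build_schema(out_file.read_text(encoding="utf-8"))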

@dgarros dgarros force-pushed the dga-20251015-export-graphql branch from 83d9c39 to a0e273d on October 20, 2025 06:19
codspeed-hq bot commented Oct 20, 2025

CodSpeed Performance Report

Merging #7389 will not alter performance

Comparing dga-20251015-export-graphql (a077aeb) with release-1.5 (199c651)

Summary

✅ 10 untouched

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (6)
backend/tests/unit/api/test_20_graphql.py (1)

235-245: Avoid global config leakage; use monkeypatch and tighten assertions

  • Mutating config.SETTINGS.main.allow_anonymous_access without restoring leaks state across tests. Use monkeypatch to auto-restore.
  • client_headers param is unused; remove it.
  • Optional: assert Content-Type and that the returned schema parses.
-@pytest.mark.parametrize("allow_anonymous_access", [False, True])
-async def test_download_graphql_schema_sorted(
-    db: InfrahubDatabase, client, client_headers, allow_anonymous_access: bool
-):
-    config.SETTINGS.main.allow_anonymous_access = allow_anonymous_access
+@pytest.mark.parametrize("allow_anonymous_access", [False, True])
+async def test_download_graphql_schema_sorted(
+    db: InfrahubDatabase, client, allow_anonymous_access: bool, monkeypatch
+):
+    monkeypatch.setattr(
+        config.SETTINGS.main, "allow_anonymous_access", allow_anonymous_access, raising=False
+    )
@@
     with client:
         response = client.get("/schema.graphql?sorted=true")
-        assert response.text
-        assert response.status_code == 200 if allow_anonymous_access else 401
+        assert response.text
+        # Content-Type from FastAPI/Starlette is usually "text/plain; charset=utf-8"
+        assert response.headers["content-type"].startswith("text/plain")
+        assert response.status_code == (200 if allow_anonymous_access else 401)

If you want to additionally validate the payload is a valid GraphQL document:

# at file top
from graphql import parse

and after the request:

parse(response.text)
tasks/schema.py (1)

32-36: Consider failing on untracked changes and CI hints

git diff --exit-code ignores untracked files. If schema/schema.graphql were ever recreated or missing from the index, this would pass. Consider adding a guard:

-    exec_cmd = "git diff --exit-code schema/schema.graphql"
+    # Ensure status covers modified and untracked schema file
+    exec_cmd = "git status --porcelain -- schema/schema.graphql | tee /dev/stderr | wc -l | xargs -I{} test {} -eq 0"

Alternatively, keep git diff but precede it with:

test -f schema/schema.graphql || { echo 'schema/schema.graphql missing'; exit 1; }
backend/infrahub/cli/dev.py (3)

57-63: Make export robust: create parent dir, write with encoding and trailing newline

If schema/ doesn’t exist (fresh repo, different CWD), write will fail. Ensure directory exists and write deterministically.

-    out.write_text(sorted_schema_str)
+    out.parent.mkdir(parents=True, exist_ok=True)
+    out.write_text(sorted_schema_str + "\n", encoding="utf-8")

82-93: Ensure DB driver is always closed

Wrap close() in a finally to avoid leaking connections on exceptions.

-    dbdriver = await context.init_db(retry=1)
-    async with dbdriver.start_transaction() as db:
-        log.info("Delete All Nodes")
-        await delete_all_nodes(db=db)
-        await first_time_initialization(db=db)
-
-    await dbdriver.close()
+    dbdriver = await context.init_db(retry=1)
+    try:
+        async with dbdriver.start_transaction() as db:
+            log.info("Delete All Nodes")
+            await delete_all_nodes(db=db)
+            await first_time_initialization(db=db)
+    finally:
+        await dbdriver.close()

105-123: Same here: guard driver close and tidy logging

  • Use try/finally to guarantee close.
  • Use logging.DEBUG instead of the string "DEBUG".
  • Remove the no-op logging.getLogger("infrahub") line or set a level/handler.
-    dbdriver = await context.init_db(retry=1)
-
-    async with dbdriver.start_session() as db:
-        await initialization(db=db)
-
-        log_level = "DEBUG"
-
-        FORMAT = "%(message)s"
-        logging.basicConfig(level=log_level, format=FORMAT, datefmt="[%X]", handlers=[RichHandler()])
-        logging.getLogger("infrahub")
-
-        dataset_module = importlib.import_module(f"infrahub.test_data.{dataset}")
-        await dataset_module.load_data(db=db)
-
-    await dbdriver.close()
+    dbdriver = await context.init_db(retry=1)
+    try:
+        async with dbdriver.start_session() as db:
+            await initialization(db=db)
+
+            FORMAT = "%(message)s"
+            logging.basicConfig(level=logging.DEBUG, format=FORMAT, datefmt="[%X]", handlers=[RichHandler()])
+            # Optionally: logging.getLogger("infrahub").setLevel(logging.DEBUG)
+
+            dataset_module = importlib.import_module(f"infrahub.test_data.{dataset}")
+            await dataset_module.load_data(db=db)
+    finally:
+        await dbdriver.close()
backend/infrahub/graphql/api/endpoints.py (1)

31-47: Declare response_class and add a brief docstring

This improves OpenAPI accuracy (text/plain) and aligns with docstring guidelines.

-@router.get("/schema.graphql")
+@router.get("/schema.graphql", response_class=PlainTextResponse)
 async def get_graphql_schema(
     branch: Branch = Depends(get_branch_dep),
     _: AccountSession = Depends(get_current_user),
     sort_schema: bool = Query(default=False, alias="sorted", description="Whether to sort the schema alphabetically."),
 ) -> PlainTextResponse:
+    """Return the GraphQL schema for the selected branch.
+
+    Args:
+        branch: Resolved branch dependency (from ?branch=).
+        _: Current user/session (auth enforced by dependency).
+        sort_schema: When true, return an AST-sorted schema for stable diffs.
+
+    Returns:
+        Plain text GraphQL SDL.
+    """
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83d9c39 and a0e273d.

⛔ Files ignored due to path filters (1)
  • schema/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (11)
  • .github/workflows/ci.yml (0 hunks)
  • backend/infrahub/cli/__init__.py (2 hunks)
  • backend/infrahub/cli/db.py (0 hunks)
  • backend/infrahub/cli/dev.py (1 hunks)
  • backend/infrahub/graphql/api/endpoints.py (2 hunks)
  • backend/infrahub/graphql/schema_sort.py (1 hunks)
  • backend/tests/unit/api/test_20_graphql.py (1 hunks)
  • backend/tests/unit/graphql/test_schema_sort.py (1 hunks)
  • pyproject.toml (1 hunks)
  • schema/package.json (0 hunks)
  • tasks/schema.py (2 hunks)
💤 Files with no reviewable changes (3)
  • .github/workflows/ci.yml
  • backend/infrahub/cli/db.py
  • schema/package.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/infrahub/cli/__init__.py
  • pyproject.toml
🧰 Additional context used
📓 Path-based instructions (3)
backend/tests/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Place backend tests in backend/tests/

Files:

  • backend/tests/unit/api/test_20_graphql.py
  • backend/tests/unit/graphql/test_schema_sort.py
backend/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run backend tests with pytest or via invoke tasks

Files:

  • backend/tests/unit/api/test_20_graphql.py
  • backend/tests/unit/graphql/test_schema_sort.py
  • backend/infrahub/graphql/schema_sort.py
  • backend/infrahub/cli/dev.py
  • backend/infrahub/graphql/api/endpoints.py
**/*.py

📄 CodeRabbit inference engine (.github/instructions/python-docstring.instructions.md)

**/*.py: Use triple double quotes (""") for all Python docstrings
Write docstrings in Google-style format
Include a brief one-line description at the top of each docstring
Add a detailed description section when additional context is needed
Document function/method parameters under an Args/Parameters section without typing information
Include a Returns section describing the return value
Include a Raises section listing possible exceptions
Provide an Examples section demonstrating usage when helpful

Use ruff and mypy to validate and lint Python files

**/*.py: Use type hints for all Python function parameters and return values
Prefer asynchronous code in Python when feasible
Define asynchronous Python functions with async def
Use await for asynchronous calls in Python
Use Pydantic models instead of standard dataclasses
Use ruff and mypy for linting and type checking

**/*.py: Use type hints for all Python function parameters and return values
Use async/await whenever possible in Python code
Define asynchronous functions with async def
Await asynchronous calls with await
Use Pydantic models instead of standard dataclasses for data modeling
Use triple quotes (""") for all Python docstrings
Write docstrings in Google-style format
Include docstring sections when applicable: one-line summary, optional details, Args (without types), Returns, Raises, Examples
Validate and lint Python with ruff and mypy

Files:

  • backend/tests/unit/api/test_20_graphql.py
  • tasks/schema.py
  • backend/tests/unit/graphql/test_schema_sort.py
  • backend/infrahub/graphql/schema_sort.py
  • backend/infrahub/cli/dev.py
  • backend/infrahub/graphql/api/endpoints.py
🧬 Code graph analysis (4)
backend/tests/unit/api/test_20_graphql.py (1)
backend/tests/unit/api/conftest.py (1)
  • client_headers (25-26)
backend/tests/unit/graphql/test_schema_sort.py (1)
backend/infrahub/graphql/schema_sort.py (1)
  • sort_schema_ast (104-170)
backend/infrahub/cli/dev.py (6)
backend/infrahub/config.py (1)
  • load_and_exit (990-1007)
backend/infrahub/core/initialization.py (1)
  • first_time_initialization (511-571)
backend/infrahub/core/utils.py (1)
  • delete_all_nodes (160-168)
backend/infrahub/graphql/manager.py (1)
  • GraphQLSchemaManager (121-1126)
backend/infrahub/graphql/schema_sort.py (1)
  • sort_schema_ast (104-170)
backend/infrahub/cli/context.py (1)
  • CliContext (13-19)
backend/infrahub/graphql/api/endpoints.py (4)
backend/infrahub/graphql/manager.py (2)
  • Query (200-201)
  • get_graphql_schema (138-154)
backend/infrahub/api/dependencies.py (2)
  • get_branch_dep (94-98)
  • get_current_user (101-124)
backend/infrahub/graphql/schema_sort.py (1)
  • sort_schema_ast (104-170)
backend/infrahub/graphql/registry.py (1)
  • get_manager_for_branch (157-170)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: E2E-testing-version-upgrade / From 1.3.0
  • GitHub Check: backend-benchmark
  • GitHub Check: validate-generated-documentation
  • GitHub Check: backend-tests-functional
  • GitHub Check: E2E-testing-playwright
  • GitHub Check: documentation
  • GitHub Check: backend-tests-unit
  • GitHub Check: backend-tests-integration
  • GitHub Check: frontend-tests
  • GitHub Check: backend-docker-integration
🔇 Additional comments (1)
backend/tests/unit/graphql/test_schema_sort.py (1)

1-522: Excellent test coverage and structure.

This test file demonstrates strong engineering practices:

  • Comprehensive coverage of sorting behavior across definitions, fields, arguments, enum values, input fields, and interfaces
  • Edge cases properly tested (empty documents, single definitions, idempotence)
  • Well-organized fixtures that clearly separate test data
  • Descriptive test names that document expected behavior

The tests follow all coding guidelines with proper docstrings, type hints, and pytest patterns.

Comment on lines +21 to +33
def _sort_arguments(args: tuple[InputValueDefinitionNode, ...] | None) -> list[InputValueDefinitionNode] | None:
    """Sort arguments (filters) of a field alphabetically by name.

    Args:
        args: List of input value definition nodes to sort, or None.

    Returns:
        Sorted list of input value definition nodes, or None if input was None.
    """
    if not args:
        return None
    return sorted(args, key=lambda a: a.name.value)


⚠️ Potential issue | 🟠 Major

Use tuples for AST child sequences to match graphql-core node contracts

graphql-core AST dataclasses model children as tuples (immutable). Returning/passing lists works incidentally but risks subtle bugs and type-check noise. Prefer tuples consistently.

-def _sort_arguments(args: tuple[InputValueDefinitionNode, ...] | None) -> list[InputValueDefinitionNode] | None:
+def _sort_arguments(args: tuple[InputValueDefinitionNode, ...] | None) -> tuple[InputValueDefinitionNode, ...] | None:
@@
-    if not args:
-        return None
-    return sorted(args, key=lambda a: a.name.value)
+    if not args:
+        return None
+    return tuple(sorted(args, key=lambda a: a.name.value))
@@
-def _sort_fields(fields: tuple[FieldDefinitionNode, ...] | None) -> list[FieldDefinitionNode] | None:
+def _sort_fields(fields: tuple[FieldDefinitionNode, ...] | None) -> tuple[FieldDefinitionNode, ...] | None:
@@
-    if not fields:
-        return None
-    sorted_fields = []
+    if not fields:
+        return None
+    sorted_fields: list[FieldDefinitionNode] = []
@@
-    return sorted_fields
+    return tuple(sorted_fields)
@@
-def _sort_enum_values(values: tuple[EnumValueDefinitionNode, ...] | None) -> list[EnumValueDefinitionNode] | None:
+def _sort_enum_values(values: tuple[EnumValueDefinitionNode, ...] | None) -> tuple[EnumValueDefinitionNode, ...] | None:
@@
-    if not values:
-        return None
-    return sorted(values, key=lambda v: v.name.value)
+    if not values:
+        return None
+    return tuple(sorted(values, key=lambda v: v.name.value))
@@
-def _sort_input_fields(fields: tuple[InputValueDefinitionNode, ...] | None) -> list[InputValueDefinitionNode] | None:
+def _sort_input_fields(fields: tuple[InputValueDefinitionNode, ...] | None) -> tuple[InputValueDefinitionNode, ...] | None:
@@
-    if not fields:
-        return None
-    return sorted(fields, key=lambda f: f.name.value)
+    if not fields:
+        return None
+    return tuple(sorted(fields, key=lambda f: f.name.value))
@@
-def _sort_interfaces(interfaces: tuple[NamedTypeNode, ...] | None) -> list[NamedTypeNode] | None:
+def _sort_interfaces(interfaces: tuple[NamedTypeNode, ...] | None) -> tuple[NamedTypeNode, ...] | None:
@@
-    if not interfaces:
-        return None
-    return sorted(interfaces, key=lambda i: i.name.value)
+    if not interfaces:
+        return None
+    return tuple(sorted(interfaces, key=lambda i: i.name.value))
@@
-                definition.__class__(
+                definition.__class__(
                     name=definition.name,
-                    interfaces=sorted_interfaces,
+                    interfaces=sorted_interfaces,
                     directives=definition.directives,
-                    fields=sorted_fields,
+                    fields=sorted_fields,
                     description=definition.description,
                     loc=definition.loc,
                 )
             )
@@
-                EnumTypeDefinitionNode(
+                EnumTypeDefinitionNode(
                     name=definition.name,
                     directives=definition.directives,
-                    values=sorted_values,
+                    values=sorted_values,
                     description=definition.description,
                     loc=definition.loc,
                 )
             )
@@
-                InputObjectTypeDefinitionNode(
+                InputObjectTypeDefinitionNode(
                     name=definition.name,
                     directives=definition.directives,
-                    fields=sorted_inputs,
+                    fields=sorted_inputs,
                     description=definition.description,
                     loc=definition.loc,
                 )
             )

This keeps runtime behavior and output identical while matching the AST’s immutability expectations. Based on learnings.

Also applies to: 35-60, 62-74, 76-88, 90-102, 135-143, 148-154, 159-166

🤖 Prompt for AI Agents
In backend/infrahub/graphql/schema_sort.py around lines 21-33 (and similarly for
ranges 35-60, 62-74, 76-88, 90-102, 135-143, 148-154, 159-166), the functions
currently return lists for AST child sequences; change them to return tuples to
match graphql-core immutable AST contracts: when sorting, convert the result of
sorted(...) to a tuple (e.g., tuple(sorted(...))) and update type hints to use
tuple[...] | None where appropriate so callers and type checkers see immutable
sequences.

Comment on lines +318 to +373
def test_sort_schema_ast_preserves_metadata():
    """Test that sorting preserves important metadata like directives and descriptions."""
    # Test with a basic schema to ensure the function doesn't crash
    # and that the structure is preserved
    basic_schema = """
    type User {
        email: String!
        name: String!
        age: Int
    }

    type Post {
        title: String!
        content: String
        author: User!
    }

    enum Status {
        ACTIVE
        INACTIVE
        PENDING
    }

    input CreateUserInput {
        email: String!
        name: String!
        age: Int
    }
    """

    doc = parse(basic_schema)
    result = sort_schema_ast(doc)

    # Check that the function doesn't crash and returns a valid DocumentNode
    assert isinstance(result, DocumentNode)
    assert len(result.definitions) == 4  # User, Post, Status, CreateUserInput

    # Check that all definitions are sorted by name
    definition_names = []
    for definition in result.definitions:
        if hasattr(definition, "name") and definition.name:
            definition_names.append(definition.name.value)

    assert definition_names == sorted(definition_names)

    # Check that fields within types are sorted
    for definition in result.definitions:
        if hasattr(definition, "fields") and definition.fields:
            field_names = [field.name.value for field in definition.fields]
            assert field_names == sorted(field_names)

    # Check that enum values are sorted
    for definition in result.definitions:
        if hasattr(definition, "values") and definition.values:
            enum_values = [value.name.value for value in definition.values]
            assert enum_values == sorted(enum_values)

⚠️ Potential issue | 🟡 Minor

Update docstring to match actual test behavior.

The docstring states that the test verifies preservation of "directives and descriptions," but the test schema contains neither directives nor descriptions—it only validates basic sorting and structure preservation.

Apply this diff to align the docstring with the test implementation:

-def test_sort_schema_ast_preserves_metadata():
-    """Test that sorting preserves important metadata like directives and descriptions."""
+def test_sort_schema_ast_preserves_metadata():
+    """Test that sorting preserves the document structure without data loss."""
🤖 Prompt for AI Agents
backend/tests/unit/graphql/test_schema_sort.py around lines 318 to 373: the
test's docstring incorrectly claims it verifies preservation of "directives and
descriptions" even though the test schema contains neither; update the docstring
to accurately describe what the test actually does (verifies that
sort_schema_ast doesn't crash, returns a DocumentNode, preserves the number of
definitions, and sorts definitions, type fields, and enum values), e.g. replace
the docstring with a concise one stating it tests basic sorting and structure
preservation of the schema AST.

@dgarros dgarros force-pushed the dga-20251015-export-graphql branch from a0e273d to 144e38d on October 26, 2025 18:01
@dgarros dgarros requested review from a team as code owners October 26, 2025 18:01
@dgarros dgarros changed the base branch from develop to release-1.5 October 26, 2025 18:01
@github-actions github-actions bot added the type/documentation (Improvements or additions to documentation) label on Oct 26, 2025
@dgarros dgarros requested a review from ogenstad October 27, 2025 09:49
    config_file: str = typer.Option("infrahub.toml", envvar="INFRAHUB_CONFIG"),
    out: Path = typer.Option("schema.graphql"),  # noqa: B008
) -> None:
    """Export the GraphQL schema to a file."""

For clarity I think we should still say "Export the core GraphQL schema to a file."

@dgarros dgarros merged commit cf2e6ce into release-1.5 Oct 28, 2025
70 of 71 checks passed
@dgarros dgarros deleted the dga-20251015-export-graphql branch October 28, 2025 07:05