Conversation

@BeArchiTek (Contributor) commented Aug 26, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Resolved a rare race condition in pagination by correctly honoring explicit offset values (including 0), falling back to computed offsets only when none is provided.
    • Prevented unintended mutation of caller-provided filter data by ensuring filters are copied before modification, avoiding side effects.
  • Documentation

    • Added a changelog entry describing the pagination and filter-mutation fixes.

@BeArchiTek BeArchiTek requested a review from a team August 26, 2025 11:32
coderabbitai bot commented Aug 26, 2025

Walkthrough

  • Added changelog entry at changelog/+race-condition.fixed.md documenting a race-condition fix related to offset handling.
  • Modified infrahub_sdk/client.py to pass page_offset if offset is None else offset to generate_query_data in both async and sync paths, preserving explicit offsets (including 0).
  • Modified infrahub_sdk/node/node.py to use a deepcopy of filters when building @filters, preventing mutation of caller-provided dicts.
  • No public/exported signatures changed.
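For context, the selection logic at the heart of the fix reduces to the pattern below. This is a minimal standalone sketch; the helper name resolve_offset and the page-size constant are illustrative, not the SDK's actual API.

from typing import Optional

PAGINATION_SIZE = 50  # illustrative page size, not the SDK's default

def resolve_offset(page_number: int, offset: Optional[int]) -> int:
    """Fall back to the computed page offset only when no explicit offset is given."""
    page_offset = (page_number - 1) * PAGINATION_SIZE
    # `offset or page_offset` would silently discard an explicit offset of 0,
    # which is exactly the falsy-zero bug this PR fixes; compare against None instead.
    return page_offset if offset is None else offset

assert resolve_offset(page_number=2, offset=None) == 50  # computed fallback
assert resolve_offset(page_number=2, offset=0) == 0      # explicit 0 preserved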


codecov bot commented Aug 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##           stable     #514   +/-   ##
=======================================
  Coverage   75.74%   75.74%           
=======================================
  Files         100      100           
  Lines        8846     8846           
  Branches     1732     1732           
=======================================
  Hits         6700     6700           
  Misses       1670     1670           
  Partials      476      476           
Flag                 Coverage Δ
integration-tests    34.65% <50.00%> (-0.02%) ⬇️
python-3.10          48.19% <50.00%> (-0.03%) ⬇️
python-3.11          48.19% <50.00%> (-0.03%) ⬇️
python-3.12          48.19% <50.00%> (ø)
python-3.13          48.16% <50.00%> (-0.03%) ⬇️
python-3.9           46.88% <50.00%> (+0.02%) ⬆️
python-filler-3.12   25.09% <50.00%> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines     Coverage Δ
infrahub_sdk/client.py       69.48% <ø> (ø)
infrahub_sdk/node/node.py    75.96% <100.00%> (ø)

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (7)
changelog/+race-condition.fixed.md (1)

1-1: Clarify changelog entry and explicitly call out offset=None vs 0 behavior.

Make the fix description precise so readers know what changed and why.

Apply this diff:

-Update offset in process_page() which was causing a race condition in rare case.
+Fix pagination offset handling in filters(): preserve explicit offset values (including 0) and only fall back to computed page_offset when offset is None.
+This prevents rare duplicate/missed pages in parallel pagination and aligns async/sync clients. (PR #514)
infrahub_sdk/client.py (6)

822-834: Avoid scheduling all pages in parallel when caller provides offset/limit.

With an explicit offset or limit, the current batch logic still schedules every page, and after this change each scheduled task will use the same explicit offset. That causes duplicate requests and unnecessary load.

Apply this diff to short-circuit to the single-page path in batch mode:

 async def process_batch() -> tuple[list[InfrahubNode], list[InfrahubNode]]:
   """Process queries in parallel mode."""
   nodes = []
   related_nodes = []
   batch_process = await self.create_batch()
+  # If the caller provided explicit pagination (offset/limit), use the non-batch path
+  # to avoid scheduling duplicate pages with the same explicit offset.
+  if offset is not None or limit is not None:
+      return await process_non_batch()
   count = await self.count(kind=schema.kind, branch=branch, partial_match=partial_match, **filters)
   total_pages = (count + pagination_size - 1) // pagination_size

853-856: Stop after the last page without making an extra empty request.

When count is an exact multiple of page size, remaining_items becomes 0 on the last page; using < 0 triggers one additional empty fetch.

Apply this diff:

-                remaining_items = response[schema.kind].get("count", 0) - (page_offset + pagination_size)
-                if remaining_items < 0 or offset is not None or limit is not None:
+                remaining_items = response[schema.kind].get("count", 0) - (page_offset + pagination_size)
+                if remaining_items <= 0 or offset is not None or limit is not None:
                     has_remaining_items = False
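
For example, with count = 100 and pagination_size = 50, the last page has page_offset = 50, so remaining_items = 100 - (50 + 50) = 0. Under < 0 the loop schedules a third, empty request; under <= 0 it stops after the second page.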

1986-1999: Mirror the async optimization: avoid batch when offset/limit is provided.

As in the async path, batch-mode with explicit offset/limit schedules redundant pages and re-queries the same offset.

Apply this diff:

 def process_batch() -> tuple[list[InfrahubNodeSync], list[InfrahubNodeSync]]:
   """Process queries in parallel mode."""
   nodes = []
   related_nodes = []
   batch_process = self.create_batch()
+  # Use the non-batch path if explicit pagination was requested.
+  if offset is not None or limit is not None:
+      return process_non_batch()
 
   count = self.count(kind=schema.kind, branch=branch, partial_match=partial_match, **filters)
   total_pages = (count + pagination_size - 1) // pagination_size

2019-2022: Stop after the last page without extra request (sync).

Apply this diff:

-                if remaining_items < 0 or offset is not None or limit is not None:
+                if remaining_items <= 0 or offset is not None or limit is not None:
                     has_remaining_items = False

1294-1296: Fix docstring parameter descriptions (copy/paste errors).

The docstrings for allocate_* methods mislabel tracker/raise_for_error/timeout. Quick cleanup improves maintainability and developer UX.

Apply the following diffs:

Async allocate_next_ip_address (Lines 1294-1296):

-            timeout (int, optional): Flag to indicate whether to populate the store with the retrieved nodes.
-            tracker (str, optional): The offset for pagination.
-            raise_for_error (bool, optional): The limit for pagination.
+            timeout (int, optional): Overrides default request timeout (seconds).
+            tracker (str, optional): Tracker name to annotate the request.
+            raise_for_error (bool, optional): Raise on non-2xx GraphQL errors. Defaults to True.

Async allocate_next_ip_prefix (Lines 1445-1448):

-            timeout (int, optional): Flag to indicate whether to populate the store with the retrieved nodes.
-            tracker (str, optional): The offset for pagination.
-            raise_for_error (bool, optional): The limit for pagination.
+            timeout (int, optional): Overrides default request timeout (seconds).
+            tracker (str, optional): Tracker name to annotate the request.
+            raise_for_error (bool, optional): Raise on non-2xx GraphQL errors. Defaults to True.

Sync allocate_next_ip_address (Lines 2431-2434):

-            timeout (int, optional): Flag to indicate whether to populate the store with the retrieved nodes.
-            tracker (str, optional): The offset for pagination.
-            raise_for_error (bool, optional): The limit for pagination.
+            timeout (int, optional): Overrides default request timeout (seconds).
+            tracker (str, optional): Tracker name to annotate the request.
+            raise_for_error (bool, optional): Raise on non-2xx GraphQL errors. Defaults to True.

Sync allocate_next_ip_prefix (Lines 2579-2582):

-            timeout (int, optional): Flag to indicate whether to populate the store with the retrieved nodes.
-            tracker (str, optional): The offset for pagination.
-            raise_for_error (bool, optional): The limit for pagination.
+            timeout (int, optional): Overrides default request timeout (seconds).
+            tracker (str, optional): Tracker name to annotate the request.
+            raise_for_error (bool, optional): Raise on non-2xx GraphQL errors. Defaults to True.

Also applies to: 1445-1448, 2431-2434, 2579-2582


782-872: Optional: add tests for offset semantics to prevent regressions.

Recommend adding unit tests that assert:

  • offset=None paginates across pages
  • offset=0 returns the first page without overriding to page_offset
  • offset=N with parallel=False returns exactly one page at N

I can scaffold tests if helpful.

Also applies to: 1905-2038
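
As a rough shape for those tests, here is a sketch against the hypothetical resolve_offset helper from the walkthrough above; real tests would exercise the client's filters() through a mocked transport rather than this stand-in.

import pytest

# resolve_offset is the illustrative helper from the earlier sketch, standing in
# for the client-internal offset selection; it is not part of the SDK's public API.

@pytest.mark.parametrize(
    ("page_number", "offset", "expected"),
    [
        (1, None, 0),    # offset=None: first page computed from the page number
        (2, None, 50),   # offset=None: pagination advances page by page
        (1, 0, 0),       # explicit offset=0 is honored, not overridden
        (3, 120, 120),   # an explicit offset wins regardless of the page number
    ],
)
def test_resolve_offset(page_number, offset, expected):
    assert resolve_offset(page_number=page_number, offset=offset) == expected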

🔇 Additional comments (2)
infrahub_sdk/client.py (2)

1956-1967: Good fix (sync): preserve explicit offset=0; only fall back to page_offset when offset is None.

Mirrors the async correction and prevents 0 from being overridden.


792-804: No remaining legacy offset fallbacks found – ready to merge

The search for any remaining offset or page_offset patterns returned no matches, confirming that all paging calls now correctly use

page_offset if offset is None else offset

and preserve intentional offset=0.

@BeArchiTek BeArchiTek closed this Aug 26, 2025
@BeArchiTek BeArchiTek reopened this Aug 26, 2025
@cloudflare-workers-and-pages

Deploying infrahub-sdk-python with Cloudflare Pages

Latest commit: 527c0a5
Status: ✅  Deploy successful!
Preview URL: https://d54f3968.infrahub-sdk-python.pages.dev
Branch Preview URL: https://bkr-race-conditions-offset.infrahub-sdk-python.pages.dev


coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
infrahub_sdk/node/node.py (1)

400-400: Defensive deepcopy of filters eliminates shared-mutation/race issues

Using deepcopy when seeding data["@filters"] prevents this function from mutating the caller’s filters dict (and any nested structures) when order/offset/limit/partial_match are injected below. This directly addresses concurrency issues when the same filters object is reused across tasks/threads.

Optional: add a brief comment to document the intent so future refactors don’t “optimize” this back to a shallow copy.

Apply this small doc comment:

-        data["@filters"] = deepcopy(filters) if filters is not None else {}
+        # Defensive copy: avoid mutating caller-provided filters when we add paging/sorting keys.
+        # Use deepcopy because nested filters (e.g., boolean expressions) may also be extended later.
+        data["@filters"] = deepcopy(filters) if filters is not None else {}
🔇 Additional comments (3)
infrahub_sdk/node/node.py (3)

4-4: Importing deepcopy to support defensive copying — good call

deepcopy is needed for the new filters handling; copy remains used at Line 342. Import set is correct and minimal.


402-409: Offset/limit handling preserves explicit 0 — aligns with client-side fix

The conditional writes ensure explicit offset=0 is respected (not treated as falsy), and limit is applied only when provided. This matches the PR’s change in client.py to pass page_offset only when offset is None. No issues spotted.
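
The distinction matters because 0 is falsy in Python. A minimal illustration with a hypothetical dict, not the node code itself:

data = {"@filters": {}}
offset = 0

if offset:  # buggy: a truthiness check skips the write for an explicit 0
    data["@filters"]["offset"] = offset
assert "offset" not in data["@filters"]

if offset is not None:  # correct: 0 is written; only None is skipped
    data["@filters"]["offset"] = offset
assert data["@filters"]["offset"] == 0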


416-418: partial_match flagging is side-effect free with the new deepcopy

Setting partial_match on the copied filter avoids contaminating any shared filters dict upstream. This is consistent with the race-condition fix.

@BeArchiTek BeArchiTek merged commit 23a55e2 into stable Aug 26, 2025
38 of 48 checks passed
@BeArchiTek BeArchiTek deleted the bkr-race-conditions-offset branch August 26, 2025 13:33