Fixes race conditions offset #514
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##           stable     #514   +/-  ##
=======================================
  Coverage   75.74%   75.74%
=======================================
  Files         100      100
  Lines        8846     8846
  Branches     1732     1732
=======================================
  Hits         6700     6700
  Misses       1670     1670
  Partials      476      476
```

Flags with carried forward coverage won't be shown.
Actionable comments posted: 0
🧹 Nitpick comments (7)
changelog/+race-condition.fixed.md (1)
1-1: Clarify the changelog entry and explicitly call out offset=None vs offset=0 behavior.

Make the fix description precise so readers know what changed and why.
Apply this diff:

```diff
-Update offset in process_page() which was causing a race condition in rare case.
+Fix pagination offset handling in filters(): preserve explicit offset values (including 0) and only fall back to computed page_offset when offset is None.
+This prevents rare duplicate/missed pages in parallel pagination and aligns async/sync clients. (PR #514)
```

infrahub_sdk/client.py (6)
822-834: Avoid scheduling all pages in parallel when the caller provides offset/limit.

With an explicit offset or limit, the current batch logic still schedules every page, and after this change each scheduled task will use the same explicit offset. That causes duplicate requests and unnecessary load.
Apply this diff to short-circuit to the single-page path in batch mode:
```diff
 async def process_batch() -> tuple[list[InfrahubNode], list[InfrahubNode]]:
     """Process queries in parallel mode."""
     nodes = []
     related_nodes = []
     batch_process = await self.create_batch()
+    # If the caller provided explicit pagination (offset/limit), use the non-batch path
+    # to avoid scheduling duplicate pages with the same explicit offset.
+    if offset is not None or limit is not None:
+        return await process_non_batch()
     count = await self.count(kind=schema.kind, branch=branch, partial_match=partial_match, **filters)
     total_pages = (count + pagination_size - 1) // pagination_size
```
853-856: Stop after the last page without making an extra empty request.

When count is an exact multiple of the page size, remaining_items becomes 0 on the last page; using < 0 triggers one additional empty fetch.
Apply this diff:

```diff
-        remaining_items = response[schema.kind].get("count", 0) - (page_offset + pagination_size)
-        if remaining_items < 0 or offset is not None or limit is not None:
+        remaining_items = response[schema.kind].get("count", 0) - (page_offset + pagination_size)
+        if remaining_items <= 0 or offset is not None or limit is not None:
             has_remaining_items = False
```
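For intuition, a minimal sketch of the boundary case with hypothetical numbers (count=100, page size 50; not taken from the PR):

```python
# Hypothetical boundary case: count is an exact multiple of the page size.
count = 100          # total items reported by the server
pagination_size = 50
page_offset = 50     # offset of the second (and last) page

remaining_items = count - (page_offset + pagination_size)  # 100 - 100 = 0

# With `remaining_items < 0`: 0 is not < 0, so one more (empty) page is fetched.
# With `remaining_items <= 0`: pagination stops after this page, as intended.
```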
1986-1999: Mirror the async optimization: avoid batch mode when offset/limit is provided.

As in the async path, batch mode with explicit offset/limit schedules redundant pages and re-queries the same offset.
Apply this diff:
```diff
 def process_batch() -> tuple[list[InfrahubNodeSync], list[InfrahubNodeSync]]:
     """Process queries in parallel mode."""
     nodes = []
     related_nodes = []
     batch_process = self.create_batch()
+    # Use the non-batch path if explicit pagination was requested.
+    if offset is not None or limit is not None:
+        return process_non_batch()
     count = self.count(kind=schema.kind, branch=branch, partial_match=partial_match, **filters)
     total_pages = (count + pagination_size - 1) // pagination_size
```
2019-2022: Stop after the last page without an extra request (sync).

Apply this diff:
```diff
-            if remaining_items < 0 or offset is not None or limit is not None:
+            if remaining_items <= 0 or offset is not None or limit is not None:
                 has_remaining_items = False
```
1294-1296: Fix docstring parameter descriptions (copy/paste errors).

The docstrings for the allocate_* methods mislabel tracker/raise_for_error/timeout. A quick cleanup improves maintainability and developer UX.
Apply the following diff to each of the four docstrings; the change is identical in all four: async allocate_next_ip_address (lines 1294-1296), async allocate_next_ip_prefix (lines 1445-1448), sync allocate_next_ip_address (lines 2431-2434), and sync allocate_next_ip_prefix (lines 2579-2582).

```diff
-    timeout (int, optional): Flag to indicate whether to populate the store with the retrieved nodes.
-    tracker (str, optional): The offset for pagination.
-    raise_for_error (bool, optional): The limit for pagination.
+    timeout (int, optional): Overrides default request timeout (seconds).
+    tracker (str, optional): Tracker name to annotate the request.
+    raise_for_error (bool, optional): Raise on non-2xx GraphQL errors. Defaults to True.
```
782-872: Optional: add tests for offset semantics to prevent regressions.

Recommend adding unit tests that assert:
- offset=None paginates across pages
- offset=0 returns the first page without overriding to page_offset
- offset=N with parallel=False returns exactly one page at N
I can scaffold tests if helpful; a minimal sketch follows below.
Also applies to: 1905-2038
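As a starting point, a minimal pytest sketch that exercises the offset-fallback semantics in isolation. Here, resolve_offset is a hypothetical helper mirroring the expression the PR introduces; it is not the SDK API:

```python
# Hypothetical scaffold: tests the offset-fallback semantics in isolation.
import pytest


def resolve_offset(offset: int | None, page_offset: int) -> int:
    """Return the caller's explicit offset when given, else the computed page offset."""
    return page_offset if offset is None else offset


@pytest.mark.parametrize(
    ("offset", "page_offset", "expected"),
    [
        (None, 50, 50),  # offset=None: fall back to the computed page offset
        (0, 50, 0),      # offset=0: explicit zero must not be overridden
        (30, 50, 30),    # offset=N: passed through unchanged
    ],
)
def test_resolve_offset(offset: int | None, page_offset: int, expected: int) -> None:
    assert resolve_offset(offset, page_offset) == expected
```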
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- changelog/+race-condition.fixed.md (1 hunks)
- infrahub_sdk/client.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
When implementing Infrahub checks, subclass InfrahubCheck and override validate(data); do not implement or rely on a check() method
Files:
infrahub_sdk/client.py
infrahub_sdk/client.py
📄 CodeRabbit inference engine (CLAUDE.md)
infrahub_sdk/client.py: Use HTTPX for transport with proxy support (single proxy or HTTP/HTTPS mounts)
Support authentication via API tokens or JWT with automatic refresh
Files:
infrahub_sdk/client.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit-tests (3.12)
- GitHub Check: unit-tests (3.13)
- GitHub Check: integration-tests-latest-infrahub
- GitHub Check: Cloudflare Pages
🔇 Additional comments (2)
infrahub_sdk/client.py (2)
1956-1967: Good fix (sync): preserve explicit offset=0; only fall back to page_offset when offset is None.

Mirrors the async correction and prevents 0 from being overridden.
792-804: No remaining legacy offset fallbacks found – ready to merge.

The search for any remaining `offset or page_offset` patterns returned no matches, confirming that all paging calls now correctly use `page_offset if offset is None else offset` and preserve an intentional offset=0.
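For quick reference, a small illustrative snippet (values hypothetical) of why the legacy truthiness fallback broke offset=0:

```python
offset = 0        # caller explicitly requests the first page
page_offset = 50  # offset computed for the page being scheduled

legacy = offset or page_offset                     # 50: 0 is falsy, silently overridden
fixed = page_offset if offset is None else offset  # 0: explicit value preserved

assert (legacy, fixed) == (50, 0)
```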
Deploying infrahub-sdk-python with Cloudflare Pages

| | |
| --- | --- |
| Latest commit: | 527c0a5 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://d54f3968.infrahub-sdk-python.pages.dev |
| Branch Preview URL: | https://bkr-race-conditions-offset.infrahub-sdk-python.pages.dev |
Actionable comments posted: 0
🧹 Nitpick comments (1)
infrahub_sdk/node/node.py (1)
400-400: Defensive deepcopy of filters eliminates shared-mutation/race issues.

Using deepcopy when seeding data["@filters"] prevents this function from mutating the caller's filters dict (and any nested structures) when order/offset/limit/partial_match are injected below. This directly addresses concurrency issues when the same filters object is reused across tasks/threads.
Optional: add a brief comment to document the intent so future refactors don’t “optimize” this back to a shallow copy.
Apply this small doc comment:
```diff
-    data["@filters"] = deepcopy(filters) if filters is not None else {}
+    # Defensive copy: avoid mutating caller-provided filters when we add paging/sorting keys.
+    # Use deepcopy because nested filters (e.g., boolean expressions) may also be extended later.
+    data["@filters"] = deepcopy(filters) if filters is not None else {}
```
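To make the hazard concrete, a minimal repro of the shared-mutation issue the deepcopy prevents; the filter shape below is hypothetical, and only the copy semantics matter:

```python
from copy import deepcopy

caller_filters = {"name__values": ["spine1", "spine2"]}

# Aliasing the caller's dict: injecting paging keys mutates it in place.
data = {"@filters": caller_filters}
data["@filters"]["offset"] = 50
assert "offset" in caller_filters  # caller's dict was contaminated

# Deep-copying first: injected keys stay local to this query.
caller_filters = {"name__values": ["spine1", "spine2"]}
data = {"@filters": deepcopy(caller_filters)}
data["@filters"]["offset"] = 50
assert "offset" not in caller_filters  # caller's dict is untouched
```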
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- infrahub_sdk/node/node.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
When implementing Infrahub checks, subclass InfrahubCheck and override validate(data); do not implement or rely on a check() method
Files:
infrahub_sdk/node/node.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: unit-tests (3.13)
- GitHub Check: unit-tests (3.11)
- GitHub Check: unit-tests (3.10)
- GitHub Check: unit-tests (3.12)
- GitHub Check: integration-tests-latest-infrahub
- GitHub Check: Cloudflare Pages
🔇 Additional comments (3)
infrahub_sdk/node/node.py (3)
4-4: Importing deepcopy to support defensive copying is a good call.

deepcopy is needed for the new filters handling; copy remains in use at Line 342. The import set is correct and minimal.
402-409: Offset/limit handling preserves explicit 0, aligning with the client-side fix.

The conditional writes ensure an explicit offset=0 is respected (not treated as falsy), and limit is applied only when provided. This matches the PR's change in client.py to pass page_offset only when offset is None. No issues spotted.
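A sketch of the conditional-write pattern described above (not verbatim SDK code; the helper name is illustrative):

```python
def apply_paging(at_filters: dict, offset: int | None, limit: int | None) -> None:
    # `is not None` rather than truthiness, so offset=0 is kept as a real value.
    if offset is not None:
        at_filters["offset"] = offset
    if limit is not None:
        at_filters["limit"] = limit

filters: dict = {}
apply_paging(filters, offset=0, limit=None)
assert filters == {"offset": 0}  # a truthiness check would have dropped the 0
```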
416-418: partial_match flagging is side-effect free with the new deepcopy.

Setting partial_match on the copied filter avoids contaminating any shared filters dict upstream. This is consistent with the race-condition fix.
Summary by CodeRabbit
Bug Fixes
Documentation