fix: respect max_parallel_requests in HTTP connection pool size by przemekboruta · Pull Request #460 · NVIDIA-NeMo/DataDesigner

przemekboruta · 2026-03-25T15:11:00Z

Summary

Fixes #459

Pass a pre-configured httpx.HTTPTransport / httpx.AsyncHTTPTransport with the correct limits into RetryTransport instead of letting it create its own pool with httpx defaults (100 connections)
Add transport parameter to create_retry_transport and forward it to RetryTransport
Remove the now-ineffective limits= argument from httpx.Client / httpx.AsyncClient constructors (httpx silently ignores limits when a custom transport is provided)

Root cause

self._limits was calculated correctly from max_parallel_requests but passed to httpx.Client(limits=...), which silently ignores it when a custom transport= is provided. RetryTransport was then creating its own internal HTTPTransport with httpx defaults (max_connections=100), regardless of the configured value.
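
The failure mode can be illustrated with a simplified, stdlib-only model. The classes below (`Limits`, `Transport`, `RetryWrapper`, `Client`) are hypothetical stand-ins, not the real httpx or httpx_retries classes. They only mimic the behaviour described above, where `limits` takes effect solely when the client builds its own transport.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Stand-in for a pool-limits object; 100 mirrors the default quoted above."""

    max_connections: int = 100


class Transport:
    """Stand-in for an HTTP transport that owns a connection pool."""

    def __init__(self, limits: Limits = Limits()) -> None:
        self.pool_size = limits.max_connections


class RetryWrapper:
    """Stand-in for a retrying transport that wraps an inner transport."""

    def __init__(self, transport: Transport | None = None) -> None:
        # Without an explicit inner transport, a default-sized pool is created.
        self.inner = transport if transport is not None else Transport()


class Client:
    """Stand-in for a client: `limits` is used only if no transport is given."""

    def __init__(self, transport: RetryWrapper | None = None, limits: Limits = Limits()) -> None:
        self.transport = transport if transport is not None else RetryWrapper(Transport(limits))


configured = Limits(max_connections=600)

# Pre-fix wiring: limits are handed to the client, which ignores them.
broken = Client(transport=RetryWrapper(), limits=configured)

# Post-fix wiring: limits are handed to the inner transport instead.
fixed = Client(transport=RetryWrapper(Transport(configured)))

print(broken.transport.inner.pool_size, fixed.transport.inner.pool_size)  # prints: 100 600
```

In this model the only difference between the two wirings is where the limits object is passed, which is the same one-step relocation the PR makes.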

Test plan

test_create_retry_transport_forwards_sync_transport — provided HTTPTransport is stored as _sync_transport
test_create_retry_transport_forwards_async_transport — provided AsyncHTTPTransport is stored as _async_transport
test_create_retry_transport_no_transport_creates_defaults — default behaviour unchanged when no transport is passed
test_sync_client_pool_size_respects_max_parallel_requests — regression: pool has 600 connections for max_parallel_requests=300
test_async_client_pool_size_respects_max_parallel_requests — same for async client

🤖 Generated with Claude Code

greptile-apps · 2026-03-25T15:18:57Z

Greptile Summary

This PR correctly fixes the root cause of issue #459: httpx.Client(limits=...) silently ignores limits when a custom transport= is provided, so the connection pool was always sized at httpx's default of 100 regardless of max_parallel_requests.

The fix pre-constructs a properly sized httpx.HTTPTransport(limits=self._limits) (for sync) or httpx.AsyncHTTPTransport(limits=self._limits) (for async) and passes it directly to RetryTransport. Based on RetryTransport.__init__'s behaviour — it stores the incoming transport into _sync_transport or _async_transport via isinstance check, and sets the other to None — both paths are wired correctly. The ClientConcurrencyMode guard in HttpModelClient ensures the None-transport side is never exercised.
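
The slot selection described above can be sketched as follows. This is a hypothetical model written from the summary's description, not the httpx_retries source; the class and attribute names are illustrative, and the fallback to default transports when nothing is passed is an assumption based on the "creates defaults" test named in the PR description.

```python
from __future__ import annotations


class SyncTransport:
    """Hypothetical stand-in for a synchronous transport."""


class AsyncTransport:
    """Hypothetical stand-in for an asynchronous transport."""


class RetryWrapper:
    """Hypothetical model: store the transport in the slot matching its type."""

    def __init__(self, transport: SyncTransport | AsyncTransport | None = None) -> None:
        if transport is None:
            # Assumed default path: build one transport of each kind.
            self.sync_transport = SyncTransport()
            self.async_transport = AsyncTransport()
        elif isinstance(transport, SyncTransport):
            self.sync_transport, self.async_transport = transport, None
        else:
            self.sync_transport, self.async_transport = None, transport


inner = SyncTransport()
wrapper = RetryWrapper(inner)
# The sync slot holds the supplied transport; the async slot is left empty.
print(wrapper.sync_transport is inner, wrapper.async_transport is None)  # prints: True True
```

Because one slot is left empty, a guard such as the concurrency-mode check mentioned above is what keeps callers from reaching the unused side.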

Key changes:

http_model_client.py: Builds a typed inner transport before calling create_retry_transport, adds a public limits property to HttpModelClient, and removes the now-redundant limits= kwarg from httpx.Client/httpx.AsyncClient.
retry.py: Adds transport parameter to create_retry_transport and forwards it directly to RetryTransport(transport=transport, ...).
test_native_http_clients.py: Adds one new unit test verifying that the _limits object is computed correctly from max_parallel_requests.

Note on test coverage: The four regression tests listed in the PR description remain unchecked and unimplemented. The single test added only verifies that self._limits is computed correctly — a calculation that was already correct before this PR. test_retry.py also has no coverage for the new transport= parameter of create_retry_transport. While the fix itself is logically correct, the regression safety net is weak.

Confidence Score: 5/5

Fix is logically correct and safe to merge; no runtime regressions introduced.

The implementation correctly constructs a typed inner transport (HTTPTransport or AsyncHTTPTransport) with the right limits before wrapping it in RetryTransport. RetryTransport's isinstance-based dispatch stores each transport in the right slot (_sync_transport / _async_transport), and the concurrency-mode guard prevents cross-mode usage. No new P0/P1 issues were found. The only gap — that the single new test verifies _limits computation rather than actual pool size — was already raised and discussed in prior review threads, and does not represent a defect in the production code.

test_native_http_clients.py — the transport-wiring regression tests listed in the PR description remain unimplemented.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py	Core fix: constructs typed inner transports with correct limits before creating RetryTransport; adds `limits` property exposing `_limits`; removes ineffective `limits=` kwarg from httpx client constructors.
packages/data-designer-engine/src/data_designer/engine/models/clients/retry.py	Adds `transport` keyword-only parameter to `create_retry_transport` and forwards it to `RetryTransport`; uses `TYPE_CHECKING` guard to avoid a heavy import at runtime; logic is correct.
packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py	Adds one test that verifies `client.limits.max_connections` equals the expected value — but `_limits` was already computed correctly pre-fix, so this does not regression-guard the actual transport-pool wiring; the four listed regression tests remain unimplemented.
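
The retry.py change summarised in the table (a keyword-only `transport` parameter plus a `TYPE_CHECKING` guard) might look roughly like the sketch below. The signature and body are assumptions for illustration, not the repository's code: `FakeRetryTransport` is a hypothetical stand-in so the sketch runs without httpx or httpx_retries installed, and passing `config` straight through as `retry` is a simplification.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Needed for annotations only; this import never runs at runtime.
    import httpx


class FakeRetryTransport:
    """Hypothetical stand-in for the retry transport class."""

    def __init__(self, transport: Any = None, retry: Any = None) -> None:
        self.transport = transport
        self.retry = retry


def create_retry_transport(
    config: Any,
    *,
    transport: httpx.BaseTransport | httpx.AsyncBaseTransport | None = None,
) -> FakeRetryTransport:
    """Forward an optional pre-configured transport; None keeps the default path."""
    return FakeRetryTransport(transport=transport, retry=config)


sentinel = object()
wrapped = create_retry_transport({"max_retries": 3}, transport=sentinel)
print(wrapped.transport is sentinel)  # prints: True
```

Making `transport` keyword-only keeps existing positional call sites working unchanged, which matches the "default behaviour unchanged" item in the test plan.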

Sequence Diagram

sequenceDiagram
    participant C as HttpModelClient
    participant IT as httpx.HTTPTransport (inner)
    participant RT as RetryTransport
    participant HC as httpx.Client

    Note over C: _get_sync_client() first call
    C->>IT: HTTPTransport(limits=self._limits)
    Note over IT: _pool._max_connections = 2x max_parallel_requests
    C->>RT: create_retry_transport(config, transport=inner)
    Note over RT: _sync_transport = inner
    C->>HC: httpx.Client(transport=RT, timeout=...)
    HC-->>C: client

    Note over C: _get_async_client() first call
    C->>IT: AsyncHTTPTransport(limits=self._limits)
    Note over IT: _pool._max_connections = 2x max_parallel_requests
    C->>RT: create_retry_transport(config, transport=inner)
    Note over RT: _async_transport = inner
    C->>HC: AsyncClient(transport=RT, timeout=...)
    HC-->>C: async_client

Reviews (5): Last reviewed commit: "Merge branch 'main' into fix/connection-..."

packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py

Address Greptile P2 review comments on PR NVIDIA-NeMo#460:
- Add docstring entry for the new `transport` parameter in `create_retry_transport` explaining accepted types and None default
- Add inline comments in pool-size regression tests explaining the private attribute chain (_sync/_async_transport → _pool → _max_connections)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

nabinchha

Thanks @przemekboruta — great catch, and thanks for opening both the issue and the PR! Please see findings below that need to be addressed before we can merge.

Critical — Must fix before merge

test_retry.py:87-109 — Tests reach into third-party private attributes; import httpx inside function bodies

What: The three new tests (test_create_retry_transport_forwards_sync_transport, test_create_retry_transport_forwards_async_transport, test_create_retry_transport_no_transport_creates_defaults) assert on _sync_transport and _async_transport — private attributes of RetryTransport, a third-party class from httpx_retries. They also import httpx inside function bodies, violating the project's module-level import standard.
Why: Tests should exercise public interfaces, not private internals of third-party libraries (AGENTS.md: "Tests should exercise public interfaces, not _-prefixed functions or classes"). These tests verify a one-line pass-through (RetryTransport(transport=transport, retry=retry)) — pure plumbing. The forwarding contract is already proven end-to-end by the pool-size regression tests in test_native_http_clients.py.
Suggestion: Drop all three tests. They only verify plumbing via private attributes of a third-party library. The pool-size regression tests (once reworked per the next finding) already cover the end-to-end contract.

test_native_http_clients.py:299-337 — Tests rely on private methods and 4-layer private attribute chains

What: The regression tests call _get_sync_client() / _get_async_client() (private), then walk _transport._sync_transport._pool._max_connections — four private attributes spanning three libraries (data_designer, httpx_retries, httpcore).
Why: Any internal rename in httpx_retries or httpcore silently breaks these tests without indicating a real regression. Tests should exercise public interfaces, not reach into private internals.
Suggestion: Three changes:
1. Expose limits as a public read-only property on HttpModelClient — it's already computed and stored, just behind a _ prefix. This is useful beyond tests (logging, diagnostics).
2. Test the contract, not the internals. The regression test should verify that HttpModelClient computes the right limits and passes them to the inner transport:
  - client.limits.max_connections == 600 (computation is correct)
  - Transport forwarding is already covered by the RetryTransport(transport=transport, ...) call — no separate assertion needed.
3. Don't call private init methods from tests. If the test needs to trigger lazy initialization, test through the public completion() / acompletion() methods with a mocked httpx client (the existing lazy-init tests already do this).
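
The first suggestion, a public read-only property over an already-stored value, can be sketched like this. `PoolLimits` and `ModelClient` are hypothetical names used only for illustration; the real class is HttpModelClient and its limits object comes from httpx.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolLimits:
    """Hypothetical stand-in for the limits object held by the client."""

    max_connections: int


class ModelClient:
    """Hypothetical client that exposes its stored limits read-only."""

    def __init__(self, limits: PoolLimits) -> None:
        self._limits = limits

    @property
    def limits(self) -> PoolLimits:
        # No setter is defined, so callers can read but not replace the value.
        return self._limits


client = ModelClient(PoolLimits(max_connections=600))
print(client.limits.max_connections)  # prints: 600
```

A test can then assert on `client.limits` without touching any underscore-prefixed attribute, and the same property is available for logging or diagnostics.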

What Looks Good

Root cause analysis is precise. The three-layer problem (limits computed correctly, passed to wrong place, RetryTransport creates default pool) is correctly identified and the fix targets the right layer.
The fix is minimal and surgical. Works with httpx_retries's existing transport parameter rather than fighting the library. Removing the dead limits= kwarg is the right cleanup.
The production code changes are clean. Both retry.py and http_model_client.py changes are correct and well-structured.

Verdict

Needs changes — The core fix is correct and well-understood. The test changes need rework: (1) drop the three transport-forwarding tests in test_retry.py that reach into third-party private attributes, and (2) rework the pool-size regression tests in test_native_http_clients.py to assert on a public limits property rather than walking private internals across three libraries.

packages/data-designer-engine/src/data_designer/engine/models/clients/retry.py

github-actions · 2026-03-25T20:14:40Z

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

przemekboruta · 2026-03-25T20:15:54Z

I have read the DCO document and I hereby sign the DCO.

Pass a pre-configured HTTPTransport/AsyncHTTPTransport with the correct limits into RetryTransport instead of letting it create its own pool with httpx defaults (100 connections). Previously, the limits calculated from max_parallel_requests were passed to httpx.Client(limits=...) which silently ignores them when a custom transport is provided.
Fixes NVIDIA-NeMo#459
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>

Address Greptile P2 review comments on PR NVIDIA-NeMo#460:
- Add docstring entry for the new `transport` parameter in `create_retry_transport` explaining accepted types and None default
- Add inline comments in pool-size regression tests explaining the private attribute chain (_sync/_async_transport → _pool → _max_connections)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>

- Drop three tests from test_retry.py that reached into private attributes of third-party RetryTransport (_sync_transport, _async_transport); the end-to-end contract is covered by the pool-size regression test
- Expose a public `limits` property on HttpModelClient so tests and diagnostic code can assert the pool configuration without walking private attribute chains across three libraries
- Replace two private-chain pool assertions with a single `client.limits.max_connections == 600` check against the new property
- Trim "inner" from the transport docstring entry (nabinchha suggestion)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>

greptile-apps · 2026-03-25T20:23:25Z

packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py

+def test_client_limits_respect_max_parallel_requests() -> None:
+    """Connection pool limits must reflect max_parallel_requests (regression for issue #459).
+
+    pool_max = max(32, 2 * max_parallel_requests) = max(32, 600) = 600
+    """
+    client = OpenAICompatibleClient(
+        provider_name=_OPENAI_PROVIDER,
+        endpoint=_OPENAI_ENDPOINT,
+        api_key="sk-test",
+        max_parallel_requests=300,
+        concurrency_mode=ClientConcurrencyMode.SYNC,
+    )
+    assert client.limits.max_connections == 600


Regression test doesn't verify the actual fix

client.limits.max_connections reads self._limits — a value that was computed correctly even before this PR. The bug was that _limits was never forwarded to the underlying httpcore connection pool. This test passes on the old (broken) code just as easily as on the new code, so it provides no regression protection.

To actually verify the fix, the test needs to trigger lazy transport initialization and then inspect the pool's _max_connections. The PR description lists exactly these missing tests (all four checkboxes are unchecked):

test_sync_client_pool_size_respects_max_parallel_requests

test_async_client_pool_size_respects_max_parallel_requests

test_create_retry_transport_forwards_sync_transport

test_create_retry_transport_forwards_async_transport

The existing test is fine as a unit-level sanity check for the limits property, but it should not be the sole regression guard for this bug.
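
For reference, the sizing rule quoted in the test docstring, `max(32, 2 * max_parallel_requests)`, can be checked in isolation. The helper name `pool_max_connections` is hypothetical; only the formula itself comes from the docstring.

```python
def pool_max_connections(max_parallel_requests: int) -> int:
    """Pool size per the test docstring: twice the parallelism, floored at 32."""
    return max(32, 2 * max_parallel_requests)


for n in (4, 16, 300):
    print(n, pool_max_connections(n))
# prints:
# 4 32
# 16 32
# 300 600
```

The floor of 32 only matters for small values; at `max_parallel_requests=300` the doubling term gives the 600 asserted in the test.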

Good catch — greptile is right that this test would pass on the pre-fix code since _limits was always computed correctly.

However, I think the right path here is a mock-capture approach that verifies the wiring without touching any private API.

The fix works by explicitly constructing httpx.HTTPTransport(limits=<correct limits>) and passing it into RetryTransport — something the old code never did. So we can patch the constructor and assert it received the right limits:

@patch("data_designer.engine.models.clients.adapters.http_model_client.lazy.httpx.HTTPTransport")
@patch(_SYNC_CLIENT_PATCH)
def test_sync_pool_limits_forwarded_to_transport(
    mock_client_cls: MagicMock,
    mock_transport_cls: MagicMock,
) -> None:
    """Regression for #459: limits must reach HTTPTransport, not just httpx.Client."""
    mock_client_cls.return_value = MagicMock(
        post=MagicMock(return_value=mock_httpx_response(_make_openai_chat_response()))
    )
    client = OpenAICompatibleClient(
        provider_name=_OPENAI_PROVIDER,
        endpoint=_OPENAI_ENDPOINT,
        api_key="sk-test",
        max_parallel_requests=300,
        concurrency_mode=ClientConcurrencyMode.SYNC,
    )
    client.completion(_make_chat_request(_OPENAI_MODEL))

    mock_transport_cls.assert_called_once()
    limits = mock_transport_cls.call_args.kwargs["limits"]
    assert limits.max_connections == 600
    assert limits.max_keepalive_connections == 300

This fails on the pre-fix code (assert_called_once() fails because the old code never constructs HTTPTransport — RetryTransport() creates a default one internally) and passes on the fixed code, with zero private attributes.

Same pattern works for async by patching AsyncHTTPTransport and triggering via acompletion(). I'd keep the existing client.limits test as a unit check for the computation and add this as the actual regression guard.
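
The mock-capture idea can be tried out without the project's test helpers. The sketch below is a stdlib-only model under stated assumptions: `factory`, `Transport`, `Wrapper`, `build_old`, and `build_new` are hypothetical stand-ins for the module namespace, the inner transport, the retry wrapper, and the pre-fix and post-fix wiring.

```python
from types import SimpleNamespace
from unittest.mock import patch


class Transport:
    """Hypothetical stand-in for the inner HTTP transport."""

    def __init__(self, limits=None):
        self.limits = limits


class Wrapper:
    """Hypothetical stand-in for the retry wrapper."""

    def __init__(self, transport=None):
        # The default path does not go through factory.Transport.
        self.inner = transport if transport is not None else object()


# Namespace that plays the role of the module whose attribute gets patched.
factory = SimpleNamespace(Transport=Transport)


def build_old(limits):
    """Pre-fix wiring: limits never reach a transport constructor."""
    return Wrapper()


def build_new(limits):
    """Post-fix wiring: limits are passed to an explicitly built transport."""
    return Wrapper(factory.Transport(limits=limits))


def limits_reach_transport(build) -> bool:
    limits = SimpleNamespace(max_connections=600)
    with patch.object(factory, "Transport") as mock_cls:
        build(limits)
    try:
        mock_cls.assert_called_once()
    except AssertionError:
        return False
    return mock_cls.call_args.kwargs["limits"].max_connections == 600


print(limits_reach_transport(build_old), limits_reach_transport(build_new))  # prints: False True
```

The check distinguishes the two wirings purely through the patched constructor's recorded call, so no private attribute of any library is read.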

nabinchha

@przemekboruta thanks for the PR! In the interest of getting the fix merged ASAP, let's merge this. I can update the unit tests in a separate PR!

greptile-apps bot reviewed Mar 25, 2026

przemekboruta requested a review from a team as a code owner March 25, 2026 15:23

nabinchha requested changes Mar 25, 2026

nabinchha reviewed Mar 25, 2026

przemekboruta and others added 3 commits March 25, 2026 21:18

przemekboruta force-pushed the fix/connection-pool-max-parallel-requests branch from de90ceb to 088e037 Compare March 25, 2026 20:18

przemekboruta requested a review from nabinchha March 25, 2026 20:19

greptile-apps bot reviewed Mar 25, 2026

nabinchha requested changes Mar 30, 2026

Merge branch 'main' into fix/connection-pool-max-parallel-requests

84bfb45

nabinchha approved these changes Mar 31, 2026

Merge branch 'main' into fix/connection-pool-max-parallel-requests

3d83c34

nabinchha merged commit 9711986 into NVIDIA-NeMo:main Mar 31, 2026
47 checks passed

nabinchha mentioned this pull request Mar 31, 2026

test: add transport-wiring regression tests for #459 #485

Merged

nabinchha merged 5 commits into NVIDIA-NeMo:main from przemekboruta:fix/connection-pool-max-parallel-requests
