fix: respect max_parallel_requests in HTTP connection pool size #460

Merged
nabinchha merged 5 commits into NVIDIA-NeMo:main from
przemekboruta:fix/connection-pool-max-parallel-requests
Mar 31, 2026

Conversation

@przemekboruta
Contributor

@przemekboruta przemekboruta commented Mar 25, 2026

Summary

Fixes #459

  • Pass a pre-configured httpx.HTTPTransport / httpx.AsyncHTTPTransport with the correct limits into RetryTransport instead of letting it create its own pool with httpx defaults (100 connections)
  • Add transport parameter to create_retry_transport and forward it to RetryTransport
  • Remove the now-ineffective limits= argument from httpx.Client / httpx.AsyncClient constructors (httpx silently ignores limits when a custom transport is provided)

Root cause

self._limits was calculated correctly from max_parallel_requests but passed to httpx.Client(limits=...), which silently ignores it when a custom transport= is provided. RetryTransport was then creating its own internal HTTPTransport with httpx defaults (max_connections=100), regardless of the configured value.
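The root cause above can be reproduced with a toy model. The sketch below uses only the standard library; `Limits`, `PoolTransport`, `RetryingTransport`, and `Client` are hypothetical stand-ins for `httpx.Limits`, `httpx.HTTPTransport`, `RetryTransport`, and `httpx.Client`, simplified to the behaviour described in this PR. It is not the real library code.

```python
from dataclasses import dataclass
from typing import Optional, Union

DEFAULT_MAX_CONNECTIONS = 100  # httpx default pool size, as stated in this PR


@dataclass(frozen=True)
class Limits:
    """Hypothetical stand-in for httpx.Limits."""
    max_connections: int = DEFAULT_MAX_CONNECTIONS


class PoolTransport:
    """Hypothetical stand-in for httpx.HTTPTransport: owns the pool."""

    def __init__(self, limits: Limits = Limits()) -> None:
        self.pool_size = limits.max_connections


class RetryingTransport:
    """Hypothetical stand-in for RetryTransport: wraps an inner transport."""

    def __init__(self, transport: Optional[PoolTransport] = None) -> None:
        # With no inner transport supplied, a default-sized one is created.
        self.inner = transport if transport is not None else PoolTransport()


class Client:
    """Hypothetical stand-in for httpx.Client."""

    def __init__(
        self,
        transport: Union[PoolTransport, RetryingTransport, None] = None,
        limits: Limits = Limits(),
    ) -> None:
        # `limits` only takes effect when the client builds its own transport.
        self.transport = transport if transport is not None else PoolTransport(limits)


def effective_pool_size(client: Client) -> int:
    """Return the size of the pool that would actually serve requests."""
    t = client.transport
    return t.inner.pool_size if isinstance(t, RetryingTransport) else t.pool_size


limits = Limits(max_connections=600)

# Pre-fix wiring: limits go to the client, which drops them.
broken = Client(transport=RetryingTransport(), limits=limits)

# Post-fix wiring: limits go to the inner transport that owns the pool.
fixed = Client(transport=RetryingTransport(transport=PoolTransport(limits)))

print(effective_pool_size(broken))  # 100
print(effective_pool_size(fixed))   # 600
```

In this model the configured value only reaches the pool when it is handed to the object that owns the pool, which is the shape of the fix in this PR.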

Test plan

  • test_create_retry_transport_forwards_sync_transport — provided HTTPTransport is stored as _sync_transport
  • test_create_retry_transport_forwards_async_transport — provided AsyncHTTPTransport is stored as _async_transport
  • test_create_retry_transport_no_transport_creates_defaults — default behaviour unchanged when no transport is passed
  • test_sync_client_pool_size_respects_max_parallel_requests — regression: pool has 600 connections for max_parallel_requests=300
  • test_async_client_pool_size_respects_max_parallel_requests — same for async client

🤖 Generated with Claude Code

@greptile-apps
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR correctly fixes the root cause of issue #459: httpx.Client(limits=...) silently ignores limits when a custom transport= is provided, so the connection pool was always sized at httpx's default of 100 regardless of max_parallel_requests.

The fix pre-constructs a properly sized httpx.HTTPTransport(limits=self._limits) (for sync) or httpx.AsyncHTTPTransport(limits=self._limits) (for async) and passes it directly to RetryTransport. Based on RetryTransport.__init__'s behaviour — it stores the incoming transport into _sync_transport or _async_transport via isinstance check, and sets the other to None — both paths are wired correctly. The ClientConcurrencyMode guard in HttpModelClient ensures the None-transport side is never exercised.

Key changes:

  • http_model_client.py: Builds a typed inner transport before calling create_retry_transport, removes the now-redundant limits= kwarg from httpx.Client/httpx.AsyncClient.
  • retry.py: Adds transport parameter to create_retry_transport and forwards it directly to RetryTransport(transport=transport, ...).
  • test_native_http_clients.py: Adds one new unit test that uses the new limits property on HttpModelClient to verify the _limits object is computed correctly from max_parallel_requests.

Note on test coverage: The four regression tests listed in the PR description remain unchecked and unimplemented. The single test added only verifies that self._limits is computed correctly — a calculation that was already correct before this PR. test_retry.py also has no coverage for the new transport= parameter of create_retry_transport. While the fix itself is logically correct, the regression safety net is weak.

Confidence Score: 5/5

Fix is logically correct and safe to merge; no runtime regressions introduced.

The implementation correctly constructs a typed inner transport (HTTPTransport or AsyncHTTPTransport) with the right limits before wrapping it in RetryTransport. RetryTransport's isinstance-based dispatch stores each transport in the right slot (_sync_transport / _async_transport), and the concurrency-mode guard prevents cross-mode usage. No new P0/P1 issues were found. The only gap — that the single new test verifies _limits computation rather than actual pool size — was already raised and discussed in prior review threads, and does not represent a defect in the production code.

test_native_http_clients.py — the transport-wiring regression tests listed in the PR description remain unimplemented.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py | Core fix: constructs typed inner transports with correct limits before creating RetryTransport; adds limits property exposing _limits; removes ineffective limits= kwarg from httpx client constructors. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/retry.py | Adds transport keyword-only parameter to create_retry_transport and forwards it to RetryTransport; uses TYPE_CHECKING guard to avoid a heavy import at runtime; logic is correct. |
| packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py | Adds one test that verifies client.limits.max_connections equals the expected value, but _limits was already computed correctly pre-fix, so this does not regression-guard the actual transport-pool wiring; the four listed regression tests remain unimplemented. |

Sequence Diagram

sequenceDiagram
    participant C as HttpModelClient
    participant IT as httpx.HTTPTransport (inner)
    participant RT as RetryTransport
    participant HC as httpx.Client

    Note over C: _get_sync_client() first call
    C->>IT: HTTPTransport(limits=self._limits)
    Note over IT: _pool._max_connections = 2x max_parallel_requests
    C->>RT: create_retry_transport(config, transport=inner)
    Note over RT: _sync_transport = inner
    C->>HC: httpx.Client(transport=RT, timeout=...)
    HC-->>C: client

    Note over C: _get_async_client() first call
    C->>IT: AsyncHTTPTransport(limits=self._limits)
    Note over IT: _pool._max_connections = 2x max_parallel_requests
    C->>RT: create_retry_transport(config, transport=inner)
    Note over RT: _async_transport = inner
    C->>HC: AsyncClient(transport=RT, timeout=...)
    HC-->>C: async_client

Reviews (5): Last reviewed commit: "Merge branch 'main' into fix/connection-..."

przemekboruta added a commit to przemekboruta/DataDesigner that referenced this pull request Mar 25, 2026
Address Greptile P2 review comments on PR NVIDIA-NeMo#460:
- Add docstring entry for the new `transport` parameter in
  `create_retry_transport` explaining accepted types and None default
- Add inline comments in pool-size regression tests explaining the
  private attribute chain (_sync/_async_transport → _pool → _max_connections)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@przemekboruta przemekboruta requested a review from a team as a code owner March 25, 2026 15:23
Contributor

@nabinchha nabinchha left a comment

Thanks @przemekboruta — great catch, and thanks for opening both the issue and the PR! Please see findings below that need to be addressed before we can merge.

Critical — Must fix before merge

test_retry.py:87-109 — Tests reach into third-party private attributes; import httpx inside function bodies

  • What: The three new tests (test_create_retry_transport_forwards_sync_transport, test_create_retry_transport_forwards_async_transport, test_create_retry_transport_no_transport_creates_defaults) assert on _sync_transport and _async_transport — private attributes of RetryTransport, a third-party class from httpx_retries. They also import httpx inside function bodies, violating the project's module-level import standard.
  • Why: Tests should exercise public interfaces, not private internals of third-party libraries (AGENTS.md: "Tests should exercise public interfaces, not _-prefixed functions or classes"). These tests verify a one-line pass-through (RetryTransport(transport=transport, retry=retry)) — pure plumbing. The forwarding contract is already proven end-to-end by the pool-size regression tests in test_native_http_clients.py.
  • Suggestion: Drop all three tests. They only verify plumbing via private attributes of a third-party library. The pool-size regression tests (once reworked per the next finding) already cover the end-to-end contract.

test_native_http_clients.py:299-337 — Tests rely on private methods and 4-layer private attribute chains

  • What: The regression tests call _get_sync_client() / _get_async_client() (private), then walk _transport._sync_transport._pool._max_connections — four private attributes spanning three libraries (data_designer, httpx_retries, httpcore).
  • Why: Any internal rename in httpx_retries or httpcore silently breaks these tests without indicating a real regression. Tests should exercise public interfaces, not reach into private internals.
  • Suggestion: Three changes:
    1. Expose limits as a public read-only property on HttpModelClient — it's already computed and stored, just behind a _ prefix. This is useful beyond tests (logging, diagnostics).
    2. Test the contract, not the internals. The regression test should verify that HttpModelClient computes the right limits and passes them to the inner transport:
      • client.limits.max_connections == 600 (computation is correct)
      • Transport forwarding is already covered by the RetryTransport(transport=transport, ...) call — no separate assertion needed.
    3. Don't call private init methods from tests. If the test needs to trigger lazy initialization, test through the public completion() / acompletion() methods with a mocked httpx client (the existing lazy-init tests already do this).
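The first suggestion, exposing the already-computed limits through a public read-only property, could look roughly like the sketch below. `PoolLimits` and `ModelClientSketch` are hypothetical stand-ins for `httpx.Limits` and `HttpModelClient`; the sizing rule `max(32, 2 * max_parallel_requests)` is the one quoted in the regression test's docstring later in this thread, and everything else is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolLimits:
    """Hypothetical stand-in for httpx.Limits (only one field modelled)."""
    max_connections: int


class ModelClientSketch:
    """Hypothetical, trimmed-down analogue of HttpModelClient."""

    def __init__(self, max_parallel_requests: int) -> None:
        # Sizing rule taken from the test docstring in this PR.
        self._limits = PoolLimits(
            max_connections=max(32, 2 * max_parallel_requests)
        )

    @property
    def limits(self) -> PoolLimits:
        """Read-only view of the computed pool limits (no setter defined)."""
        return self._limits


client = ModelClientSketch(max_parallel_requests=300)
print(client.limits.max_connections)  # 600

small = ModelClientSketch(max_parallel_requests=4)
print(small.limits.max_connections)  # 32, the floor applies
```

Because the property has no setter, callers can read the configuration for logging or assertions but cannot reassign it.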

What Looks Good

  • Root cause analysis is precise. The three-layer problem (limits computed correctly, passed to wrong place, RetryTransport creates default pool) is correctly identified and the fix targets the right layer.
  • The fix is minimal and surgical. Works with httpx_retries's existing transport parameter rather than fighting the library. Removing the dead limits= kwarg is the right cleanup.
  • The production code changes are clean. Both retry.py and http_model_client.py changes are correct and well-structured.

Verdict

Needs changes — The core fix is correct and well-understood. The test changes need rework: (1) drop the three transport-forwarding tests in test_retry.py that reach into third-party private attributes, and (2) rework the pool-size regression tests in test_native_http_clients.py to assert on a public limits property rather than walking private internals across three libraries.

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

@przemekboruta
Contributor Author

I have read the DCO document and I hereby sign the DCO.

przemekboruta and others added 3 commits March 25, 2026 21:18
Pass a pre-configured HTTPTransport/AsyncHTTPTransport with the correct
limits into RetryTransport instead of letting it create its own pool
with httpx defaults (100 connections). Previously, the limits calculated
from max_parallel_requests were passed to httpx.Client(limits=...) which
silently ignores them when a custom transport is provided.

Fixes NVIDIA-NeMo#459

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>

Address Greptile P2 review comments on PR NVIDIA-NeMo#460:
- Add docstring entry for the new `transport` parameter in
  `create_retry_transport` explaining accepted types and None default
- Add inline comments in pool-size regression tests explaining the
  private attribute chain (_sync/_async_transport → _pool → _max_connections)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>

- Drop three tests from test_retry.py that reached into private
  attributes of third-party RetryTransport (_sync_transport,
  _async_transport); the end-to-end contract is covered by the
  pool-size regression test
- Expose a public `limits` property on HttpModelClient so tests and
  diagnostic code can assert the pool configuration without walking
  private attribute chains across three libraries
- Replace two private-chain pool assertions with a single
  `client.limits.max_connections == 600` check against the new property
- Trim "inner" from the transport docstring entry (nabinchha suggestion)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Przemysław <przemekboruta@interia.pl>
@przemekboruta przemekboruta force-pushed the fix/connection-pool-max-parallel-requests branch from de90ceb to 088e037 Compare March 25, 2026 20:18
@przemekboruta przemekboruta requested a review from nabinchha March 25, 2026 20:19
Comment on lines +299 to +311
def test_client_limits_respect_max_parallel_requests() -> None:
    """Connection pool limits must reflect max_parallel_requests (regression for issue #459).

    pool_max = max(32, 2 * max_parallel_requests) = max(32, 600) = 600
    """
    client = OpenAICompatibleClient(
        provider_name=_OPENAI_PROVIDER,
        endpoint=_OPENAI_ENDPOINT,
        api_key="sk-test",
        max_parallel_requests=300,
        concurrency_mode=ClientConcurrencyMode.SYNC,
    )
    assert client.limits.max_connections == 600
Contributor

P1 Regression test doesn't verify the actual fix

client.limits.max_connections reads self._limits — a value that was computed correctly even before this PR. The bug was that _limits was never forwarded to the underlying httpcore connection pool. This test passes on the old (broken) code just as easily as on the new code, so it provides no regression protection.

To actually verify the fix, the test needs to trigger lazy transport initialization and then inspect the pool's _max_connections. The PR description lists exactly these missing tests (all four checkboxes are unchecked):

  • test_sync_client_pool_size_respects_max_parallel_requests
  • test_async_client_pool_size_respects_max_parallel_requests
  • test_create_retry_transport_forwards_sync_transport
  • test_create_retry_transport_forwards_async_transport

The existing test is fine as a unit-level sanity check for the limits property, but it should not be the sole regression guard for this bug.

Contributor

Good catch — greptile is right that this test would pass on the pre-fix code since _limits was always computed correctly.

However, I think the right path here is a mock-capture approach that verifies the wiring without touching any private API.

The fix works by explicitly constructing httpx.HTTPTransport(limits=<correct limits>) and passing it into RetryTransport — something the old code never did. So we can patch the constructor and assert it received the right limits:

@patch("data_designer.engine.models.clients.adapters.http_model_client.lazy.httpx.HTTPTransport")
@patch(_SYNC_CLIENT_PATCH)
def test_sync_pool_limits_forwarded_to_transport(
    mock_client_cls: MagicMock,
    mock_transport_cls: MagicMock,
) -> None:
    """Regression for #459: limits must reach HTTPTransport, not just httpx.Client."""
    mock_client_cls.return_value = MagicMock(
        post=MagicMock(return_value=mock_httpx_response(_make_openai_chat_response()))
    )
    client = OpenAICompatibleClient(
        provider_name=_OPENAI_PROVIDER,
        endpoint=_OPENAI_ENDPOINT,
        api_key="sk-test",
        max_parallel_requests=300,
        concurrency_mode=ClientConcurrencyMode.SYNC,
    )
    client.completion(_make_chat_request(_OPENAI_MODEL))

    mock_transport_cls.assert_called_once()
    limits = mock_transport_cls.call_args.kwargs["limits"]
    assert limits.max_connections == 600
    assert limits.max_keepalive_connections == 300

This fails on the pre-fix code (assert_called_once() fails because the old code never constructs HTTPTransport; RetryTransport() creates a default one internally) and passes on the fixed code, with zero private attributes.

Same pattern works for async by patching AsyncHTTPTransport and triggering via acompletion(). I'd keep the existing client.limits test as a unit check for the computation and add this as the actual regression guard.
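The mock-capture idea described in this comment can be tried in isolation with the standard library alone. In the sketch below, `FakeTransport`, `transport_lib`, `build_fixed`, and `build_broken` are hypothetical stand-ins (for the transport class, the patched module, and the post-fix and pre-fix wiring respectively); only `unittest.mock.patch.object`, `call_count`, and `call_args.kwargs` are real APIs.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock, patch


class FakeTransport:
    """Hypothetical stand-in for the transport class being patched."""

    def __init__(self, limits=None):
        self.limits = limits


# Hypothetical namespace playing the role of the module under test.
transport_lib = SimpleNamespace(Transport=FakeTransport)


def build_fixed(max_connections: int):
    """Post-fix wiring: constructs the transport explicitly, with limits."""
    return transport_lib.Transport(limits={"max_connections": max_connections})


def build_broken(max_connections: int):
    """Pre-fix wiring: never constructs the transport itself."""
    return None


def capture_constructor_calls(builder) -> MagicMock:
    """Run `builder` with the constructor patched and return the mock."""
    with patch.object(transport_lib, "Transport") as mock_cls:
        builder(600)
    return mock_cls


fixed_mock = capture_constructor_calls(build_fixed)
fixed_mock.assert_called_once()
print(fixed_mock.call_args.kwargs["limits"])  # {'max_connections': 600}

broken_mock = capture_constructor_calls(build_broken)
print(broken_mock.call_count)  # 0, so assert_called_once() would fail here
```

The assertion targets the arguments the constructor received rather than any private attribute, so it distinguishes the two wirings without depending on library internals.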

Contributor

@nabinchha nabinchha left a comment

@przemekboruta thanks for the PR! In the interest of getting the fix in ASAP, let's merge this. I can update the unit tests in a separate PR!

@nabinchha nabinchha merged commit 9711986 into NVIDIA-NeMo:main Mar 31, 2026
47 checks passed
Development

Successfully merging this pull request may close these issues.

max_parallel_requests has no effect on actual HTTP connection pool size

2 participants