Conversation
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the test coverage and robustness of the HTTP client and worker components. The changes introduce a wide array of new unit and integration tests, ensuring that various edge cases, error conditions, and lifecycle events are handled correctly. This refactoring effort aims to improve the reliability and maintainability of the client's asynchronous operations and connection management. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request significantly improves the test coverage for the HTTP client and its underlying components. The new tests are very thorough, covering a wide range of functionality, edge cases, and error conditions for the ZMQ transport, HTTP protocol parsing, connection pooling, and worker lifecycle. This greatly increases confidence in the robustness of the client. I have one suggestion to improve the maintainability of a unit test, but overall this is an excellent contribution.
| async def test_is_stale_various_conditions(self, pool): | ||
| """is_stale() returns correct results for different connection states.""" | ||
| import time as time_mod | ||
|
|
||
| conn = await pool.acquire() | ||
| pool.release(conn) | ||
|
|
||
| # 1. Recently used — fast path returns False | ||
| assert not conn.is_stale() | ||
|
|
||
| # 2. Age past 1s, healthy connection — not stale | ||
| conn.last_used = time_mod.monotonic() - 2.0 | ||
| assert not conn.is_stale() | ||
|
|
||
| # 3. Transport is None — stale | ||
| saved_transport = conn.transport | ||
| conn.transport = None # type: ignore[assignment] | ||
| conn.last_used = time_mod.monotonic() - 2.0 | ||
| assert conn.is_stale() | ||
| conn.transport = saved_transport # restore | ||
|
|
||
| # 4. get_extra_info("socket") returns None — stale | ||
| conn.transport = saved_transport # ensure transport is restored | ||
| orig_get_extra = conn.transport.get_extra_info | ||
| conn.transport.get_extra_info = lambda name, default=None: None # type: ignore[assignment] | ||
| conn.last_used = time_mod.monotonic() - 2.0 | ||
| assert conn.is_stale() | ||
| conn.transport.get_extra_info = orig_get_extra # type: ignore[assignment] | ||
|
|
||
| # 5. fd < 0 — stale | ||
| from unittest.mock import MagicMock | ||
|
|
||
| mock_sock = MagicMock() | ||
| mock_sock.fileno.return_value = -1 | ||
| conn.transport = saved_transport | ||
| conn.transport.get_extra_info = ( | ||
| lambda name, default=None: mock_sock if name == "socket" else default | ||
| ) # type: ignore[assignment] | ||
| conn.last_used = time_mod.monotonic() - 2.0 | ||
| assert conn.is_stale() | ||
|
|
||
| # 6. OSError from fileno/select — stale | ||
| mock_sock2 = MagicMock() | ||
| mock_sock2.fileno.side_effect = OSError("bad fd") | ||
| conn.transport.get_extra_info = ( | ||
| lambda name, default=None: mock_sock2 if name == "socket" else default | ||
| ) # type: ignore[assignment] | ||
| conn.last_used = time_mod.monotonic() - 2.0 | ||
| assert conn.is_stale() | ||
|
|
||
| conn.transport.get_extra_info = orig_get_extra # type: ignore[assignment] | ||
|
|
There was a problem hiding this comment.
This test is very thorough in covering the different conditions for a stale connection. However, the implementation relies on manually saving and restoring mocked methods (get_extra_info) and object attributes (transport). This can be fragile, as an assertion failure could leave the object in a modified state, potentially affecting subsequent tests if fixtures were shared.
A more robust and readable approach would be to use unittest.mock.patch.object as a context manager for each condition being tested. This ensures that mocks are automatically cleaned up, even if assertions fail.
Here's an example of how case #4 could be refactored:
from unittest.mock import patch
# ... inside the test ...
# 4. get_extra_info("socket") returns None — stale
conn.last_used = time_mod.monotonic() - 2.0
with patch.object(conn.transport, 'get_extra_info', return_value=None):
assert conn.is_stale()
# Original method is restored automaticallyApplying this pattern to all mock-based checks in this test would improve its maintainability.
There was a problem hiding this comment.
Pull request overview
This PR expands the test suite around the http-client stack (HTTP parsing/pooling, ZMQ transport, worker lifecycle, and endpoint client initialization/shutdown) to move toward full coverage.
Changes:
- Add unit tests for
HttpResponseProtocolandConnectionPooledge/error paths. - Add extensive unit tests for ZMQ transport sender/receiver close and error-handling branches.
- Add/extend integration tests for worker behaviors (config variants, SSE boundaries, shutdown races) and http-client initialization/shutdown.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
tests/unit/endpoint_client/test_http.py |
Adds unit/integration tests covering HTTP protocol parsing, connection loss, and pool behaviors (warmup, stale/idle timeout, waiters). |
tests/unit/async_utils/transport/test_zmq.py |
Adds unit tests for ZMQ transport close paths, sender/receiver error handling, and factory/lifecycle behaviors. |
tests/integration/endpoint_client/test_worker.py |
Adds many worker integration tests for config variants, SSE boundaries, shutdown edge cases, warmup failure handling, and GC modes. |
tests/integration/endpoint_client/test_http_client.py |
Adds tests for missing ZMQ context, external loop usage, and shutdown behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| endpoint_urls=[f"http://localhost:{server.port}/error"], | ||
| num_workers=1, | ||
| max_connections=10, | ||
| warmup_connections=False, |
There was a problem hiding this comment.
HTTPClientConfig.warmup_connections is declared as an int (0 = disabled, -1 = auto). Passing False works only because bool is a subclass of int, but it's ambiguous and can confuse readers/type-checkers. Use warmup_connections=0 here to explicitly disable warmup.
| warmup_connections=False, | |
| warmup_connections=0, |
| endpoint_urls=[f"http://localhost:{server.port}/sse-test"], | ||
| num_workers=1, | ||
| max_connections=10, | ||
| warmup_connections=False, |
There was a problem hiding this comment.
HTTPClientConfig.warmup_connections is an int (0 disables warmup). Using False relies on bool being an int and is unclear; please change this to warmup_connections=0 for explicitness.
| warmup_connections=False, | |
| warmup_connections=0, |
| endpoint_urls=[f"http://localhost:{server.port}/slow"], | ||
| num_workers=1, | ||
| max_connections=10, | ||
| warmup_connections=False, |
There was a problem hiding this comment.
HTTPClientConfig.warmup_connections is typed/defined as an int (0 disables warmup). Using False here is ambiguous and depends on bool subclassing int; please switch to warmup_connections=0.
| warmup_connections=False, | |
| warmup_connections=0, |
| def test_shutdown_logs_dropped_requests(self, mock_http_echo_server): | ||
| """Shutdown logs count of dropped requests when > 0.""" | ||
| with ManagedZMQContext.scoped() as zmq_ctx: | ||
| client = _create_client( | ||
| f"{mock_http_echo_server.url}/v1/chat/completions", | ||
| zmq_context=zmq_ctx, | ||
| ) | ||
| client._dropped_requests = 3 | ||
| client.shutdown() |
There was a problem hiding this comment.
This test is named/documented as verifying that shutdown logs dropped requests, but it doesn't assert on any emitted log output (no caplog/logger patching). As a result it will pass even if nothing is logged. Please capture logs and assert the expected message/level when _dropped_requests > 0, or rename the test to reflect what it actually verifies.
| """set_start_method handles already-set case. | ||
|
|
||
| The module-level code catches RuntimeError when the | ||
| multiprocessing start method is already set. In test environments, it | ||
| is always already set (by pytest or another import). We verify the | ||
| module imported successfully, which means the except block executed. | ||
| """ | ||
| import multiprocessing | ||
|
|
||
| from inference_endpoint.endpoint_client import worker | ||
|
|
||
| assert worker is not None | ||
| # The start method should be set (either by the module or by pytest) |
There was a problem hiding this comment.
The docstring claims the RuntimeError "already set" path was exercised ("which means the except block executed"), but this test never forces multiprocessing.set_start_method(...) to raise. As written it only validates that the module imports and that get_start_method() is non-None, which doesn't prove the exception-handling path ran. Please either update the docstring/test name to match what is asserted, or restructure the test to deterministically hit the RuntimeError-handling branch without impacting other tests.
| """set_start_method handles already-set case. | |
| The module-level code catches RuntimeError when the | |
| multiprocessing start method is already set. In test environments, it | |
| is always already set (by pytest or another import). We verify the | |
| module imported successfully, which means the except block executed. | |
| """ | |
| import multiprocessing | |
| from inference_endpoint.endpoint_client import worker | |
| assert worker is not None | |
| # The start method should be set (either by the module or by pytest) | |
| """Worker module import succeeds and multiprocessing start method is configured. | |
| The worker module sets or validates the multiprocessing start method | |
| at import time. This test ensures that importing the module does not | |
| raise and that a start method is configured in the current environment. | |
| It does not assert which internal code path (initial set vs | |
| already-set/RuntimeError handling) was taken. | |
| """ | |
| import multiprocessing | |
| from inference_endpoint.endpoint_client import worker | |
| assert worker is not None | |
| # The start method should be set (either by the module or by the test environment) |
| pool = ZmqWorkerPoolTransport.create(event_loop, 1, zmq_ctx) | ||
|
|
||
| # Call __del__ directly -- should not raise | ||
| pool.__del__() |
| pool.cleanup = raising_cleanup | ||
|
|
||
| # __del__ should not raise | ||
| pool.__del__() |
| async def test_pool_close_cancels_waiters(self, pool): | ||
| """close() cancels all pending waiters and clears the waiter queue.""" | ||
| # Acquire all 4 connections to saturate the pool | ||
| _conns = [await pool.acquire() for _ in range(4)] |
| protocol.connection_lost(exc) | ||
|
|
||
| with pytest.raises(expected_error): | ||
| await body_task |
| protocol.data_received(b"NOT HTTP AT ALL") | ||
|
|
||
| with pytest.raises(httptools.HttpParserError): | ||
| await headers_task |
| # Feed invalid chunk framing — triggers HttpParserError on body_future | ||
| protocol.data_received(b"ZZZZ\r\nNOT A VALID CHUNK\r\n") | ||
| with pytest.raises(httptools.HttpParserError): | ||
| await body_task |
| result = await asyncio.wait_for(receiver.recv(), timeout=1.0) | ||
| assert result is None | ||
|
|
||
| await close_task |
| # Remove the reader before close, so close() will get an error | ||
| try: | ||
| loop.remove_reader(receiver._fd) | ||
| except (ValueError, OSError): |
What does this PR do?
Type of change
Related issues
Testing
Checklist