chore(http-client): cleanup types, improve coverage, remove orjson #157
Conversation
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
Summary of Changes: Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request primarily focuses on a significant refactor of the HTTP client, aiming to streamline its interface and improve internal consistency. A major part of this effort involved migrating JSON encoding and decoding from orjson to msgspec.
Pull request overview
This PR removes the orjson dependency and migrates JSON encoding/decoding to msgspec, while also simplifying the HTTP endpoint client API (consolidating the async/sync client split) and expanding integration coverage for polling/receiving/draining responses.
Changes:
- Replace orjson usage with msgspec.json across metrics, evaluation, loadgen reporting, and tests (see the sketch after this list).
- Refactor the HTTP endpoint client to a single HTTPEndpointClient and update benchmarks/exports accordingly.
- Add integration tests covering poll(), recv(), drain(), and shutdown() behavior.
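Since the swap touches many call sites, here is a minimal sketch of what the orjson-to-msgspec change looks like at a typical call site; the event payload is made up for illustration and does not come from this PR.

```python
import msgspec

event = {"sample_id": 7, "latency_ms": 12.5}

# msgspec.json.encode(obj) returns bytes, analogous to orjson.dumps(obj)
payload = msgspec.json.encode(event)

# msgspec.json.decode(data) accepts bytes or str, analogous to orjson.loads(data)
assert msgspec.json.decode(payload) == event

# Already-encoded JSON can be pretty-printed when writing artifacts to disk
pretty = msgspec.json.format(payload, indent=2)
```

msgspec raises its own msgspec.EncodeError / msgspec.DecodeError types, which is why the recorder and scoring error handling also change in this PR.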
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/unit/metrics/test_reporter.py | Switch test event payload encoding from orjson to msgspec.json. |
| tests/unit/metrics/test_recorder.py | Update unit tests to use msgspec.json.encode/decode for event row data. |
| tests/integration/endpoint_client/test_http_client.py | Add integration tests for poll/recv/drain/shutdown and client fixture helpers. |
| tests/conftest.py | Update test DB fixtures to encode JSON via msgspec.json. |
| src/inference_endpoint/utils/benchmark_httpclient.py | Update benchmark to use HTTPEndpointClient and sync shutdown. |
| src/inference_endpoint/openai/openai_adapter.py | Migrate request/response JSON encode/decode to msgspec. |
| src/inference_endpoint/metrics/reporter.py | Replace JSON dump/load operations with msgspec.json formatting/decoding. |
| src/inference_endpoint/metrics/recorder.py | Use msgspec.json.encode and msgspec.EncodeError in event recording. |
| src/inference_endpoint/load_generator/session.py | Write JSON artifacts using msgspec formatting/encoding. |
| src/inference_endpoint/evaluation/scoring.py | Replace JSON parsing/serialization with msgspec.json and update error handling. |
| src/inference_endpoint/endpoint_client/worker.py | Simplify request/response handling and rely on idempotent pool release. |
| src/inference_endpoint/endpoint_client/http_sample_issuer.py | Remove redundant loop assertion when scheduling response handler. |
| src/inference_endpoint/endpoint_client/http_client.py | Consolidate client class and make shutdown synchronous via internal async helper (see the sketch after this table). |
| src/inference_endpoint/endpoint_client/http.py | Make ConnectionPool.release() explicitly idempotent; adjust InFlightRequest typing. |
| src/inference_endpoint/endpoint_client/config.py | Make adapter/accumulator/transport “defaulted in post_init” (typing adjustments). |
| src/inference_endpoint/endpoint_client/__init__.py | Update exports to only expose HTTPEndpointClient. |
| pyproject.toml | Remove orjson dependency (keep msgspec). |
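The http_client.py row above notes that shutdown() is now synchronous and drives an internal async helper. Below is a minimal sketch of that pattern, assuming the client owns a background event loop; the attribute and helper names are placeholders, not the PR's actual identifiers.

```python
import asyncio


class HTTPEndpointClient:
    """Sketch only; real attribute and helper names may differ."""

    loop: asyncio.AbstractEventLoop  # background loop owned by the client

    async def _shutdown_async(self) -> None:
        ...  # close workers, transports, and the connection pool

    def shutdown(self) -> None:
        # Submit the async teardown to the client's loop and block the
        # calling thread until it completes.
        asyncio.run_coroutine_threadsafe(self._shutdown_async(), self.loop).result()
```

run_coroutine_threadsafe lets a plain thread block on work submitted to another thread's loop, which keeps the public API synchronous without the caller having to manage asyncio.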
```diff
  Usage:
      with ManagedZMQContext.scoped() as zmq_ctx:
-         client = AsyncHttpEndpointClient(config, zmq_context=zmq_ctx)
+         client = HTTPEndpointClient(config, zmq_context=zmq_ctx)
          client.issue(query)
-         response = await client.recv()
-         await client.shutdown()
+         response = client.poll()   # Non-blocking, returns None if nothing ready
+         responses = client.drain() # Drain all available responses
+         # response = await client.recv()  # Blocking; only if caller provides its own loop
+         client.shutdown()          # Blocks until workers stop
  """
```
Now that AsyncHttpEndpointClient/HTTPEndpointClient have been collapsed into a single class that typically owns a background event-loop thread (LoopManager.create_loop for non-"default" loops), the public issue() method needs to be thread-safe. With ZmqWorkerPoolTransport, send() may call loop.add_writer() via _ZmqSenderTransport, which must run on the event loop thread; calling it directly from the caller thread can sporadically raise or misbehave under backpressure.
Make issue() schedule the actual pool.send(next(self._worker_cycle), query) onto self.loop via call_soon_threadsafe (and advance the cycle on that same thread).
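A hedged sketch of that suggestion; the attribute names (loop, pool, _worker_cycle) follow the comment above and are assumptions about the actual class layout:

```python
import asyncio
from typing import Any


class HTTPEndpointClient:
    """Sketch of the suggested change; attribute names are assumptions."""

    loop: asyncio.AbstractEventLoop  # background event-loop thread owned by the client
    pool: Any                        # ZmqWorkerPoolTransport in this PR
    _worker_cycle: Any               # round-robin iterator over the worker pool

    def issue(self, query: Any) -> None:
        def _send() -> None:
            # Runs on the loop thread, so loop.add_writer() inside the ZMQ
            # sender transport is safe, and the worker cycle is only ever
            # advanced from a single thread.
            self.pool.send(next(self._worker_cycle), query)

        self.loop.call_soon_threadsafe(_send)
```

call_soon_threadsafe is the asyncio-sanctioned way to hand work to a loop from another thread, and it wakes the loop if it is idle, so queued sends are not delayed.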
What does this PR do?
Type of change
Related issues
Testing
Checklist