Consolidate core types: move record.py, add structured output types, enforce strict typing #186

Merged
nv-alicheng merged 3 commits into main from feat/alicheng-core-types-refactor
Mar 18, 2026

@nv-alicheng
Collaborator

What does this PR do?

  • Move record.py from async_utils/transport/ to core/ — it's a core data type, not transport-specific
  • Replace raw str response output with TextModelOutput struct for type safety and downstream metric computation
  • Add PromptData and ErrorData msgspec Structs with gc=False, array_like=True for hot-path performance
  • Fix threading safety issues in HTTPEndpointClient, SampleEventHandler, EventRecorder, and shutdown_executor()
  • Bump default worker init timeout from 30s → 60s

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

- Move record.py from async_utils/transport to core/
- Add PromptData, TextModelOutput, ErrorData types with msgspec Struct
- Deprecate str as response_output type in favor of TextModelOutput
- Add msgspec struct performance flags (gc=False, array_like=True)
- Fix threading safety issues in http_client, sample handler, recorder
- Update all imports across the codebase
- Bump default worker init timeout to 60s
@nv-alicheng nv-alicheng requested a review from a team as a code owner March 18, 2026 00:55
@github-actions github-actions bot requested review from arekay-nv and nvzhihanj March 18, 2026 00:55
@github-actions

github-actions bot commented Mar 18, 2026

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors core data types within the inference endpoint system to improve type safety, performance, and maintainability. By introducing dedicated msgspec.Struct types for model outputs, prompt data, and error information, the system gains clearer data contracts and optimizes serialization/deserialization. The changes also involve relocating a core module and making minor adjustments to enhance threading safety and worker stability.

Highlights

  • Core Type Consolidation: The record.py module was moved from async_utils/transport/ to core/, signifying its role as a fundamental data type.
  • Structured Output Types: Introduced TextModelOutput, PromptData, and ErrorData as msgspec.Structs with gc=False and array_like=True for enhanced type safety and performance, replacing raw string or dictionary outputs.
  • Strict Typing Enforcement: QueryResult and EventRecord now strictly use the new structured types for response_output and data fields, respectively.
  • Threading Safety Improvements: Addressed potential threading issues in SingletonMixin by introducing a lock, and noted known limitations in SampleEventHandler and EventRecorder due to their planned deprecation.
  • Increased Worker Initialization Timeout: The default worker initialization timeout was increased from 30s to 60s.
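
A lock-guarded singleton of the kind mentioned in the threading highlight might look like the sketch below. The class name `SingletonMixinSketch`, the `instance()` accessor, and the `Registry` subclass are hypothetical; the actual `SingletonMixin` in the repository is not shown on this page and may differ.

```python
import threading


class SingletonMixinSketch:
    """Hypothetical mixin giving each subclass one shared instance."""

    _instances: dict = {}
    _lock = threading.Lock()

    @classmethod
    def instance(cls):
        # Fast path: skip the lock once the instance exists.
        inst = SingletonMixinSketch._instances.get(cls)
        if inst is None:
            with SingletonMixinSketch._lock:
                # Re-check under the lock so that two threads racing
                # past the first check cannot both construct an instance.
                inst = SingletonMixinSketch._instances.get(cls)
                if inst is None:
                    inst = cls()
                    SingletonMixinSketch._instances[cls] = inst
        return inst


class Registry(SingletonMixinSketch):
    """Example subclass used only to exercise the mixin."""


def collect_instances(n_threads: int = 8) -> list:
    """Call Registry.instance() from several threads and gather results."""
    results = []
    results_lock = threading.Lock()

    def worker():
        inst = Registry.instance()
        with results_lock:
            results.append(inst)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Without the lock, two threads could both observe a missing instance and each construct their own, so callers would end up holding different objects.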


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and valuable refactoring by consolidating core data types. Moving record.py to core/ improves the project structure. The replacement of raw string outputs with the new TextModelOutput struct, along with the addition of PromptData and ErrorData, greatly enhances type safety and performance. The threading safety fix in SingletonMixin is also a welcome improvement. The changes are applied consistently across the codebase, including extensive updates to tests. I've found a couple of minor issues with incorrect comments in the SGLang adapter tests, which I've flagged for correction. Overall, this is a high-quality contribution that improves the maintainability and robustness of the system.

The SGLang adapter now returns TextModelOutput, not raw str.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@nv-alicheng nv-alicheng merged commit 98d68f8 into main Mar 18, 2026
4 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 18, 2026