
jaredoconnell
Collaborator

Summary

The output formats changed, which broke the existing tests.
Keep in mind that some of the tested code may be replaced in the near future, since a new console output is on the list for 0.5.0.

I ran into issues with duplicated fields on the mock objects for all computed fields, so those tests are skipped for now. Let me know if you have a plan for how to fix this.

This PR will be in a draft state until the CSV changes are merged.

Test Plan

  • Run the tests with pytest.
  • Run the tox type lints.

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI-written tests should have a docstring that includes ## WRITTEN BY AI ##)

@jaredoconnell force-pushed the features/refactor-fix-output-tests branch from 22776a8 to 32e4909 on October 1, 2025 at 18:35
@jaredoconnell
Collaborator Author

Here is an example of an error that I'm running into, in test_generative_benchmark_marshalling:

>           assert getattr(mock_benchmark, field) == getattr(deserialized_benchmark, field)
E           AssertionError: assert StatusBreakdown[list[GenerativeRequestStats], list[GenerativeRequestStats], list[GenerativeRequestStats], NoneType](successful=[GenerativeRequestStats(scheduler_info=ScheduledRequestInfo(request_id='c19ef5bd-84e0-44a0-af2d-a16e97208379', status='queued', scheduler_node_id=-1, scheduler_process_id=-1, scheduler_start_time=-1.0, error=None, scheduler_timings=RequestSchedulerTimings(targeted_start=None, queued=None, dequeued=None, scheduled_at=None, resolve_start=None, resolve_end=None, finalized=None), request_timings=GenerationRequestTimings(timings_type='generation_request_timings', request_start=1.0, request_end=6.0, first_iteration=None, last_iteration=None), started_at=1.0, completed_at=6.0), type_='generative_request_stats', request_id='a', request_type='text_completions', prompt='p', request_args={}, output='o', iterations=1, prompt_tokens=1, output_tokens=2, total_tokens=3, request_latency=5.0, time_to_first_token_ms=None, time_per_output_token_ms=None, inter_token_latency_ms=None, tokens_per_second=0.6, output_tokens_per_second=0.4)], errored=[], incomplete=[], total=None) == StatusBreakdown[list[GenerativeRequestStats], list[GenerativeRequestStats], list[GenerativeRequestStats], NoneType](successful=[GenerativeRequestStats(scheduler_info=ScheduledRequestInfo(request_id='c19ef5bd-84e0-44a0-af2d-a16e97208379', status='queued', scheduler_node_id=-1, scheduler_process_id=-1, scheduler_start_time=-1.0, error=None, scheduler_timings=RequestSchedulerTimings(targeted_start=None, queued=None, dequeued=None, scheduled_at=None, resolve_start=None, resolve_end=None, finalized=None), request_timings=GenerationRequestTimings(timings_type='generation_request_timings', request_start=1.0, request_end=6.0, first_iteration=None, last_iteration=None), started_at=1.0, completed_at=6.0), type_='generative_request_stats', request_id='a', request_type='text_completions', prompt='p', request_args={}, output='o', iterations=1, prompt_tokens=1, output_tokens=2, total_tokens=3, request_latency=5.0, time_to_first_token_ms=None, time_per_output_token_ms=None, inter_token_latency_ms=None, tokens_per_second=0.6, output_tokens_per_second=0.4, total_tokens=3, request_latency=5.0, time_to_first_token_ms=None, time_per_output_token_ms=None, inter_token_latency_ms=None, tokens_per_second=0.6, output_tokens_per_second=0.4)], errored=[], incomplete=[], total=None)
E             
E             Full diff:
E             - StatusBreakdown[list[GenerativeRequestStats], list[GenerativeRequestStats], list[GenerativeRequestStats], NoneType](successful=[GenerativeRequestStats(scheduler_info=ScheduledRequestInfo(request_id='c19ef5bd-84e0-44a0-af2d-a16e97208379', status='queued', scheduler_node_id=-1, scheduler_process_id=-1, scheduler_start_time=-1.0, error=None, scheduler_timings=RequestSchedulerTimings(targeted_start=None, queued=None, dequeued=None, scheduled_at=None, resolve_start=None, resolve_end=None, finalized=None), request_timings=GenerationRequestTimings(timings_type='generation_request_timings', request_start=1.0, request_end=6.0, first_iteration=None, last_iteration=None), started_at=1.0, completed_at=6.0), type_='generative_request_stats', request_id='a', request_type='text_completions', prompt='p', request_args={}, output='o', iterations=1, prompt_tokens=1, output_tokens=2, total_tokens=3, request_latency=5.0, time_to_first_token_ms=None, time_per_output_token_ms=None, inter_token_latency_ms=None, tokens_per_second=0.6, output_tokens_per_second=0.4, total_tokens=3, request_latency=5.0, time_to_first_token_ms=None, time_per_output_token_ms=None, inter_token_latency_ms=None, tokens_per_second=0.6, output_tokens_per_second=0.4)], errored=[], incomplete=[], total=None)
E             ?                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
E             + StatusBreakdown[list[GenerativeRequestStats], list[GenerativeRequestStats], list[GenerativeRequestStats], NoneType](successful=[GenerativeRequestStats(scheduler_info=ScheduledRequestInfo(request_id='c19ef5bd-84e0-44a0-af2d-a16e97208379', status='queued', scheduler_node_id=-1, scheduler_process_id=-1, scheduler_start_time=-1.0, error=None, scheduler_timings=RequestSchedulerTimings(targeted_start=None, queued=None, dequeued=None, scheduled_at=None, resolve_start=None, resolve_end=None, finalized=None), request_timings=GenerationRequestTimings(timings_type='generation_request_timings', request_start=1.0, request_end=6.0, first_iteration=None, last_iteration=None), started_at=1.0, completed_at=6.0), type_='generative_request_stats', request_id='a', request_type='text_completions', prompt='p', request_args={}, output='o', iterations=1, prompt_tokens=1, output_tokens=2, total_tokens=3, request_latency=5.0, time_to_first_token_ms=None, time_per_output_token_ms=None, inter_token_latency_ms=None, tokens_per_second=0.6, output_tokens_per_second=0.4)], errored=[], incomplete=[], total=None)

test_output.py:42: AssertionError

As you can see, the computed fields are duplicated on the mock object.
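
For context, this is roughly the round trip the marshalling tests exercise. The sketch below is not the project's actual classes; it uses hypothetical names (RequestStats, total_tokens) and assumes Pydantic v2, where computed fields are included in model_dump()/model_dump_json() output but are ignored as input and recomputed after deserialization. Comparing dumps rather than the objects themselves is one way such assertions can be made less sensitive to how the expected value was constructed:

```python
# Minimal sketch, hypothetical model, assuming Pydantic v2 (not guidellm's real classes).
from pydantic import BaseModel, computed_field


class RequestStats(BaseModel):
    """Hypothetical stand-in for a stats model with computed fields."""

    prompt_tokens: int
    output_tokens: int

    @computed_field  # serialized in dumps, but always derived from the stored fields
    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.output_tokens


original = RequestStats(prompt_tokens=1, output_tokens=2)
payload = original.model_dump_json()                   # includes "total_tokens": 3
restored = RequestStats.model_validate_json(payload)   # extra key ignored, value recomputed

# Comparing dumps checks the same data without depending on how the
# expected object was built.
assert original.model_dump() == restored.model_dump()
```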

@jaredoconnell marked this pull request as ready for review on October 1, 2025 at 21:17
@jaredoconnell force-pushed the features/refactor-fix-output-tests branch from 2da102d to ff9a61a on October 3, 2025 at 16:42
@jaredoconnell merged commit 46b5e87 into vllm-project:features/refactor/base on October 3, 2025
8 of 17 checks passed