@mayabar mayabar commented Oct 28, 2025

Add the latency metrics vllm:e2e_request_latency_seconds, vllm:request_queue_time_seconds, vllm:request_inference_time_seconds, vllm:request_prefill_time_seconds, and vllm:request_decode_time_seconds.
Tests were added for both streaming and non-streaming scenarios.
Tests for remote prefill/decode still need to be added; this is tracked in issue #236.

In addition, utility functions from the simulator package were moved to a separate file, and the tests were refactored further.
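
For readers unfamiliar with these metrics, the following is a minimal sketch of how the five latency values could be derived from per-request timestamps. It is not the code from this PR: the `requestTimestamps` type, its fields, and the `latencies` and `exampleTimestamps` helpers are hypothetical, and the mapping of each metric to a pair of timestamps is an assumption based on the metric names.

```go
package main

import (
	"fmt"
	"time"
)

// requestTimestamps is a hypothetical record of the moments a request
// passes through; the simulator's real bookkeeping may differ.
type requestTimestamps struct {
	arrived    time.Time // request received by the server
	scheduled  time.Time // request left the waiting queue
	firstToken time.Time // first output token produced (assumed end of prefill)
	finished   time.Time // last output token produced
}

// latencies returns the five values in seconds, keyed by metric name.
// Which timestamps bound each metric is an assumption, not taken from the PR.
func latencies(ts requestTimestamps) map[string]float64 {
	return map[string]float64{
		"vllm:e2e_request_latency_seconds":    ts.finished.Sub(ts.arrived).Seconds(),
		"vllm:request_queue_time_seconds":     ts.scheduled.Sub(ts.arrived).Seconds(),
		"vllm:request_inference_time_seconds": ts.finished.Sub(ts.scheduled).Seconds(),
		"vllm:request_prefill_time_seconds":   ts.firstToken.Sub(ts.scheduled).Seconds(),
		"vllm:request_decode_time_seconds":    ts.finished.Sub(ts.firstToken).Seconds(),
	}
}

// exampleTimestamps builds an illustrative request: 200 ms queued,
// 300 ms of prefill, and 1 s of decode.
func exampleTimestamps() requestTimestamps {
	start := time.Date(2025, 10, 28, 10, 0, 0, 0, time.UTC)
	return requestTimestamps{
		arrived:    start,
		scheduled:  start.Add(200 * time.Millisecond),
		firstToken: start.Add(500 * time.Millisecond),
		finished:   start.Add(1500 * time.Millisecond),
	}
}

func main() {
	for name, seconds := range latencies(exampleTimestamps()) {
		fmt.Printf("%s = %.3f\n", name, seconds)
	}
}
```

Under these assumptions, the inference time equals prefill plus decode, and the end-to-end latency equals queue time plus inference time, which gives tests a simple consistency check.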

@mayabar mayabar requested a review from irar2 October 28, 2025 10:52
Add reportHistogramValue function to be used for reporting values in histogram metrics

Signed-off-by: Maya Barnea <[email protected]>
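
The signature and body of reportHistogramValue are not shown in this conversation. As a rough illustration of what reporting a value to a histogram metric involves, here is a standard-library-only sketch of a Prometheus-style histogram with cumulative buckets. The `histogram` type, `newHistogram`, `observe`, and `cumulative` are hypothetical names, and the bucket bounds are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// histogram is a hypothetical stand-in for a Prometheus-style histogram.
type histogram struct {
	bounds []float64 // sorted upper bounds (the "le" labels), excluding +Inf
	counts []uint64  // per-bucket counts; the last slot is the +Inf bucket
	sum    float64   // sum of all observed values
	total  uint64    // number of observations
}

// newHistogram copies and sorts the bounds and allocates one extra
// slot for values above the largest bound.
func newHistogram(bounds []float64) *histogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &histogram{bounds: b, counts: make([]uint64, len(b)+1)}
}

// observe records v in the first bucket whose upper bound is >= v.
func (h *histogram) observe(v float64) {
	i := sort.SearchFloat64s(h.bounds, v) // len(bounds) if v exceeds all bounds
	h.counts[i]++
	h.sum += v
	h.total++
}

// cumulative returns the counts the way Prometheus exposes them:
// each bucket also includes every observation from the smaller buckets.
func (h *histogram) cumulative() []uint64 {
	out := make([]uint64, len(h.counts))
	var running uint64
	for i, c := range h.counts {
		running += c
		out[i] = running
	}
	return out
}

func main() {
	h := newHistogram([]float64{0.1, 0.5, 1, 5})
	for _, v := range []float64{0.05, 0.3, 0.5, 2, 10} {
		h.observe(v)
	}
	fmt.Println(h.cumulative(), h.total)
}
```

A real implementation would more likely delegate to the Prometheus Go client's histogram types; the sketch only shows the bucket arithmetic that a bucket-validation test would check.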
…ference_time_seconds, vllm:request_prefill_time_seconds, and vllm:request_decode_time_seconds

Signed-off-by: Maya Barnea <[email protected]>
…me model name in all tests, refactoring in server start functions

Signed-off-by: Maya Barnea <[email protected]>
…st for histogram buckets validation

Signed-off-by: Maya Barnea <[email protected]>
- Create constants for all metrics
- Define all latency-related fake metrics in config
- Add validation for new fake metrics in config

Signed-off-by: Maya Barnea <[email protected]>

mayabar commented Oct 28, 2025

Fixes #222


irar2 commented Oct 29, 2025

/lgtm

@github-actions github-actions bot added the lgtm label Oct 29, 2025

irar2 commented Oct 29, 2025

/approve

@irar2 irar2 merged commit de71f5d into llm-d:main Oct 29, 2025
4 checks passed
@mayabar mayabar deleted the latency-metrics branch October 29, 2025 13:35