
Conversation

@mayabar (Collaborator) commented Nov 3, 2025

Currently vllm:max_num_generation_tokens reports the same values as vllm:request_generation_tokens, since a response always contains only one choice.

Fixes #243

… we never return responses with more than one choice, so the implementation is basic. Once the 'n' request property is supported, this needs to be changed to compute the real maximum. Added support in fake metrics; tests added too.
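The idea in the description can be sketched as follows. This is a minimal, hypothetical illustration (the function and type names are not the simulator's actual identifiers): with a single choice per response, the maximum generation token count across choices trivially equals the request's generation token count, which is why the two metrics currently coincide; once `n > 1` is supported, the metric would record the largest per-choice count.

```go
package main

import "fmt"

// maxGenerationTokens returns the maximum number of generated tokens
// across all choices of one response. With exactly one choice (the
// current simulator behavior), this equals the request's generation
// token count, so vllm:max_num_generation_tokens and
// vllm:request_generation_tokens report the same values.
// The slice-of-ints representation is an assumption for illustration.
func maxGenerationTokens(choiceTokens []int) int {
	max := 0
	for _, t := range choiceTokens {
		if t > max {
			max = t
		}
	}
	return max
}

func main() {
	// Single choice: the maximum is the only value.
	fmt.Println(maxGenerationTokens([]int{42}))

	// Hypothetical request with n=3 choices: the metric would
	// observe the largest of the per-choice counts.
	fmt.Println(maxGenerationTokens([]int{17, 42, 23}))
}
```

In a real implementation, the returned value would be observed into the `vllm:max_num_generation_tokens` histogram once per finished request, rather than once per choice.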

Signed-off-by: Maya Barnea <[email protected]>
@mayabar mayabar requested a review from irar2 November 3, 2025 07:17
…sage and fix, fix arguments in invalid configuration tests.

Fix validation of ttft and tpot fake definitions.

Signed-off-by: Maya Barnea <[email protected]>
…x invalid lora test in config, add missing comments

Signed-off-by: Maya Barnea <[email protected]>
@mayabar mayabar requested a review from irar2 November 4, 2025 09:11
Signed-off-by: Maya Barnea <[email protected]>
@irar2 (Collaborator) commented Nov 4, 2025

/lgtm
/approve

@github-actions github-actions bot added the lgtm label Nov 4, 2025
@github-actions github-actions bot merged commit e1e27ea into llm-d:main Nov 4, 2025
4 checks passed

Linked issue: Add vllm:request_max_num_generation_tokens metric
