Conversation

@mayabar mayabar commented Aug 25, 2025

  • Added logic to randomly select the response length based on a pre-defined histogram when the request includes the max_tokens property
  • This initial version applies the same histogram regardless of the specific max_tokens value
  • Future improvements will introduce dynamic buckets, adapting the histogram to different max_tokens values.

fix #167

…sed on a histogram - initial implementation

Signed-off-by: Maya Barnea <[email protected]>
@mayabar mayabar requested review from irar2 and shmuelk August 25, 2025 11:10
…t be randomly selected; instead it will stop when response length is maxTokens, otherwise - stop

- fix utils_tests

Signed-off-by: Maya Barnea <[email protected]>
@mayabar mayabar requested a review from shmuelk August 26, 2025 08:01

@shmuelk shmuelk left a comment

I suggest adding more comments to the function getResponseLengthByHistogram

@mayabar mayabar requested a review from shmuelk August 27, 2025 14:42
Signed-off-by: Maya Barnea <[email protected]>

shmuelk commented Aug 28, 2025

/lgtm

/approve

@github-actions github-actions bot added the lgtm label Aug 28, 2025
@mayabar mayabar merged commit b98882a into llm-d:main Aug 28, 2025
4 checks passed
@mayabar mayabar deleted the max_tokens branch August 28, 2025 12:01
Development

Successfully merging this pull request may close these issues.

Enhance calculation of tokens number in response based on request's max_tokens parameter - simple logic

2 participants