logit_bias: apply configurable escalating EOG bias at low n_remain #14229

graehl · 2025-06-16T23:32:16Z

Give EOG (end-of-generation) tokens a bias that increases with output length (per token for now; it could be per codepoint in future), applied only after a configured amount has been generated.

Add an `n_remain` param to `sample_apply`. This is safer than having logit_bias maintain state for how many times it has been called, which would lead to wrong assumptions, e.g. when it is called multiple times per token.
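To illustrate why passing the position explicitly is more robust than counting calls, here is a minimal Python sketch. It is not the code in this PR: `CountingBias`, `stateless_bias`, and their parameters are hypothetical names, and the linear ramp is an assumption made only for the demonstration.

```python
class CountingBias:
    """Hypothetical stateful variant: infers progress from how often apply() runs."""

    def __init__(self, start_after, bias_per_tok):
        self.start_after = start_after
        self.bias_per_tok = bias_per_tok
        self.calls = 0

    def apply(self, eog_logit):
        # Every call is treated as one generated token, which is wrong
        # whenever the sampler is applied more than once per token.
        self.calls += 1
        steps = max(self.calls - self.start_after, 0)
        return eog_logit + self.bias_per_tok * steps


def stateless_bias(eog_logit, n_generated, start_after, bias_per_tok):
    """Hypothetical stateless variant: the caller supplies the true position."""
    steps = max(n_generated - start_after, 0)
    return eog_logit + bias_per_tok * steps


# Simulate 3 generated tokens with the sampler applied twice per token.
counting = CountingBias(start_after=2, bias_per_tok=1.0)
for _ in range(3):
    counting.apply(0.0)
    drifted = counting.apply(0.0)

correct = stateless_bias(0.0, n_generated=3, start_after=2, bias_per_tok=1.0)
print(drifted, correct)
```

The counting variant ends up believing six tokens were generated, so its bias is four steps past the threshold instead of one.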

See the new command-line options (including one to specify 'after' instead of 'remain'):

- `-eog, --eog-bias-per-tok N`: when fewer than -start-eog-at-remain tokens are left to generate after -n, add this bias eog for each subsequent token (default: 0.0)
- `-remain, --start-eog-at-remain N`: start applying -eog bias when this many tokens remain of the -n max (default: 0.0)
- `-after, --start-eog-after N`: start applying -eog bias after this many tokens generated (default: 1000000000.0); whichever happens first between -remain and -after applies
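One way to read how these options interact is sketched below in Python. This is an assumption about the schedule, not the implementation in this PR: `eog_bias`, `apply_eog_bias`, and their argument names are hypothetical, and the exact behavior at the thresholds (off-by-one) may differ. The bias here grows linearly with the number of tokens past whichever threshold was crossed first.

```python
def eog_bias(n_generated, n_remain, bias_per_tok=0.0,
             start_at_remain=0.0, start_after=1_000_000_000.0):
    """Return the bias to add to EOG logits at this step (assumed linear ramp)."""
    # Tokens past the "remain" trigger: positive once n_remain < start_at_remain.
    past_remain = start_at_remain - n_remain
    # Tokens past the "after" trigger: positive once n_generated > start_after.
    past_after = n_generated - start_after
    # The trigger crossed first has the larger excess; no bias before either.
    steps = max(past_remain, past_after, 0.0)
    return bias_per_tok * steps


def apply_eog_bias(logits, eog_ids, bias):
    """Add the bias to every EOG token id in a {token_id: logit} mapping."""
    return {tok: (v + bias if tok in eog_ids else v) for tok, v in logits.items()}


# Example with a 100-token budget (-n 100), 95 tokens generated, 5 remaining.
bias = eog_bias(n_generated=95, n_remain=5, bias_per_tok=0.5, start_at_remain=10)
biased = apply_eog_bias({1: 0.0, 2: 1.0}, eog_ids={2}, bias=bias)
print(bias, biased)
```

With the defaults listed above (bias 0.0, remain 0.0, after 1000000000.0), this sketch never adds any bias, which matches the options being opt-in.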

Verified that the escalating EOG bias was effective at avoiding overgeneration and is a reasonable supplement or alternative to editing the prompt. A *constant* EOG bias, already supported in samplers, is likely to allow pathologically short outputs.



graehl requested a review from ngxson as a code owner June 16, 2025 23:32

github-actions bot added the testing, examples, and server labels Jun 16, 2025

graehl force-pushed the length branch from 0cffe93 to c6d1d54 Compare July 2, 2025 19:07

graehl force-pushed the length branch from c6d1d54 to 4625fef Compare October 23, 2025 04:50

graehl requested review from JohannesGaessler, am17an, ggerganov and slaren as code owners October 23, 2025 04:50

graehl force-pushed the length branch from 4625fef to b4afe15 Compare October 23, 2025 04:54
