logit_bias: apply configurable escalating EOG bias at low n_remain
Give eog tokens an increasing bias (growing with generation length, per token; could be per codepoint in the future), applied only after a configured number of tokens has been generated.

Add an `n_remain` parameter to `sample_apply`; this is safer than having logit_bias keep its own count of how many times it has been called, which would lead to wrong assumptions, e.g. when sampling is applied multiple times per token.
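A minimal sketch of the escalating-bias rule described above; the function name, parameter types, and exact trigger boundaries are illustrative assumptions, not the actual sampler code:

```cpp
#include <algorithm>
#include <cstdint>

// Sketch only: escalating eog bias. n_generated and n_remain are assumed to
// be supplied by the caller (n_remain being the new sample_apply parameter).
static void apply_escalating_eog_bias(
        float & eog_logit,           // logit of an end-of-generation token
        int32_t n_generated,         // tokens generated so far
        int32_t n_remain,            // tokens left before the -n limit
        float   eog_bias_per_tok,    // --eog-bias-per-tok
        float   start_eog_at_remain, // --start-eog-at-remain
        float   start_eog_after) {   // --start-eog-after
    // whichever trigger happens first applies
    const float past_remain = start_eog_at_remain - (float) n_remain;
    const float past_after  = (float) n_generated - start_eog_after;
    const float past        = std::max(past_remain, past_after);
    if (past <= 0.0f) {
        return; // neither trigger reached yet
    }
    // the bias grows by eog_bias_per_tok for each token past the trigger point
    eog_logit += eog_bias_per_tok * past;
}
```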
See the new command line options (including, as requested, an 'after' variant in addition to 'remain'); an example invocation follows the list:

    -eog,    --eog-bias-per-tok N     when fewer than -start-eog-at-remain tokens are left to generate
                                      after -n, add this bias to eog for each subsequent token
                                      (default: 0.0)
    -remain, --start-eog-at-remain N  start applying -eog bias when this many tokens remain of the -n max
                                      (default: 0.0)
    -after,  --start-eog-after N      start applying -eog bias after this many tokens generated
                                      (default: 1000000000.0); whichever happens first between -remain
                                      and -after applies
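A hypothetical invocation (assuming the feature is exposed through llama-cli; the model path and values are placeholders):

    llama-cli -m model.gguf -p "..." -n 256 --eog-bias-per-tok 0.5 --start-eog-at-remain 32

With these values the eog bias kicks in once fewer than 32 of the 256 tokens remain, starting around 0.5 and growing to around 16.0 at the hard limit, so short completions are unaffected while overgeneration is increasingly penalized.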
Verified that the eog bias is effective at avoiding overgeneration and is a reasonable supplement or alternative to editing the prompt; a *constant* eog bias, already supported in the samplers, is likely to allow pathologically short outputs.
string_format("when fewer than -start-eog-at-remain tokens are left to generate after -n, add this bias eog for each subsequent token (default: %.1f)", (double)params.sampling.eog_bias_per_tok),
string_format("start applying -eog bias after this many tokens generated (default: %.1f); whichever happens first between -remain and -after applies", (double)params.sampling.start_eog_after),