Description
Note: This issue was copied from ggml-org#2923
Original Author: @ggerganov
Original Issue Number: ggml-org#2923
Created: 2023-08-31T06:29:29Z
This feature was proposed by @spion in ggml-org#2813 (comment)
In some cases, it's useful to do constrained evaluation of logits over a union of possible text values, then pick the candidate whose summed token log-probabilities (i.e. the product of its token probabilities) is highest, giving the most probable outcome overall.
E.g. template (using MS guidance)
{{#select 'armor'}}leather{{or}}chainmail{{or}}plate{{/select}}
To be certain of making the best choice, we'd need to calculate the probability of all 3 token sequences. It's easy if all the choices map to a single token, but with multiple tokens we'd need not just parallel generation but parallel logit evaluation of multiple possible paths.
If we go greedy, we might get suboptimal results in cases where multiple choices start with the same token.
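To make the difference concrete, here is a minimal Python sketch comparing full-sequence scoring with greedy selection. Everything in it is an assumption for illustration: the tokenization of the three choices, the probability table, and the helper names are hypothetical and do not come from llama.cpp or guidance. It also assumes no choice is a token-level prefix of another.

```python
import math

# Hypothetical next-token distributions keyed by the tokens generated so far.
# In practice these would come from evaluating the model on prompt + prefix.
NEXT_TOKEN_PROBS = {
    (): {"chain": 0.50, "pl": 0.30, "le": 0.20},
    ("chain",): {"mail": 0.20, "saw": 0.80},
    ("pl",): {"ate": 0.90, "an": 0.10},
    ("le",): {"ather": 0.70, "mon": 0.30},
}

# Hypothetical tokenization of the three choices from the template.
CHOICES = {
    "leather": ["le", "ather"],
    "chainmail": ["chain", "mail"],
    "plate": ["pl", "ate"],
}


def sequence_logprob(tokens):
    """Sum the log-probability of each token given the tokens before it."""
    total = 0.0
    for i, tok in enumerate(tokens):
        total += math.log(NEXT_TOKEN_PROBS[tuple(tokens[:i])][tok])
    return total


def best_by_full_sequence(choices):
    """Score every choice in full and return the most probable one."""
    return max(choices, key=lambda name: sequence_logprob(choices[name]))


def best_by_greedy(choices):
    """Commit to the most probable allowed token at each step."""
    alive = dict(choices)
    prefix = []
    while len(alive) > 1:
        probs = NEXT_TOKEN_PROBS[tuple(prefix)]
        allowed = {toks[len(prefix)] for toks in alive.values()}
        tok = max(allowed, key=lambda t: probs[t])
        alive = {n: t for n, t in alive.items() if t[len(prefix)] == tok}
        prefix.append(tok)
    return next(iter(alive))


print(best_by_full_sequence(CHOICES))  # plate (0.30 * 0.90 = 0.27)
print(best_by_greedy(CHOICES))         # chainmail (0.50 * 0.20 = 0.10)
```

In this toy table, greedy selection commits to `chain` because it is the most probable first token, even though the full sequence for `plate` is more probable overall.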
It should be possible to implement this by combining the existing beam search and grammar sampling features. See the discussion in the referenced comment for more info.
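As a rough illustration of that combination, the sketch below runs a beam search in which only tokens that keep a beam on a path toward one of the allowed choices are expanded. This is not the llama.cpp beam search or grammar API: the `allowed_next` function is a hypothetical stand-in for a grammar, `next_token_logprobs` is a hypothetical toy model, and the tokenization is assumed. It also assumes no choice is a token-level prefix of another.

```python
import math

# Hypothetical tokenization of the three choices from the template.
CHOICES = [["le", "ather"], ["chain", "mail"], ["pl", "ate"]]


def next_token_logprobs(prefix):
    """Toy stand-in for a model call: log-probs of the next token after `prefix`."""
    table = {
        (): {"chain": 0.50, "pl": 0.30, "le": 0.20},
        ("chain",): {"mail": 0.20, "saw": 0.80},
        ("pl",): {"ate": 0.90, "an": 0.10},
        ("le",): {"ather": 0.70, "mon": 0.30},
    }
    return {tok: math.log(p) for tok, p in table[tuple(prefix)].items()}


def allowed_next(prefix, choices):
    """Stand-in for a grammar: tokens that extend `prefix` toward some choice."""
    n = len(prefix)
    return {c[n] for c in choices if len(c) > n and c[:n] == list(prefix)}


def constrained_beam_search(choices, beam_width):
    """Return (tokens, log-prob) of the best completed choice found."""
    beams = [((), 0.0)]  # (token tuple, cumulative log-probability)
    finished = []
    while beams:
        candidates = []
        for prefix, score in beams:
            logprobs = next_token_logprobs(prefix)
            for tok in allowed_next(prefix, choices):
                candidates.append((prefix + (tok,), score + logprobs[tok]))
        # Keep only the best `beam_width` partial sequences.
        candidates.sort(key=lambda b: b[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if list(seq) in choices:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
    return max(finished, key=lambda b: b[1])


print(constrained_beam_search(CHOICES, beam_width=3)[0])  # ('pl', 'ate')
print(constrained_beam_search(CHOICES, beam_width=1)[0])  # ('chain', 'mail')
```

With a beam width of 1 this degenerates to greedy selection. The result is only guaranteed to be the most probable choice when the beam is wide enough that no live prefix is pruned.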