llama : combined beam search + grammar sampling strategy

**Note: This issue was copied from [https://github.com/ggml-org/llama.cpp/issues/2923](https://github.com/ggml-org/llama.cpp/issues/2923)**

**Original Author:** @ggerganov
**Original Issue Number:** #2923
**Created:** 2023-08-31T06:29:29Z

---

This feature was proposed by @spion in https://github.com/ggerganov/llama.cpp/issues/2813#issuecomment-1694390583

> In some cases, it's useful to do constrained evaluation of logits based on a union of possible text values, then pick the sum { logits } (i.e. product(probabilities)) that gives the most probable outcome overall.

> E.g. template (using MS guidance)

> {{#select 'armor'}}leather{{or}}chainmail{{or}}plate{{/select}}

> To definitely make the best choice, we'd need to calculate the probability of all 3 token sequences. It's easy if all the choices map to a single token, but with multiple tokens we'd need not just parallel generation but parallel logit evaluation of multiple possible paths.

> If we go greedy, we might get suboptimal results in cases where multiple choices start with the same logit.

It should be possible to implement this by combining the existing beam search and grammar sampling features. See the discussion in the referenced comment for more info.
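
To make the idea concrete, here is a minimal Python sketch of the difference between greedy constrained decoding and beam search over the same constraint. It does not use the llama.cpp API. Everything in it is an assumption made for illustration: the two-token split of each armor choice, the `TOY_MODEL` probability table standing in for real logits, and the helper names `allowed_tokens`, `greedy`, and `beam_search`. The "grammar" is simply the set of allowed choices, and no choice is a prefix of another.

```python
import math

# Hypothetical tokenization of the three allowed choices (illustration only).
CHOICES = {
    "leather": ("le", "ather"),
    "chainmail": ("chain", "mail"),
    "plate": ("pl", "ate"),
}

# Toy stand-in for a language model: prefix -> {token: probability}.
# The numbers are invented so that the best first token is not the best choice.
TOY_MODEL = {
    (): {"pl": 0.5, "le": 0.4, "chain": 0.1},
    ("pl",): {"ate": 0.3, "astic": 0.7},
    ("le",): {"ather": 0.9, "mon": 0.1},
    ("chain",): {"mail": 0.8, "saw": 0.2},
}


def log_probs(prefix):
    """Log-probabilities of the next token given the tokens generated so far."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[prefix].items()}


def allowed_tokens(prefix):
    """Grammar stand-in: next tokens that keep `prefix` a prefix of some choice."""
    n = len(prefix)
    return {seq[n] for seq in CHOICES.values() if len(seq) > n and seq[:n] == prefix}


def is_complete(prefix):
    return prefix in CHOICES.values()


def greedy():
    """Pick the single most likely allowed token at every step."""
    prefix, score = (), 0.0
    while not is_complete(prefix):
        lp = log_probs(prefix)
        tok = max(allowed_tokens(prefix), key=lambda t: lp.get(t, -math.inf))
        score += lp[tok]
        prefix += (tok,)
    return prefix, score


def beam_search(beam_width=3):
    """Keep the `beam_width` best allowed prefixes, scored by summed log-probs."""
    beams, finished = [((), 0.0)], []
    while beams:
        candidates = []
        for prefix, score in beams:
            lp = log_probs(prefix)
            for tok in allowed_tokens(prefix):
                candidates.append((prefix + (tok,), score + lp.get(tok, -math.inf)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for cand in candidates[:beam_width]:
            (finished if is_complete(cand[0]) else beams).append(cand)
    return max(finished, key=lambda c: c[1])


if __name__ == "__main__":
    for name, (tokens, score) in (("greedy", greedy()), ("beam", beam_search())):
        print(name, "".join(tokens), round(math.exp(score), 2))
```

With these made-up numbers, greedy decoding commits to `pl` because it is the most likely first token and is then forced into `plate` (sequence probability 0.5 × 0.3 = 0.15), while the beam keeps all three prefixes alive and returns `leather` (0.4 × 0.9 = 0.36). A beam width of 1 reduces to the greedy result.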

llama : combined beam search + grammar sampling strategy #306
