Generate a list of the most likely answers #2788
Replies: 3 comments 1 reply
-
#2489 got merged a couple of hours ago, and with it you can see token probabilities. If you want a list of "complete" possible answers, like full sentences, that list would grow exponentially with each token and approach infinity :) You could do some filtering and exclude tokens with very low probability to reduce the number of possible "branches", but that would conflict with your requirement of completeness. Imagine there are, say, 100 words which could be the first word of the output, and assume there are another 100 possible second words, and so on. With just 3 words, your list of possible answers would be 100\*100\*100 = 1 million possibilities long. If you only consider the 10 most probable candidate words, you are again up to a million after 6 words. If you only consider 3 candidates, you are over a million after 13 words.
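The branching arithmetic above can be checked directly; the candidate count is just (branching factor) raised to the (output length):

```python
def num_candidates(branching_factor: int, length: int) -> int:
    """Number of distinct outputs when each of `length` positions
    has `branching_factor` possible tokens."""
    return branching_factor ** length

print(num_candidates(100, 3))  # 100 choices per word, 3 words: 1,000,000
print(num_candidates(10, 6))   # 10 candidates per word, 6 words: 1,000,000
print(num_candidates(3, 13))   # 3 candidates, 13 words: already over a million
```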
-
I do not want to consider the 10 most probable tokens, but the 10 most probable overall outputs.
-
So apparently @mattpulver implemented what I was thinking about in c82742a
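The technique referenced there is beam search: instead of tracking every possible continuation, only the k most probable partial sequences are kept at each step, which bounds the otherwise exponential growth. A minimal, self-contained sketch of the idea follows; it is not the llama.cpp implementation, and `next_token_logprobs`/`toy_model` are hypothetical stand-ins for a real model:

```python
import math

def beam_search(next_token_logprobs, beam_width, max_len, eos="<eos>"):
    """Keep only the `beam_width` most probable partial sequences per step,
    ranked by cumulative log-probability, so the candidate set stays bounded."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, lp in beams:
            for tok, tok_lp in next_token_logprobs(seq).items():
                if tok == eos:
                    finished.append((seq, lp + tok_lp))
                else:
                    candidates.append((seq + (tok,), lp + tok_lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if not beams:
            break
    finished.extend(beams)  # include still-unfinished beams as partial answers
    return sorted(finished, key=lambda c: c[1], reverse=True)

# Hypothetical toy "model": fixed, context-free next-token distributions.
def toy_model(seq):
    if not seq:
        return {"a": math.log(0.6), "b": math.log(0.4)}
    return {"<eos>": math.log(0.9), "a": math.log(0.1)}

best = beam_search(toy_model, beam_width=2, max_len=3)
```

Each returned entry carries its cumulative log-probability, so the list comes out already ranked from most to least likely.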
-
There are already temperature, top-k and top-p sampling.
However in some use cases it would be really helpful to generate the most likely responses.
The setup is that there is a fixed input, and I want a list of the most likely completions (until an EOS token, a reverse prompt, or a fixed length).
I thought about sampling the response by running ./main a few hundred times and comparing the responses.
However, this requires processing the input multiple times (which can be mostly avoided by using the prompt cache), and it might also generate the same answer multiple times (inefficient) or miss a very likely answer (incomplete).
I think the ideal solution would be to generate a list of the most likely answers with their cumulative probability.
The first entry would typically be the one generated with temperature set to 0.
Is anyone aware of any solution to this in llama.cpp or an efficient program that is able to achieve this?
This is related to #184 and https://huggingface.co/blog/how-to-generate
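To make the idea of ranking by cumulative probability concrete, here is a brute-force toy example; the vocabulary and the fixed `probs` distribution are made up, and a real model's next-token distribution depends on the context:

```python
from itertools import product

# Hypothetical context-free next-token distribution (a toy, not a real model).
probs = {"yes": 0.5, "no": 0.3, "maybe": 0.2}

def ranked_completions(length: int, top_n: int):
    """Enumerate every length-`length` output and rank it by the product
    of its per-token probabilities (the cumulative probability)."""
    seqs = []
    for toks in product(probs, repeat=length):
        p = 1.0
        for t in toks:
            p *= probs[t]
        seqs.append((toks, p))
    seqs.sort(key=lambda s: s[1], reverse=True)
    return seqs[:top_n]

top = ranked_completions(2, 3)
```

In this context-free toy the first entry coincides with greedy (temperature-0) decoding; with a context-dependent model the greedy output is usually, but not always, the globally most probable sequence. Exhaustive enumeration like this only works for tiny vocabularies and short outputs, which is why pruned approaches such as beam search exist.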