Generate a list of the most likely answers #2788
Replies: 3 comments 1 reply
-
#2489 got merged a couple of hours ago, and with it you can see token probabilities. If you want a list of "complete" possible answers, like full sentences, that list would grow exponentially with each token and approach infinity :) You could do some filtering and exclude tokens with very low probability to reduce the number of possible "branches", but that would conflict with your requirement of completeness. Imagine there are, say, 100 words which could be the first word of the output, and assume there are another 100 possible second words, and so on. With just 3 words, your list of possible answers would be 100\*100\*100 = 1 million possibilities long. If you only consider the 10 most probable candidate words, you are again up to a million after 6 words. If you only consider 3 candidates, you are over a million after 13 words.
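The branching arithmetic above can be checked directly; the candidate count is just (branching factor) raised to the (output length):

```python
def num_candidates(branching_factor: int, length: int) -> int:
    """Number of distinct outputs when each of `length` positions
    has `branching_factor` possible tokens."""
    return branching_factor ** length

print(num_candidates(100, 3))  # 100 choices per word, 3 words: 1,000,000
print(num_candidates(10, 6))   # 10 candidates per word, 6 words: 1,000,000
print(num_candidates(3, 13))   # 3 candidates, 13 words: already over a million
```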
-
I do not want to consider the 10 most probable tokens, but the 10 most probable overall outputs.
-
So apparently @mattpulver implemented what I was thinking about in c82742a
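The technique referenced there is beam search: instead of tracking every possible continuation, only the k most probable partial sequences are kept at each step, which bounds the otherwise exponential growth. A minimal, self-contained sketch of the idea follows; it is not the llama.cpp implementation, and `next_token_logprobs`/`toy_model` are hypothetical stand-ins for a real model:

```python
import math

def beam_search(next_token_logprobs, beam_width, max_len, eos="<eos>"):
    """Keep only the `beam_width` most probable partial sequences per step,
    ranked by cumulative log-probability, so the candidate set stays bounded."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, lp in beams:
            for tok, tok_lp in next_token_logprobs(seq).items():
                if tok == eos:
                    finished.append((seq, lp + tok_lp))
                else:
                    candidates.append((seq + (tok,), lp + tok_lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if not beams:
            break
    finished.extend(beams)  # include still-unfinished beams as partial answers
    return sorted(finished, key=lambda c: c[1], reverse=True)

# Hypothetical toy "model": fixed, context-free next-token distributions.
def toy_model(seq):
    if not seq:
        return {"a": math.log(0.6), "b": math.log(0.4)}
    return {"<eos>": math.log(0.9), "a": math.log(0.1)}

best = beam_search(toy_model, beam_width=2, max_len=3)
```

Each returned entry carries its cumulative log-probability, so the list comes out already ranked from most to least likely.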
-
There are already temperature, top-k and top-p sampling.
However in some use cases it would be really helpful to generate the most likely responses.
The setup is that there is a fixed input, and I want a list of the most likely completions (until an EOS token, a reverse prompt, or a fixed length).
I thought about sampling the response by running ./main a few hundred times and comparing the responses.
However, this requires processing the input multiple times (which can be mostly avoided by using the prompt cache), and it might also generate the same answer multiple times (inefficient) or miss a very likely answer (incomplete).
I think the ideal solution would be to generate a list of the most likely answers with their cumulative probability.
The first entry would typically be the one generated with temperature set to 0.
Is anyone aware of any solution to this in llama.cpp or an efficient program that is able to achieve this?
This is related to #184 and https://huggingface.co/blog/how-to-generate
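To make the idea of ranking by cumulative probability concrete, here is a brute-force toy example; the vocabulary and the fixed `probs` distribution are made up, and a real model's next-token distribution depends on the context:

```python
from itertools import product

# Hypothetical context-free next-token distribution (a toy, not a real model).
probs = {"yes": 0.5, "no": 0.3, "maybe": 0.2}

def ranked_completions(length: int, top_n: int):
    """Enumerate every length-`length` output and rank it by the product
    of its per-token probabilities (the cumulative probability)."""
    seqs = []
    for toks in product(probs, repeat=length):
        p = 1.0
        for t in toks:
            p *= probs[t]
        seqs.append((toks, p))
    seqs.sort(key=lambda s: s[1], reverse=True)
    return seqs[:top_n]

top = ranked_completions(2, 3)
```

In this context-free toy the first entry coincides with greedy (temperature-0) decoding; with a context-dependent model the greedy output is usually, but not always, the globally most probable sequence. Exhaustive enumeration like this only works for tiny vocabularies and short outputs, which is why pruned approaches such as beam search exist.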