
Commit f8f15c4

Update Mirostat sampler function parameters
corrected parameters description in docs
1 parent 7057faf commit f8f15c4

File tree: 1 file changed, 3 additions (+), 3 deletions (−)


include/llama.h

Lines changed: 3 additions & 3 deletions
@@ -1186,11 +1186,11 @@ extern "C" {
     LLAMA_API struct llama_sampler * llama_sampler_init_top_n_sigma(float n);

     /// @details Mirostat 1.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.
-    /// @param candidates A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
+    /// @param n_vocab The number of tokens in the vocabulary
+    /// @param seed The sampler seed
     /// @param tau The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
     /// @param eta The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.
     /// @param m The number of tokens considered in the estimation of `s_hat`. This is an arbitrary value that is used to calculate `s_hat`, which in turn helps to calculate the value of `k`. In the paper, they use `m = 100`, but you can experiment with different values to see how it affects the performance of the algorithm.
-    /// @param mu Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.
     LLAMA_API struct llama_sampler * llama_sampler_init_mirostat(
             int32_t n_vocab,
            uint32_t seed,
@@ -1199,10 +1199,10 @@ extern "C" {
             int32_t m);

     /// @details Mirostat 2.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.
+    /// @param seed The sampler seed
     /// @param candidates A vector of `llama_token_data` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
     /// @param tau The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
     /// @param eta The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.
-    /// @param mu Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.
     LLAMA_API struct llama_sampler * llama_sampler_init_mirostat_v2(
            uint32_t seed,
               float tau,
