presence_penalty: The penalty to apply to tokens based on their presence in the prompt.
repeat_penalty: The penalty to apply to repeated tokens.
top_k: The top-k value to use for sampling. Top-K sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751
+top_n_sigma: Limit the next token selection to a subset of tokens with pre-softmax logits that are within n * σ of the max logit (default: -1.00, -1.00 = disabled).
stream: Whether to stream the results.
seed: The seed to use for sampling.
tfs_z: The tail-free sampling parameter. Tail Free Sampling described in https://www.trentonbricken.com/Tail-Free-Sampling/.
mirostat_mode: The mirostat sampling mode.
mirostat_tau: The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
mirostat_eta: The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.
-xtc-probability: Sets the chance for token removal (checked once on sampler start) (default: 0.0).
-xtc-threshold: Sets a minimum probability threshold for tokens to be removed (default: 0.1).
+xtc-probability: Sets the chance for token removal (checked once on sampler start) (default: 0.0). XTC sampler as described in https://github.com/oobabooga/text-generation-webui/pull/6335
+xtc-threshold: Sets a minimum probability threshold for tokens to be removed (default: 0.1). XTC sampler as described in https://github.com/oobabooga/text-generation-webui/pull/6335
model: The name to use for the model in the completion object.
stopping_criteria: A list of stopping criteria to use.
logits_processor: A list of logits processors to use.
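For context on the xtc-probability / xtc-threshold docstrings above, here is a rough sketch of what the XTC ("Exclude Top Choices") filter does, following the description in the linked text-generation-webui pull request. The numpy helper and its names are illustrative only, not the library's actual sampler code.

```python
import numpy as np

def xtc_filter(probs: np.ndarray, xtc_probability: float, xtc_threshold: float,
               rng: np.random.Generator) -> np.ndarray:
    """With chance xtc_probability, remove every token whose probability is at or
    above xtc_threshold except the least likely of them, then renormalize."""
    if rng.random() >= xtc_probability:
        return probs                       # filter not triggered this step
    above = np.flatnonzero(probs >= xtc_threshold)
    if above.size < 2:
        return probs                       # need at least two candidates above the threshold
    keep = above[np.argmin(probs[above])]  # least probable token that still clears the threshold
    filtered = probs.copy()
    filtered[above] = 0.0                  # drop the "top choices"
    filtered[keep] = probs[keep]           # ...but keep the weakest of them
    return filtered / filtered.sum()

rng = np.random.default_rng(0)
print(xtc_filter(np.array([0.50, 0.30, 0.15, 0.05]),
                 xtc_probability=1.0, xtc_threshold=0.1, rng=rng))
# -> [0.   0.   0.75 0.25]: 0.50 and 0.30 are removed, 0.15 survives as the
#    least likely above-threshold token, and the rest is renormalized.
```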
@@ -1838,6 +1849,7 @@ def create_completion(
presence_penalty=presence_penalty,
repeat_penalty=repeat_penalty,
top_k=top_k,
+top_n_sigma=top_n_sigma,
stream=stream,
seed=seed,
tfs_z=tfs_z,
@@ -1874,6 +1886,7 @@ def __call__(
presence_penalty: float = 0.0,
repeat_penalty: float = 1.0,
top_k: int = 40,
+top_n_sigma: float = -1.00,
stream: bool = False,
seed: Optional[int] = None,
tfs_z: float = 1.0,
@@ -1905,6 +1918,7 @@ def __call__(
presence_penalty: The penalty to apply to tokens based on their presence in the prompt.
repeat_penalty: The penalty to apply to repeated tokens.
top_k: The top-k value to use for sampling. Top-K sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751
+top_n_sigma: Limit the next token selection to a subset of tokens with pre-softmax logits that are within n * σ of the max logit (default: -1.00, -1.00 = disabled).
stream: Whether to stream the results.
seed: The seed to use for sampling.
tfs_z: The tail-free sampling parameter. Tail Free Sampling described in https://www.trentonbricken.com/Tail-Free-Sampling/.
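A minimal sketch of the top_n_sigma filter documented above: keep only tokens whose pre-softmax logit lies within n * σ of the maximum logit, where σ is the standard deviation of the logits, and treat a negative value as "disabled". This is an illustration under that reading, not llama.cpp's implementation.

```python
import numpy as np

def top_n_sigma_filter(logits: np.ndarray, top_n_sigma: float) -> np.ndarray:
    if top_n_sigma < 0:                    # default -1.00 means "disabled"
        return logits
    cutoff = logits.max() - top_n_sigma * logits.std()
    # Mask out every token whose logit falls more than n * sigma below the max.
    return np.where(logits >= cutoff, logits, -np.inf)

logits = np.array([8.0, 7.5, 4.0, 1.0, -2.0])
print(top_n_sigma_filter(logits, top_n_sigma=1.0))
# sigma ≈ 3.8, cutoff ≈ 4.2, so only the 8.0 and 7.5 logits survive;
# the rest become -inf and get zero probability after the softmax.
```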
@@ -1941,6 +1955,7 @@ def __call__(
presence_penalty=presence_penalty,
repeat_penalty=repeat_penalty,
top_k=top_k,
+top_n_sigma=top_n_sigma,
stream=stream,
seed=seed,
tfs_z=tfs_z,
@@ -1966,6 +1981,7 @@ def create_chat_completion(
temperature: float = 0.2,
top_p: float = 0.95,
top_k: int = 40,
+top_n_sigma: float = -1.00,
min_p: float = 0.05,
typical_p: float = 1.0,
stream: bool = False,
@@ -2002,6 +2018,7 @@ def create_chat_completion(
temperature: The temperature to use for sampling.
top_p: The top-p value to use for nucleus sampling. Nucleus sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751
top_k: The top-k value to use for sampling. Top-K sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751
+top_n_sigma: Limit the next token selection to a subset of tokens with pre-softmax logits that are within n * σ of the max logit (default: -1.00, -1.00 = disabled).
min_p: The min-p value to use for minimum p sampling. Minimum P sampling as described in https://github.com/ggml-org/llama.cpp/pull/3841
typical_p: The typical-p value to use for sampling. Locally Typical Sampling implementation described in the paper https://arxiv.org/abs/2202.00666.
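Assuming the rest of the existing create_chat_completion interface is unchanged, a call using the new parameter might look like the sketch below; the model path and message content are placeholders, not part of this change.

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/example.gguf")   # placeholder path

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write one sentence about sampling."}],
    top_k=40,
    top_n_sigma=1.5,    # new in this change; -1.00 (the default) disables the filter
    temperature=0.8,
)
print(out["choices"][0]["message"]["content"])
```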