You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: spring-ai-docs/src/main/antora/modules/ROOT/pages/api/chat/ollama-chat.adoc
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -119,6 +119,7 @@ The remaining `options` properties are based on the link:https://github.com/olla
119
119
| spring.ai.ollama.chat.options.num-predict | Maximum number of tokens to predict when generating text. (-1 = infinite generation, -2 = fill context) | -1
120
120
| spring.ai.ollama.chat.options.top-k | Reduces the probability of generating nonsense. A higher value (e.g., 100) will give more diverse answers, while a lower value (e.g., 10) will be more conservative. | 40
121
121
| spring.ai.ollama.chat.options.top-p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. | 0.9
122
+
| spring.ai.ollama.chat.options.min-p | Alternative to the top_p, and aims to ensure a balance of quality and variety. The parameter p represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with p=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. | 0.0
122
123
| spring.ai.ollama.chat.options.tfs-z | Tail-free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. | 1.0
| spring.ai.ollama.chat.options.repeat-last-n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | 64
Copy file name to clipboardExpand all lines: spring-ai-docs/src/main/antora/modules/ROOT/pages/api/embeddings/ollama-embeddings.adoc
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -124,6 +124,7 @@ The remaining `options` properties are based on the link:https://github.com/olla
124
124
| spring.ai.ollama.embedding.options.num-predict | Maximum number of tokens to predict when generating text. (-1 = infinite generation, -2 = fill context) | -1
125
125
| spring.ai.ollama.embedding.options.top-k | Reduces the probability of generating nonsense. A higher value (e.g., 100) will give more diverse answers, while a lower value (e.g., 10) will be more conservative. | 40
126
126
| spring.ai.ollama.embedding.options.top-p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. | 0.9
127
+
| spring.ai.ollama.embedding.options.min-p | Alternative to the top_p, and aims to ensure a balance of quality and variety. The parameter p represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with p=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. | 0.0
127
128
| spring.ai.ollama.embedding.options.tfs-z | Tail-free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. | 1.0
| spring.ai.ollama.embedding.options.repeat-last-n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | 64
0 commit comments