Commit 31490d2

Update how-to-deploy-models-mistral.md
1 parent e412013 · commit 31490d2


articles/machine-learning/how-to-deploy-models-mistral.md

Lines changed: 1 addition & 1 deletion
@@ -132,7 +132,7 @@ Payload is a JSON formatted string containing the following parameters:
 |-----|-----|-----|-----|
 | `messages` | `string` | No default. This value must be specified. | The message or history of messages to use to prompt the model. |
 | `stream` | `boolean` | `False` | Streaming allows the generated tokens to be sent as data-only server-sent events whenever they become available. |
-| `max_tokens` | `integer` | `1024` | The maximum number of tokens to generate in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length. |
+| `max_tokens` | `integer` | `8192` | The maximum number of tokens to generate in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length. |
 | `top_p` | `float` | `1` | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering `top_p` or `temperature`, but not both. |
 | `temperature` | `float` | `1` | The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly the distribution of tokens. Zero means greedy sampling. We recommend altering this or `top_p`, but not both. |
 | `ignore_eos` | `boolean` | `False` | Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. |
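
For reference, the table being edited documents the JSON request body sent to a deployed Mistral endpoint. The following is a minimal Python sketch of such a request; the endpoint URL, API key, and the chat-completions message shape (a list of role/content objects, rather than the bare `string` the table lists) are assumptions, so substitute the scoring URI and key from your own deployment:

```python
import json
import urllib.request

# Hypothetical values: take the real scoring URI and key from your deployment.
ENDPOINT_URL = "https://<your-endpoint>.<region>.inference.ml.azure.com/v1/chat/completions"
API_KEY = "<your-api-key>"

# Fields mirror the parameter table above; `messages` is required,
# omitted fields fall back to the defaults shown in the table.
payload = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 8192,   # default after this commit; prompt tokens + max_tokens must fit the context length
    "temperature": 1,     # adjust this or top_p, not both
    "stream": False,
}

request = urllib.request.Request(
    ENDPOINT_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))
```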
