|`-sp, --special`| special tokens output enabled (default: false) |
|`--spm-infill`| use Suffix/Prefix/Middle pattern for infill (instead of Prefix/Suffix/Middle) as some models prefer this. (default: disabled) |
|`--samplers SAMPLERS`| samplers that will be used for generation in the order, separated by ';'<br/>(default: top_k;tfs_z;typ_p;top_p;min_p;temperature) |
-|`-s, --seed SEED`| RNG seed (default: -1, use random seed for < 0) |
+|`-s, --seed SEED`| RNG seed (default: 4294967295, use random seed for 4294967295) |
|`--sampling-seq SEQUENCE`| simplified sequence for samplers that will be used (default: kfypmt) |
|`--ignore-eos`| ignore end of stream token and continue generating (implies --logit-bias EOS-inf) |
|`--mlock`| force system to keep model in RAM rather than swapping or compressing |
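The new `--seed` default, 4294967295, is the largest 32-bit unsigned value, which is also what -1 becomes when converted to `uint32_t`. A minimal sketch of how such a sentinel could be resolved follows. The names `DEFAULT_SEED` and `resolve_seed` are hypothetical and not taken from the project, and the assumption that the seed is stored as `uint32_t` is inferred from the documented default only.

```cpp
#include <cstdint>
#include <random>

// Hypothetical sentinel (assumption): the documented default, 4294967295.
constexpr uint32_t DEFAULT_SEED = 0xFFFFFFFFu;

// -1 converted to a 32-bit unsigned integer wraps to the same value.
static_assert(static_cast<uint32_t>(-1) == DEFAULT_SEED, "-1 wraps to the sentinel");

// Return the seed to use: the sentinel requests a random seed,
// any other value is used verbatim.
inline uint32_t resolve_seed(uint32_t requested) {
    if (requested == DEFAULT_SEED) {
        std::random_device rd;
        return static_cast<uint32_t>(rd());
    }
    return requested;
}
```

Under this sketch, `resolve_seed(42)` returns 42, while `resolve_seed(DEFAULT_SEED)` returns a value drawn from `std::random_device`.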
@@ -128,12 +129,13 @@ The project is under active development, and we are [looking for feedback and co
|`-sps, --slot-prompt-similarity SIMILARITY`| how much the prompt of a request must match the prompt of a slot in order to use that slot (default: 0.50, 0.0 = disabled)<br/> |
|`--lora-init-without-apply`| load LoRA adapters without applying them (apply later via POST /lora-adapters) (default: disabled) |
|`-ld, --logdir LOGDIR`| path under which to save YAML logs (no logging if unset) |
|`-v, --verbose, --log-verbose`| Set verbosity level to infinity (i.e. log all messages, useful for debugging) |
+|`-lv, --verbosity, --log-verbosity N`| Set the verbosity threshold. Messages with a higher verbosity will be ignored.<br/>(env: LLAMA_LOG_VERBOSITY) |
+|`--log-prefix`| Enable prefix in log messages<br/>(env: LLAMA_LOG_PREFIX) |
+|`--log-timestamps`| Enable timestamps in log messages<br/>(env: LLAMA_LOG_TIMESTAMPS) |
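The `--log-verbosity` row describes a threshold: messages whose verbosity is higher than N are ignored. A minimal sketch of that rule, assuming each message carries an integer verbosity level, is shown here. `LogMessage` and `filter_by_verbosity` are hypothetical names used only for illustration and are not part of the project.

```cpp
#include <string>
#include <vector>

// Hypothetical message record (assumption): a verbosity level plus the text.
struct LogMessage {
    int verbosity;
    std::string text;
};

// Keep only the messages whose verbosity does not exceed the threshold;
// anything with a higher verbosity is dropped.
inline std::vector<std::string> filter_by_verbosity(const std::vector<LogMessage>& messages,
                                                    int threshold) {
    std::vector<std::string> kept;
    for (const auto& m : messages) {
        if (m.verbosity <= threshold) {
            kept.push_back(m.text);
        }
    }
    return kept;
}
```

With a threshold of 1, messages at levels 0 and 1 are kept and a message at level 5 is dropped.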
Note: If both a command line argument and an environment variable are set for the same param, the argument takes precedence over the env var.
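The precedence rule in the note can be sketched as a small lookup: use the command line value if one was given, otherwise fall back to the environment variable, otherwise to a default. `resolve_int_param` is a hypothetical helper, not the project's parser, and passing the command line value as a nullable pointer is an assumption made to keep the sketch short.

```cpp
#include <cstdlib>

// Hypothetical helper (assumption): resolve an integer parameter.
// cli_value is nullptr when the flag was not passed on the command line.
inline int resolve_int_param(const int* cli_value, const char* env_name, int fallback) {
    // 1. A command line argument always wins.
    if (cli_value != nullptr) {
        return *cli_value;
    }
    // 2. Otherwise use the environment variable if it holds a valid integer.
    if (const char* env = std::getenv(env_name)) {
        char* end = nullptr;
        long parsed = std::strtol(env, &end, 10);
        if (end != env && *end == '\0') {
            return static_cast<int>(parsed);
        }
    }
    // 3. Neither source is set (or the env var is malformed): use the default.
    return fallback;
}
```

For example, with `LLAMA_LOG_VERBOSITY=3` in the environment, the helper returns 3 when the flag is absent and returns the flag's value when it is present.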
// if context shift is disabled, we make sure prompt size is smaller than KV size
+if ((int) system_tokens.size() + slot.n_prompt_tokens >= slot.n_ctx) {
+    slot.release();
+    send_error(slot, "the request exceeds the available context size. try increasing the context size or enable context shift", ERROR_TYPE_INVALID_REQUEST);
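The condition in this hunk can be isolated as a standalone predicate for illustration. The sketch below takes plain integers in place of the `slot` and `system_tokens` objects, and `prompt_fits_context` is a hypothetical name. Because the original comparison is `>=`, a request whose system plus prompt tokens exactly fill the context is rejected as well.

```cpp
#include <cstddef>

// Hypothetical helper (assumption): true when the system tokens plus the
// prompt tokens are strictly smaller than the context size, i.e. the
// negation of the rejection condition in the hunk.
inline bool prompt_fits_context(std::size_t n_system_tokens, int n_prompt_tokens, int n_ctx) {
    return static_cast<int>(n_system_tokens) + n_prompt_tokens < n_ctx;
}
```

A request with 10 system tokens and 100 prompt tokens fits in a 512-token context, while 12 system tokens plus 500 prompt tokens does not.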