
Commit c2c1949

can set start-string multiple times, doc
1 parent 8c319eb commit c2c1949

File tree

2 files changed: +3 −3 lines changed


common/arg.cpp

Lines changed: 2 additions & 3 deletions
@@ -2847,10 +2847,9 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
     ).set_examples({LLAMA_EXAMPLE_SERVER, LLAMA_EXAMPLE_MAIN}).set_env("LLAMA_ARG_THINK"));
     add_opt(common_arg(
         {"--start-string"}, "STRING",
-        "Start outputting tokens only when the start string has been reached",
+        "Start outputting tokens only when at least one start string has been reached. Can be set multiple times.",
         [](common_params & params, const std::string & value) {
-            params.start_strings.resize(1);
-            params.start_strings[0] = value;
+            params.start_strings.push_back(value);
         }
     ).set_examples({LLAMA_EXAMPLE_SERVER}).set_env("LLAMA_ARG_START_STRING"));
     add_opt(common_arg(

examples/server/README.md

Lines changed: 1 addition & 0 deletions
@@ -160,6 +160,7 @@ The project is under active development, and we are [looking for feedback and co
 | `--props` | enable changing global properties via POST /props (default: disabled)<br/>(env: LLAMA_ARG_ENDPOINT_PROPS) |
 | `--no-slots` | disables slots monitoring endpoint<br/>(env: LLAMA_ARG_NO_ENDPOINT_SLOTS) |
 | `--slot-save-path PATH` | path to save slot kv cache (default: disabled) |
+| `--start-string STRING` | The response is not sent to client until one start string is reached. Can be set multiple times |
 | `--chat-template JINJA_TEMPLATE` | set custom jinja chat template (default: template taken from model's metadata)<br/>if suffix/prefix are specified, template will be disabled<br/>list of built-in templates:<br/>chatglm3, chatglm4, chatml, command-r, deepseek, deepseek2, exaone3, gemma, granite, llama2, llama2-sys, llama2-sys-bos, llama2-sys-strip, llama3, minicpm, mistral-v1, mistral-v3, mistral-v3-tekken, mistral-v7, monarch, openchat, orion, phi3, rwkv-world, vicuna, vicuna-orca, zephyr<br/>(env: LLAMA_ARG_CHAT_TEMPLATE) |
 | `-sps, --slot-prompt-similarity SIMILARITY` | how much the prompt of a request must match the prompt of a slot in order to use that slot (default: 0.50, 0.0 = disabled)<br/> |
 | `--lora-init-without-apply` | load LoRA adapters without applying them (apply later via POST /lora-adapters) (default: disabled) |
