Closed
Description
What happened?
I run the server with the --jinja flag and set a chat preset and custom stopping strings. The preset is not the same as the chat template in the model metadata. My client (SillyTavern) correctly stops streaming, but the server keeps generating in the background, and I have to manually send a stop command to the server.
I also tried disabling streaming and the behavior is the same: generation runs until the max output tokens limit is reached, ignoring all custom stopping strings.
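To make the expected behavior concrete, here is a minimal Python sketch of what I would expect to happen to the output once any stopping string appears. This is an illustration only, not the server's actual code: `truncate_at_stop` is a hypothetical helper, and the example stop strings are assumptions.

```python
from typing import Iterable, Optional, Tuple


def truncate_at_stop(text: str, stops: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Cut `text` at the earliest occurrence of any stop string.

    Hypothetical helper for illustration. Returns the truncated text and
    the stop string that matched, or None if nothing matched.
    """
    best_idx, best_stop = len(text), None
    for stop in stops:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1 and idx < best_idx:
            best_idx, best_stop = idx, stop
    return text[:best_idx], best_stop


# Example stop strings (assumed values, as a client might send them).
reply, hit = truncate_at_stop(
    "Hello there<im-end>\nUser: more text", ["<im-end>", "\nUser:"]
)
print(repr(reply), repr(hit))
```

A server that honors the stopping strings would end generation at the point where `hit` becomes non-None, instead of decoding until the token limit.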
Name and Version
latest GIT
What operating system are you seeing the problem on?
Linux
Relevant log output
INFO [ print_timings] generation eval time = 175058.09 ms / 2048 runs ( 85.48 ms per token, 11.70 tokens per second) | tid="139933603020800" timestamp=1761227346 id_slot=0 id_task=12620 t_token_generation=175058.094 n_decoded=2048 t_token=85.4775849609375 n_tokens_second=11.698973484767862
INFO [ print_timings] total time = 178251.15 ms | tid="139933603020800" timestamp=1761227346 id_slot=0 id_task=12620 t_prompt_processing=3193.055 t_token_generation=175058.094 t_total=178251.149
The actual reply was only 650 tokens before it emitted <im-end>, yet the log shows the server decoded the full 2048 tokens.