Bug: [server] Text completions keep going and going. #859

@Ph0rk0z

What happened?

I run the server with the --jinja flag and set a preset and custom stopping strings. The preset is not the same chat preset as the one in the model metadata. My client (SillyTavern) correctly stops streaming, but the server keeps generating in the background, so I have to manually send a stop command to the server.

I also tried disabling streaming, and the behavior is the same: generation continues until the max output token limit is reached, ignoring all custom stopping strings.
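For reference, here is a minimal sketch of the behavior I would expect from custom stopping strings: the output should be cut at the earliest match and generation should end there. The helper names `build_completion_payload` and `truncate_at_stop` are hypothetical. The request fields `prompt`, `stop`, `n_predict`, and `stream` are taken from the upstream llama.cpp server's completion API, and it is an assumption that this server accepts the same fields. This only illustrates the expected result and is not the server's actual implementation.

```python
import json


def build_completion_payload(prompt, stop, n_predict=2048, stream=False):
    """Build a JSON body for a text-completion request.

    Field names are assumed to match the upstream llama.cpp server API.
    """
    return {
        "prompt": prompt,
        "stop": list(stop),      # custom stopping strings
        "n_predict": n_predict,  # max output tokens
        "stream": stream,
    }


def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop string.

    Returns (truncated_text, stopped), where stopped says whether a
    stop string was found.
    """
    hits = [i for i in (text.find(s) for s in stop if s) if i != -1]
    if not hits:
        return text, False
    return text[:min(hits)], True


if __name__ == "__main__":
    payload = build_completion_payload(
        "User: hi\nAssistant:", ["\nUser:", "<im-end>"]
    )
    print(json.dumps(payload, indent=2))

    # Expected: generation ends at the first stop string, not at n_predict.
    reply, stopped = truncate_at_stop(
        "Hello there!<im-end>\nUser: more text", payload["stop"]
    )
    print(repr(reply), stopped)
```

With a request like this, I would expect the server to stop decoding once a stop string appears, rather than running until `n_predict` is exhausted.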

Name and Version

latest GIT

What operating system are you seeing the problem on?

Linux

Relevant log output

INFO [           print_timings] generation eval time =  175058.09 ms /  2048 runs   (   85.48 ms per token,    11.70 tokens per second) | tid="139933603020800" timestamp=1761227346 id_slot=0 id_task=12620 t_token_generation=175058.094 n_decoded=2048 t_token=85.4775849609375 n_tokens_second=11.698973484767862
INFO [           print_timings]           total time =  178251.15 ms | tid="139933603020800" timestamp=1761227346 id_slot=0 id_task=12620 t_prompt_processing=3193.055 t_token_generation=175058.094 t_total=178251.149


The actual reply was only 650 tokens long before it emitted <im-end>, yet the log above shows generation running to 2048 tokens.
