Regarding issue with output format for Vicuna (lmsys/vicuna-7b-v1.5) using fastchat/serve/huggingface_api.py #3795

Hi!

I ran inference on `lmsys/vicuna-7b-v1.5` with the [huggingface_api](https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/huggingface_api.py) script, using the following command:

`python3 -m fastchat.serve.huggingface_api --model lmsys/vicuna-7b-v1.5`

The **ASSISTANT** output contains the special character "▁" and extra spaces between tokens:

> USER: Hello! Who are you?
> **ASSISTANT**: ▁I ' m ▁a ▁language ▁model ▁called ▁Vic una , ▁and ▁I ▁was ▁trained ▁by ▁Lar ge ▁Model ▁Systems ▁Organ ization ▁( L MS YS ) ▁research ers .

However, I was expecting the output to be clean:

> USER: Hello! Who are you?
> **ASSISTANT**: I'm a language model called Vicuna, and I was trained by Large Model Systems Organization (LMSYS) researchers.
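
To make the expected transformation concrete, here is a minimal post-processing sketch. It assumes the raw string consists of SentencePiece pieces separated by single spaces, with "▁" marking the start of each word, which is how the output above looks. The helper name `clean_sentencepiece_text` is hypothetical and not part of FastChat or `transformers`; this only works around the symptom and does not address whatever causes the tokenizer to decode this way.

```python
# Workaround sketch (hypothetical helper, not a FastChat/transformers API).
# Assumption: pieces are separated by single spaces and "▁" (U+2581)
# marks a word boundary, as in the output shown above.
SPIECE_UNDERLINE = "\u2581"


def clean_sentencepiece_text(text: str) -> str:
    """Join space-separated SentencePiece pieces back into plain text."""
    # Drop the separator spaces that were inserted between pieces.
    joined = "".join(text.split(" "))
    # Turn each word-boundary marker back into a real space.
    return joined.replace(SPIECE_UNDERLINE, " ").strip()


raw = (
    "▁I ' m ▁a ▁language ▁model ▁called ▁Vic una , ▁and ▁I ▁was "
    "▁trained ▁by ▁Lar ge ▁Model ▁Systems ▁Organ ization "
    "▁( L MS YS ) ▁research ers ."
)
print(clean_sentencepiece_text(raw))
```

Note that this would also remove any space that was genuinely part of the text rather than a piece separator, so it is only a stopgap.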

I need clean output because I am performing multi-turn generation (i.e., passing the assistant's first response back as context for generating the next response).
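
For context, the multi-turn loop I have in mind looks roughly like the sketch below. The `generate` callable is a hypothetical stand-in for the model call, and the plain `USER:` / `ASSISTANT:` layout only mirrors the transcript above; it is not meant to reproduce the exact Vicuna conversation template. It shows why any "▁" artifacts in one reply would end up in the prompt for the next turn.

```python
from typing import Callable, List, Tuple


def run_multi_turn(
    user_messages: List[str],
    generate: Callable[[str], str],  # hypothetical: prompt in, reply text out
) -> List[Tuple[str, str]]:
    """Feed each assistant reply back as context for the next turn."""
    history: List[Tuple[str, str]] = []
    for user_message in user_messages:
        prompt_lines = []
        # Replay all earlier turns, including the previous assistant replies.
        for past_user, past_assistant in history:
            prompt_lines.append(f"USER: {past_user}")
            prompt_lines.append(f"ASSISTANT: {past_assistant}")
        prompt_lines.append(f"USER: {user_message}")
        prompt_lines.append("ASSISTANT:")
        reply = generate("\n".join(prompt_lines)).strip()
        history.append((user_message, reply))
    return history


if __name__ == "__main__":
    # Dummy generator so the sketch runs without a model.
    def echo(prompt: str) -> str:
        return f"(reply to a prompt with {prompt.count('USER:')} user turns)"

    for user, assistant in run_multi_turn(["Hello! Who are you?", "Who trained you?"], echo):
        print(f"USER: {user}\nASSISTANT: {assistant}")
```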

Sorry if I am missing something fundamental here but any help would be much appreciated!

<img width="871" height="143" alt="Image" src="https://github.com/user-attachments/assets/42f60c6c-aef8-4853-abb3-fdf5b619f0d3" />
