
@ggerganov
Member

Noticed that the existing logic of llama-simple-chat adds a BOS token at the start of each chat message, which is generally incorrect: the BOS token should only appear once, at the start of the conversation.

This is a sample fix, though a better one could probably be implemented.

Repro:

make -j && ./bin/llama-simple-chat -m ../models/llama-8b-v3-instruct/ggml-model-f16.gguf 

> Hello
Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?
> Count to 3
1, 2, 3!

(on master this does not produce a reply on the second message)

@ggerganov ggerganov requested a review from slaren January 17, 2025 10:34
@ggerganov ggerganov marked this pull request as ready for review January 17, 2025 12:10
Collaborator

@ericcurtin ericcurtin left a comment


This change also needs to go into run.cpp; the bug was carried over from here.

@ggerganov ggerganov merged commit b9daaff into master Jan 19, 2025
48 checks passed
@ggerganov ggerganov deleted the gg/simple-chat-fix-bos branch January 19, 2025 16:12
@ericcurtin
Collaborator

#11302

anagri pushed a commit to BodhiSearch/llama.cpp that referenced this pull request Jan 26, 2025
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
