Skip to content

Conversation

@Ki-Seki
Copy link
Contributor

@Ki-Seki Ki-Seki commented Dec 3, 2025

Background

In our unit tests, we widely use the model M4-ai/TinyMistral-248M-v2-Instruct-GGUF as the LlamaCpp test model.

However, because this model was released early, its GGUF metadata does not contain a chat_template:

Despite this, the model is actually a chat model, and according to its documentation we should apply the correct chat template when performing chat completions:

However, Llama.cpp defaults to the llama-2 chat format:

The correct format for this model in Llama.cpp is actually qwen:

Therefore, the model should be loaded with:

from llama_cpp import Llama
from outlines import from_llamacpp

llamacpp_model = Llama.from_pretrained(
    repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
    filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf",
+    chat_format="qwen",
)

model = from_llamacpp(llamacpp_model)

Why this issue surfaces now

Previously, we did not emphasize chat completion behavior, so this problem did not appear in tests.
Under the new best-effort chat completion strategy, the incorrect chat template becomes visible.

For example, the unit test, pytest tests/models/test_llamacpp.py::test_llamacpp_json fails immediately if the wrong chat template is applied. As you can see, the model's behavior will be erratic due to an incorrect template:

image image

@Ki-Seki
Copy link
Contributor Author

Ki-Seki commented Dec 3, 2025

Hi Robin, while working on Issue #1784 / PR #1789 , I encountered a minor issue. To keep PR #1789 focused, I’m submitting this small separate PR for your review. Thx~ 🤗@RobinPicard

Copy link
Contributor

@RobinPicard RobinPicard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, thanks a lot!

@RobinPicard RobinPicard merged commit 2fa3777 into dottxt-ai:main Dec 3, 2025
5 checks passed
@Ki-Seki
Copy link
Contributor Author

Ki-Seki commented Dec 3, 2025

You’re welcome, my friend! It's a pleasure to contribute. ✌️

@RobinPicard
Copy link
Contributor

If you have some free time I'd love to have a chat with you about your use of Outlines. I sent you a connection request on LinkedIn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants