bandoti
Collaborator

@bandoti bandoti commented Oct 16, 2025

This change adds a "partial formatter" that processes partially collected messages (similar to the server's streaming logic) so that reasoning output can be rendered before the EOG token arrives.
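
As a rough illustration of the idea (not the code in this PR), the sketch below re-parses the accumulated text after every generated piece and returns only the part that has not been shown yet. The `<think>`/`</think>` markers and the names `partial_msg`, `parse_partial`, and `partial_formatter` are hypothetical; real templates such as the one used by gpt-oss have their own markers, and the actual change may rely on the existing chat parsing helpers instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reasoning delimiters, chosen only for this sketch.
static const std::string THINK_OPEN  = "<think>";
static const std::string THINK_CLOSE = "</think>";

struct partial_msg {
    std::string reasoning;
    std::string content;
};

// Length of the longest suffix of `text` that is a proper prefix of `tag`.
// Those characters are held back because they may be the start of the tag.
static std::size_t held_back(const std::string & text, const std::string & tag) {
    const std::size_t max_k = std::min(text.size(), tag.size() - 1);
    for (std::size_t k = max_k; k > 0; --k) {
        if (text.compare(text.size() - k, k, tag, 0, k) == 0) {
            return k;
        }
    }
    return 0;
}

// Parse everything generated so far, tolerating an unfinished message.
static partial_msg parse_partial(const std::string & text) {
    partial_msg msg;
    if (text.compare(0, THINK_OPEN.size(), THINK_OPEN) != 0) {
        // The text may still grow into the opening tag: show nothing yet.
        if (THINK_OPEN.compare(0, text.size(), text) == 0) {
            return msg;
        }
        msg.content = text;
        return msg;
    }
    const std::size_t body  = THINK_OPEN.size();
    const std::size_t close = text.find(THINK_CLOSE, body);
    if (close == std::string::npos) {
        // Reasoning is still open: expose it, minus a possibly partial closing tag.
        const std::size_t held = held_back(text.substr(body), THINK_CLOSE);
        msg.reasoning = text.substr(body, text.size() - body - held);
        return msg;
    }
    msg.reasoning = text.substr(body, close - body);
    msg.content   = text.substr(close + THINK_CLOSE.size());
    return msg;
}

class partial_formatter {
  public:
    // Append a newly generated piece and return only the text not shown before.
    partial_msg feed(const std::string & piece) {
        buffer_ += piece;
        const partial_msg now = parse_partial(buffer_);
        // Parsed fields only ever grow as more text arrives.
        assert(now.reasoning.size() >= shown_reasoning_);
        assert(now.content.size() >= shown_content_);
        partial_msg delta;
        delta.reasoning  = now.reasoning.substr(shown_reasoning_);
        delta.content    = now.content.substr(shown_content_);
        shown_reasoning_ = now.reasoning.size();
        shown_content_   = now.content.size();
        return delta;
    }

  private:
    std::string buffer_;
    std::size_t shown_reasoning_ = 0;
    std::size_t shown_content_   = 0;
};
```

Calling `feed()` once per generated piece lets the reasoning text be printed as it arrives, while characters that might belong to a marker are held back until the next piece settles the question.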

In addition, the chat_add_and_format lambda has been moved to a functor, which now calls common_chat_templates_apply directly to allow more robust template-application options.
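
To make the lambda-to-functor move concrete, here is a minimal, self-contained sketch of the pattern. It is an assumption about the general shape, not the code in this PR: `chat_msg`, `apply_template`, and `chat_formatter` are hypothetical names, and `apply_template` is a toy stand-in that does not reproduce the signature or behavior of common_chat_templates_apply.

```cpp
#include <string>
#include <vector>

struct chat_msg {
    std::string role;
    std::string content;
};

// Toy stand-in for template application (hypothetical markers, not a real chat template).
static std::string apply_template(const std::vector<chat_msg> & msgs) {
    std::string out;
    for (const auto & m : msgs) {
        out += "<|" + m.role + "|>" + m.content + "\n";
    }
    return out;
}

// Functor replacing a capturing lambda: the state it needs is an explicit member.
class chat_formatter {
  public:
    explicit chat_formatter(std::vector<chat_msg> & history) : history_(history) {}

    // Add a message to the history and return only the newly formatted text.
    // Assumes the formatted history is a prefix of the formatted history plus
    // one message, which holds for the toy template above.
    std::string operator()(const std::string & role, const std::string & content) {
        const std::string before = apply_template(history_);
        history_.push_back({role, content});
        const std::string after = apply_template(history_);
        return after.substr(before.size());
    }

  private:
    std::vector<chat_msg> & history_;
};
```

Compared with a lambda, the functor gives the captured state a name and a type, which makes it easier to add further template options as members later.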

The system/prompt tags are now suppressed to clean up the output. In a separate PR, I can work on fixing the colorization (e.g. blue for the system prompt).

Example output:

./build/bin/llama-cli.exe -m ./models/gpt-oss-20b-mxfp4.gguf -c 2048 -sys "you are a wizard" -p "please recite me a haiku about llamas" --jinja

...

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to the AI.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

you are a wizard
please recite me a haiku about llamas
The user wants a haiku about llamas. The developer instruction: "you are a wizard". That seems to be a role instruction to the assistant. The user wants a haiku. We should comply. There's no conflict. We can produce a haiku, 5-7-5 syllable structure, about llamas. Also maybe incorporate wizard? But user didn't ask wizard. The developer says "you are a wizard" - maybe we need to incorporate that in style. But the user didn't request anything about wizard. We could just produce a haiku. It's allowed. There's no policy conflict. So just produce a haiku: "Llama steps, soft hum / desert wind whispers, calm, wise / moonlit paws echo". Let's check syllable counts: Llama steps (2) soft (1) hum (1) = 4? "Llama steps, soft hum" -> Lla-ma (2), steps (1), soft (1), hum (1) = 5. Next line: "desert wind whispers, calm, wise" -> de-sert (2), wind (1), whis-pers (2), calm (1), wise (1) = 7? de-sert (2) wind (1) whis-pers (2) calm (1) wise (1) = 7. Third: "moonlit paws echo" -> moon-lit (2) paws (1) e-cho (2) = 5. Good.

We can present that. Possibly add a wizard vibe: "Wizardly eyes gleam". But we must keep structure. We could do: "Wizard llama roams" but okay.

Let's produce a haiku.
Llama steps, soft hum,
Desert wind whispers, calm, wise—
Moonlit paws echo.

>

@bandoti bandoti requested a review from ggerganov as a code owner October 16, 2025 01:32
@bandoti bandoti requested review from CISC and ggerganov and removed request for ggerganov October 16, 2025 01:32
@bandoti
Collaborator Author

bandoti commented Oct 16, 2025

I just updated to clean up the system/prompt tags (see description changes), but I will await feedback before changing anything else! 😊

One thing I was contemplating was splitting the display block into a separate abstraction. Since more state was added here, the display could become its own type, and this might be a good time for refactors like this that encapsulate functionality incrementally.
