@kallewoof

ChatML (Phi 4)                  = ChatML (Phi 4)                  : OK      microsoft/phi-4                             
ChatML (Qwen 2.5 based)         = ChatML (Qwen 2.5 based)         : OK      Qwen/Qwen2.5-0.5B-Instruct                  
ChatML (Kimi)                   = ChatML (Kimi)                   : OK      moonshotai/Kimi-K2-Instruct                 
Google Gemma 2                  = Google Gemma 2                  : OK      Efficient-Large-Model/gemma-2-2b-it         
Google Gemma 3                  = Google Gemma 3                  : OK      google/gemma-3-4b-it                        
Google Gemma 3n                 = Google Gemma 3n                 : OK      lmstudio-community/gemma-3n-E4B-it-MLX-bf16 
Llama 3.x                       = Llama 3.x                       : OK      Steelskull/L3.3-Shakudo-70b                 
Llama 4                         = Llama 4                         : OK      nvidia/Llama-4-Scout-17B-16E-Instruct-FP8   
Mistral V7 (with system prompt) = Mistral V7 (with system prompt) : OK      Doctor-Shotgun/MS3.2-24B-Magnum-Diamond     
Mistral V3                      = Mistral V3                      : missing expected fragment [/INST]asst_1</s>: [/INST] asst_1</s>[INST] user_2[/INST] mistralai/Mistral-7B-Instruct-v0.3          
GLM-4                           = GLM-4                           : OK      THUDM/glm-4-9b-chat-hf                      
Phi 3.5                         = Phi 3.5                         : OK      microsoft/Phi-3.5-mini-instruct             
Phi 4 (mini)                    = Phi 4 (mini)                    : system role missing expected fragment <|system|>\nSyS-tEm<|end|>\n: <|system|>SyS-tEm<|end|><|user|>user<|end|><|endoftext|> microsoft/Phi-4-mini-instruct               
Cohere (Aya Expanse 32B based)  = Cohere (Aya Expanse 32B based)  : OK      CohereLabs/aya-expanse-32b                  [default]
DeepSeek V2.5                   = DeepSeek V2.5                   : OK      deepseek-ai/DeepSeek-V2.5                   
Jamba                           = Jamba                           : system role missing expected fragment <|bom|><|system|>SyS-tEm<|eom|>: <|startoftext|><|bom|><|system|> SyS-tEm<|eom|><|bom|><|user|> user<|eom|> ai21labs/Jamba-tiny-dev                     
Dots                            = Dots                            : system role missing expected fragment <|system|>\nSyS-tEm<|endofsystem|>\n: <|system|>SyS-tEm<|endofsystem|><|userprompt|>user<|endofuserprompt|> rednote-hilab/dots.llm1.inst                
RWKV World                      = MISSING                         : ? fla-hub/rwkv7-1.5B-world                    
Mistral (Generic)               = Mistral (Generic)               : missing expected fragment [/INST]\nasst_1</s>: [/INST]asst_1</s>[INST]user_2[/INST] mistralai/Mistral-Nemo-Instruct-2407        
ChatML (Generic)                = ChatML (Generic)                : OK      NewEden/Gemma-27B-chatml                    
There were 6 failure(s)!

Some of these failures are, I believe, actual issues with adapters. I don't know how big an impact they have, or whether it's the chat templates that are actually broken, but either way, I think aligning first and then fixing is the right course of action.
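
For readers unfamiliar with the output above, here is a minimal sketch of the kind of check that could produce a "missing expected fragment" line. The function name `check_fragment` is hypothetical and the actual test script in this PR may work differently; the two strings are copied from the Mistral V3 line of the log.

```python
def check_fragment(expected_fragment: str, rendered: str) -> str:
    """Return "OK" if the adapter-derived fragment occurs in the rendered
    chat template, otherwise a failure message shaped like the log above.

    Hypothetical helper for illustration only, not the PR's actual code.
    """
    if expected_fragment in rendered:
        return "OK"
    return f"missing expected fragment {expected_fragment}: {rendered}"


# Strings taken from the Mistral V3 line: the adapter expects no space
# after [/INST], but the rendered template has one.
expected = "[/INST]asst_1</s>"
rendered = "[/INST] asst_1</s>[INST] user_2[/INST]"
print(check_fragment(expected, rendered))
```

In this case the mismatch is a single space after `[/INST]`, which is the sort of small deviation between adapter and template the failures point at.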

This builds on top of, and includes, #1650. It should probably not be merged before that one.



{
}, {
Author

@kallewoof kallewoof Jul 20, 2025

FWIW, I intentionally put this in here to ensure people didn't just stack more adapters at the end. That said, with these tests, it should error on the right occasions so no big deal.

@kallewoof kallewoof mentioned this pull request Jul 20, 2025
@kallewoof kallewoof marked this pull request as draft July 20, 2025 13:20
@kallewoof kallewoof force-pushed the 202507-autoguess-adapter-tests branch from 33ffb10 to 7b72816 Compare July 20, 2025 13:40
@LostRuins
Owner

You might want to rebase this off latest.

I am OK with the RWKV keyword change. I am not even sure what the correct one is: half the so-called RWKV models are just using ChatML, so I suspect the template isn't even valid and should be removed.

@kallewoof kallewoof force-pushed the 202507-autoguess-adapter-tests branch from 7b72816 to 8193e11 Compare July 25, 2025 13:21
@kallewoof kallewoof marked this pull request as ready for review July 25, 2025 13:25
@kallewoof
Author

kallewoof commented Jul 25, 2025

> You might want to rebase this off latest.

Done. Also updated kallewoof#1 to run the rebased variant.

> I am OK with the RWKV keyword change. I am not even sure what the correct one is: half the so-called RWKV models are just using ChatML, so I suspect the template isn't even valid and should be removed.

I think that will work fine, especially if we include e.g. "User:" in the search strings. ChatML content will be matched with the ChatML adapter(s).
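
A rough sketch of the search-string idea, assuming each entry lists substrings that must all appear in the model's chat template and that the first matching entry wins. The `ENTRIES` list, its search strings, and `guess_adapter` are hypothetical simplifications, not the project's actual data or code.

```python
from typing import Optional

# Hypothetical, simplified entries. Order matters: the first entry whose
# search strings are all found in the template is selected.
ENTRIES = [
    {"name": "RWKV World", "search": ["User:"]},
    {"name": "ChatML (Generic)", "search": ["<|im_start|>"]},
]


def guess_adapter(chat_template: str) -> Optional[str]:
    """Return the name of the first entry whose search strings all occur
    in chat_template, or None if no entry matches."""
    for entry in ENTRIES:
        if all(s in chat_template for s in entry["search"]):
            return entry["name"]
    return None


# A ChatML-style template lacks "User:", so it falls through to ChatML.
print(guess_adapter("<|im_start|>user\nhi<|im_end|>"))
# A plain "User:" template is picked up by the RWKV entry.
print(guess_adapter("User: hi\n\nAssistant:"))
```

Under these assumptions, a so-called RWKV model that actually ships a ChatML template would still be routed to a ChatML adapter, because its template never contains the `User:` search string.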

@kallewoof kallewoof mentioned this pull request Jul 25, 2025
Owner

@LostRuins LostRuins left a comment

will merge this first while we reexamine the templates

@LostRuins
Owner

One thing I have learned previously: not all official templates are ideal either. One major reason we deviated from using Jinja is that our own tried-and-tested templates often outperform the official ones. Just something to keep in mind.

@LostRuins LostRuins merged commit b7b3e0d into LostRuins:concedo_experimental Jul 25, 2025
@kallewoof kallewoof deleted the 202507-autoguess-adapter-tests branch July 25, 2025 14:16
@kallewoof
Author

> One thing I have learned previously: not all official templates are ideal either. One major reason we deviated from using Jinja is that our own tried-and-tested templates often outperform the official ones. Just something to keep in mind.

Absolutely. I think knowing when we deviate is a good idea though. ;)
