Phi-4 is now available on HF (`microsoft/phi-4-gguf`).
When trying to run the Phi-4 GGUF using the existing `quantized_phi3` implementation I get:

```text
loaded 243 tensors (9.05GB) in 0.27s
model built
Error: shape mismatch in reshape, lhs: [1, 12, 1280], rhs: [1, 12, 40, 128]
Write a function to count prime numbers up to N.
```
When trying to run the Phi-4 GGUF using the `quantized_llama` implementation I get:

```text
loaded 243 tensors (9.05GB) in 0.30s
Error: cannot find llama.attention.head_count in metadata
```
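The second error is a metadata lookup failure: GGUF metadata keys are namespaced by architecture, so a Phi GGUF carries keys like `phi3.attention.head_count` rather than the `llama.attention.head_count` that the `quantized_llama` path asks for. As a stdlib-only sketch (not candle's actual loader), here is a minimal reader that walks the GGUF header and lists the metadata keys, assuming the documented header layout (`GGUF` magic, version, tensor count, KV count, then key/value pairs); only `u32` (type 4) and string (type 8) values are decoded, and the key names used below are illustrative:

```rust
use std::fs::File;
use std::io::{self, Read};

/// Read the GGUF metadata section and return (key, printable value) pairs.
/// Sketch only: decodes u32 (type 4) and string (type 8) values and stops at
/// the first unsupported value type instead of handling the full type enum.
fn gguf_metadata(path: &str) -> io::Result<Vec<(String, String)>> {
    let mut f = File::open(path)?;
    let mut buf4 = [0u8; 4];
    let mut buf8 = [0u8; 8];
    f.read_exact(&mut buf4)?;
    assert_eq!(&buf4, b"GGUF", "not a GGUF file");
    f.read_exact(&mut buf4)?; // version, unused here
    f.read_exact(&mut buf8)?; // tensor count, unused here
    f.read_exact(&mut buf8)?;
    let kv_count = u64::from_le_bytes(buf8);
    let mut out = Vec::new();
    for _ in 0..kv_count {
        // Key: u64 length followed by UTF-8 bytes.
        f.read_exact(&mut buf8)?;
        let klen = u64::from_le_bytes(buf8) as usize;
        let mut key = vec![0u8; klen];
        f.read_exact(&mut key)?;
        let key = String::from_utf8_lossy(&key).into_owned();
        // Value: u32 type tag followed by the typed payload.
        f.read_exact(&mut buf4)?;
        match u32::from_le_bytes(buf4) {
            4 => {
                f.read_exact(&mut buf4)?;
                out.push((key, u32::from_le_bytes(buf4).to_string()));
            }
            8 => {
                f.read_exact(&mut buf8)?;
                let slen = u64::from_le_bytes(buf8) as usize;
                let mut val = vec![0u8; slen];
                f.read_exact(&mut val)?;
                out.push((key, String::from_utf8_lossy(&val).into_owned()));
            }
            _ => break, // unsupported value type: stop, this is a sketch
        }
    }
    Ok(out)
}
```

Dumping the keys of the Phi-4 file this way would show `phi3.*`-prefixed attention keys, which is consistent with the lookup error above.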
These are reproducible via the quantized-phi example by swapping the model constants to:

```rust
"microsoft/phi-4-gguf",
"phi-4-q4.gguf",
"main",
```
It would be great to have Phi-4 support.