ik_llama.cpp for Armv8.0 #556
-
Nice 😄 The repacked variants don't work because the emulation for |
-
I did a fresh recompile and repacking works now! Unfortunately IQ4_KT still doesn't work :(
-
The |
-
Yes, the But nice you have made all this work! |
-
I managed to port ik_llama.cpp to my phone, which has a Snapdragon 680 CPU. Although it relies on heavy emulation, it's still much faster than mainline llama.cpp. All tests were done using the Qwen 3 0.6B model.

What works:
What doesn't work:
If anyone is interested, I'll publish a fork. It just adds emulation for some NEON dot product and float16 arithmetic intrinsics (mainline also has some level of emulation for Armv8.0).
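For anyone wondering what "emulating a NEON dot product intrinsic" can look like, here is a minimal sketch. It is not the code from the fork. It assumes the missing intrinsic is `vdotq_s32` (signed 8-bit dot product, normally available only with the Armv8.2 dotprod extension) and rebuilds it from intrinsics that plain Armv8.0 AArch64 does have. The names `dot_s8x16_ref` and `emu_vdotq_s32` are hypothetical; the scalar function is only there to spell out the expected semantics.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Portable scalar model of what vdotq_s32 computes (hypothetical helper):
 * each of the 4 int32 lanes accumulates the sum of 4 adjacent int8 products. */
static void dot_s8x16_ref(int32_t acc[4], const int8_t a[16], const int8_t b[16]) {
    for (int lane = 0; lane < 4; ++lane) {
        int32_t sum = 0;
        for (int k = 0; k < 4; ++k) {
            sum += (int32_t)a[4 * lane + k] * (int32_t)b[4 * lane + k];
        }
        acc[lane] += sum;
    }
}

#if defined(__aarch64__)
/* Sketch of an Armv8.0 replacement for vdotq_s32 (hypothetical name).
 * A real build would typically use it only when __ARM_FEATURE_DOTPROD
 * is not defined. */
static inline int32x4_t emu_vdotq_s32(int32x4_t acc, int8x16_t a, int8x16_t b) {
    /* Widening multiplies: int8 x int8 -> int16, elements 0..7 and 8..15. */
    int16x8_t p_lo = vmull_s8(vget_low_s8(a), vget_low_s8(b));
    int16x8_t p_hi = vmull_s8(vget_high_s8(a), vget_high_s8(b));
    /* Add adjacent pairs while widening to int32, so nothing overflows. */
    int32x4_t s_lo = vpaddlq_s16(p_lo);
    int32x4_t s_hi = vpaddlq_s16(p_hi);
    /* Pairwise add again: each output lane now holds a sum of 4 products. */
    return vaddq_s32(acc, vpaddq_s32(s_lo, s_hi));
}
#endif
```

The emulated version costs several instructions where hardware dotprod needs one, so it is slower than the native intrinsic but produces the same result.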