-
Technically, yes, but there isn't any need for it. The distilled model you listed has a dense structure, so KTransformers would not give you any acceleration with it. You could try the Qwen3 family of small-scale MoE models, or simply switch to a different inference framework such as llama.cpp or vLLM.
No.
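For the dense distilled checkpoints, here is a minimal sketch of the vLLM route; the checkpoint name and sampling settings are just assumptions, so swap in whichever distilled model you actually downloaded:

```python
# Minimal sketch: running a dense distilled DeepSeek-R1 model with vLLM
# instead of KTransformers. The checkpoint name below is only an example.
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")  # assumed checkpoint
params = SamplingParams(temperature=0.6, max_tokens=256)

# Offline batch generation; vLLM loads the .safetensors weights directly.
outputs = llm.generate(["Briefly explain what an MoE model is."], params)
print(outputs[0].outputs[0].text)
```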
-
I have downloaded a full set of DeepSeek-R1-Q4, and it works fine on my machine, nicknamed "Fishbowl". There's a problem, however: the model is rather too big, not really practical for daily tasks.
So I wonder whether I could load a distilled model from DeepSeek in the same way, one light enough to run even on my six-year-old laptop. And there's another question:
Does KTransformers support regular .safetensors models?
To my knowledge, the answer is no: the arguments don't include one for specifying a .safetensors model path, only GGUF files. So, any ideas?