How to finetune DeepSeek-v2 236B #1739
-
Thank you for the great repo.
Replies: 2 comments 1 reply
-
We have a QLoRA FSDP config. You can give it a try, but change the model: https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/deepseek-v2/qlora-fsdp-2_5.yaml . I'm not sure whether we have tested that MoE model yet, or whether it requires any specific changes.
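For anyone unsure what "change the model" means in practice, here is a minimal sketch of swapping the model entry in a copy of that config. It assumes the config names the model under a top-level `base_model` key and that `deepseek-ai/DeepSeek-V2` is the Hugging Face ID of the 236B model. The helper `swap_base_model` and the sample text are hypothetical stand-ins for illustration, not the actual contents of the linked file.

```python
import re


def swap_base_model(config_text: str, new_model: str) -> str:
    """Return config_text with the first top-level `base_model:` value replaced."""
    pattern = re.compile(r"^base_model:.*$", flags=re.MULTILINE)
    if not pattern.search(config_text):
        raise ValueError("no top-level base_model key found")
    # A function replacement avoids any backslash-escape handling in new_model.
    return pattern.sub(lambda _m: f"base_model: {new_model}", config_text, count=1)


# Hypothetical stand-in for the example config; the real file has many more keys.
example = (
    "base_model: some-org/some-smaller-model\n"
    "load_in_4bit: true\n"
    "adapter: qlora\n"
)

# Assumed model ID for the 236B checkpoint; verify it before training.
print(swap_base_model(example, "deepseek-ai/DeepSeek-V2"))
```

Only the model line changes; every other setting is left as the example config has it. As noted above, whether the 236B MoE model needs further changes is untested.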
-
I would also love to see an example that works. I've been unsuccessful at this.
We have added example configs for `deepseek_v2` models. Since it works for the smaller Lite model, it should work for the larger one too. https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/deepseek-v2