Replies: 1 comment
-
@v-dicicco, sorry for the vagueness. It does work with the non-linearized one. The issue was with how the modeling code was written: it doesn't play well with bitsandbytes (LoRA/QLoRA), which leads to smaller VRAM savings. However, if you're doing FFT, that's not a concern for you! One note: we haven't tested Llama 4 for a while, so we're not sure whether something has broken in the meantime. If you want to give it a try, maybe use the Docker images that were built at the time: https://hub.docker.com/r/axolotlai/axolotl-cloud/tags?name=20250511
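In case it helps, here's a rough sketch of what a full-fine-tune config pointed at the non-linearized checkpoint could look like. The dataset, batch sizes, and DeepSpeed config path below are placeholders (not from this thread), the keys are just the standard Axolotl config fields, and Llama 4-specific options may have changed since, so treat it as a starting point rather than a tested config:

```yaml
# Sketch of an FFT config against the non-linearized HF checkpoint.
# Placeholder values; adjust dataset, batch sizes, and parallelism to your setup.
base_model: meta-llama/Llama-4-Scout-17B-16E-Instruct

# No adapter and no bitsandbytes quantization: plain full fine-tuning,
# which sidesteps the LoRA/QLoRA + bitsandbytes issue mentioned above.
load_in_8bit: false
load_in_4bit: false

datasets:
  - path: tatsu-lab/alpaca   # placeholder dataset
    type: alpaca
output_dir: ./outputs/llama4-scout-fft

sequence_len: 4096
sample_packing: true
micro_batch_size: 1
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 2e-5
optimizer: adamw_torch
lr_scheduler: cosine

bf16: true
flash_attention: true
gradient_checkpointing: true

# A 17B x 16E MoE won't fit on a single GPU for FFT; shard it with
# DeepSpeed ZeRO-3 (or FSDP). Path assumes the configs shipped in the repo.
deepspeed: deepspeed_configs/zero3_bf16.json
```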
-
Hi,
I found the state of Llama 4 support in Axolotl a bit confusing and would like to confirm: does it support the HF model that is NOT linearized, e.g. meta-llama/Llama-4-Scout-17B-16E-Instruct?
I see that all the examples use the linearized version, and the README says "[...] See examples to start training your own Llama 4 models with Axolotl's linearized version!" I'm not sure whether the repo just lacks such examples (because you'd need more GPUs) or whether there is an actual incompatibility. Thanks!