How much VRAM would the LLaMA 3 400B model require for Chinese-LLaMA-style training? #563
StephennFernandes started this conversation in General
Replies: 0 comments
Hi there, I just wanted to get an estimate of how much VRAM would be needed to run Chinese-LLaMA-style training on the 400B LLaMA model. Since extending the tokenizer also increases the vocab_size in the model parameters, those extra values would need to be accounted for as well.
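For a rough back-of-the-envelope sketch (my own numbers, not confirmed anywhere in this thread): full fine-tuning with Adam in mixed precision typically costs on the order of 16 bytes per parameter (fp16 weights and gradients, plus fp32 master weights and the two Adam moments), before counting activations. The hidden size and added-token count below are hypothetical, since no 400B config had been published:

```python
# Rough VRAM estimate for full fine-tuning with Adam in mixed precision.
# Assumptions (hypothetical, for illustration): 400B parameters,
# ~16 bytes/param = fp16 weights + fp16 grads + fp32 master + Adam m and v.
params = 400e9
bytes_per_param = 2 + 2 + 4 + 4 + 4
print(f"Weights + optimizer states: ~{params * bytes_per_param / 1e12:.1f} TB")  # ~6.4 TB

# Extra parameters from extending the vocabulary (embedding + untied lm_head):
added_tokens = 20_000   # hypothetical size of an added Chinese vocabulary
hidden_size = 16_384    # hypothetical hidden dimension
extra_params = 2 * added_tokens * hidden_size
print(f"Extra params from vocab extension: ~{extra_params / 1e6:.0f}M")  # ~655M
```

Under these assumptions, full fine-tuning lands in the multi-terabyte range, while the extra embedding/lm_head parameters from the vocab extension are comparatively negligible (well under 1% of the model).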
I currently have 8x A6000s and wanted to know whether these would suffice. Additionally, can I load the model in 4-bit and then run the training script?
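For context on the 4-bit question: 8x A6000 gives 8 × 48 GB = 384 GB total, while 4-bit weights alone for 400B parameters are roughly 200 GB (400e9 × 0.5 bytes), so even QLoRA-style training would be very tight once activations, LoRA optimizer states, and quantization overhead are added. A minimal loading sketch with transformers/bitsandbytes/peft, where the checkpoint name is hypothetical (the 400B weights were not public at the time of asking):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Hypothetical checkpoint id, for illustration only.
model_id = "meta-llama/Meta-Llama-3-400B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # NF4-quantize the frozen base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # run compute in bf16
    bnb_4bit_use_double_quant=True,         # small additional memory saving
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard the quantized weights across all 8 GPUs
)

# Train only small LoRA adapters; the 4-bit base model stays frozen.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```

Note that after extending the tokenizer you would also call model.resize_token_embeddings(len(tokenizer)), and the resized embedding/lm_head are typically kept trainable in higher precision for Chinese-LLaMA-style training, which adds further memory on top of the quantized base.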