Can DeepSpeed's zero_optimization achieve model parallelism? #5710
Unanswered
ojipadeson asked this question in Q&A
Replies: 0 comments
My machine has 8 Tesla V100s, but when I load an LLM (in my case Qwen2-7B-Instruct), I get an OOM error on a single card.
I tried using DeepSpeed to partition the model parameters across the 8 cards, but it keeps failing with an OOM error. Can zero_optimization easily achieve this?
My config.json:
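(The original attachment did not load. For reference, partitioning model parameters across GPUs corresponds to ZeRO stage 3; stages 1 and 2 only shard optimizer states and gradients, so each card still holds a full parameter copy and can OOM. A minimal sketch of a stage-3 config, with illustrative batch sizes and optional CPU offload, not the asker's actual file, might look like:)

```json
{
  "train_batch_size": 8,
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 1,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "offload_param": {
      "device": "cpu",
      "pin_memory": true
    },
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    }
  }
}
```

fp16 rather than bf16 is used here because V100s do not support bf16. Note also that ZeRO-3 can only shard parameters if the model is constructed or loaded under DeepSpeed (e.g. via deepspeed.initialize, or the HuggingFace Transformers integration, which uses deepspeed.zero.Init when a stage-3 config is active); loading the full model onto one GPU first will still OOM before DeepSpeed has a chance to partition it.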