"resume from checkpoint" leads to CUDA out of memory #11563
-
When I use “resume from checkpoint”, training fails with a CUDA out-of-memory error.
Replies: 3 comments 1 reply
-
@Defiler24 Could you share which training strategy (plugin) you are using, or post your code here?
-
I solved the problem after setting the strategy to 'ddp'.
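For anyone landing here, the fix above might look roughly like the sketch below. It assumes this thread is about PyTorch Lightning 1.5 or later, where `Trainer` accepts `strategy=` and `trainer.fit` accepts `ckpt_path=`. The helper names `build_trainer_kwargs` and `resume_training`, the `model` object, and the checkpoint path are hypothetical and only there for illustration; this is not the original poster's code.

```python
def build_trainer_kwargs(devices: int) -> dict:
    """Return Trainer keyword arguments with the strategy set explicitly.

    Hypothetical helper: it only makes the 'ddp' choice visible instead of
    leaving the strategy at its default.
    """
    return {
        "accelerator": "gpu",
        "devices": devices,
        "strategy": "ddp",  # the setting reported to fix the OOM in this thread
    }


def resume_training(model, ckpt_path: str, devices: int = 2) -> None:
    """Resume training from a checkpoint (assumes PyTorch Lightning >= 1.5)."""
    # Imported lazily so the helper above can be inspected without a GPU setup.
    import pytorch_lightning as pl

    trainer = pl.Trainer(**build_trainer_kwargs(devices))
    # ckpt_path restores model weights, optimizer state, and the epoch/step counters.
    trainer.fit(model, ckpt_path=ckpt_path)


# Hypothetical usage, with your own LightningModule and checkpoint file:
# resume_training(MyModel(), "path/to/last.ckpt", devices=2)
```

Whether this helps in a single-GPU run is not something this thread confirms.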
-
I have exactly the same issue. The only difference is that my model is trained on a single GPU, and I did not specify any strategy either. Is there a solution for this case? Thanks