Replies: 1 comment
There are many possible reasons for the loss becoming NaN in FP16 training; you can refer to the PyTorch documentation for advice on diagnosing them. MMEngine uses PyTorch's native AMP (torch.cuda.amp) under the hood.
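For reference, here is a minimal sketch of the standard torch.cuda.amp pattern that MMEngine's AMP support is built on. The model, optimizer, and data below are hypothetical placeholders so the loop runs end to end; note that GradScaler already skips optimizer steps whose gradients contain inf/NaN, which covers one common source of divergence:

```python
import torch
import torch.nn as nn

# Hypothetical toy setup so the loop is runnable end to end;
# swap in your own model, optimizer, and dataloader.
model = nn.Linear(16, 4).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    inputs = torch.randn(8, 16, device="cuda")
    targets = torch.randint(0, 4, (8,), device="cuda")

    optimizer.zero_grad()
    # autocast runs the forward pass in FP16 where it is safe and
    # keeps numerically sensitive ops in FP32.
    with torch.cuda.amp.autocast():
        loss = nn.functional.cross_entropy(model(inputs), targets)

    # GradScaler scales the loss so small gradients do not underflow
    # in FP16; steps whose unscaled gradients contain inf/NaN are
    # skipped and the scale factor is reduced automatically.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```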
When I use FP16 (the PyTorch AMP API), I run into a NaN-loss bug.
How does your AMP code handle this? Does it prevent the loss from becoming NaN? Or have you also encountered NaN losses in AMP training, and if so, how did you fix the problem?
Thank you for your answer.
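For anyone hitting the same issue, a generic way to narrow down where the NaN first appears (not specific to MMEngine; assert_finite is a made-up helper name):

```python
import torch

# Anomaly detection is slow but reports a traceback to the exact op
# that produced the first NaN/inf in the backward pass; enable it
# only while debugging.
torch.autograd.set_detect_anomaly(True)

def assert_finite(loss: torch.Tensor, step: int) -> None:
    """Hypothetical helper: fail fast instead of training on NaNs."""
    if not torch.isfinite(loss).all():
        raise RuntimeError(f"Non-finite loss {loss.item()} at step {step}")
```

Call assert_finite(loss, step) right after computing the loss, before the backward pass. Once the source is located, common fixes include lowering the learning rate, clipping gradients after scaler.unscale_(optimizer), or keeping the offending layer in FP32.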