Replies: 1 comment
-
Hello @xhsoldier. Does training the model without quantization with the same schedule lead to the same problem? If you could provide a reproducer, we could look at this problem in more detail.
-
During int8 quantization-aware training, I load a pretrained fp32 model with an unbalanced weight distribution: some weights are very large, some are very small.
After 3 training steps, NaN values occur and training fails.
How can I resolve this issue?
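One plausible contributor to this kind of instability (a sketch, not a confirmed diagnosis of this specific model): with a per-tensor int8 scale, the largest weight dictates the quantization step, so very small weights round to zero and their gradients degrade, which can destabilize training. The NumPy snippet below, with hypothetical weight values, illustrates how per-tensor scaling flattens small weights while per-channel scaling preserves them:

```python
import numpy as np

def fake_quant(x, scale):
    """Simulate int8 fake quantization: quantize to [-128, 127], then dequantize."""
    q = np.clip(np.round(x / scale), -128, 127)
    return q * scale

# Hypothetical unbalanced weights: one group large, one tiny
w_big = np.array([50.0, -40.0, 30.0])
w_small = np.array([1e-4, -2e-4, 5e-5])
w = np.concatenate([w_big, w_small])

# Per-tensor scale is dominated by the largest weight...
scale_tensor = np.abs(w).max() / 127
small_per_tensor = fake_quant(w_small, scale_tensor)
# ...so the small weights all collapse to zero.

# A per-channel (per-group) scale keeps the small weights representable.
scale_small = np.abs(w_small).max() / 127
small_per_channel = fake_quant(w_small, scale_small)

print(small_per_tensor)   # all zeros
print(small_per_channel)  # close to the original small weights
```

If this matches the failure mode, switching to a per-channel weight observer (most QAT frameworks offer one) and/or lowering the learning rate for the first steps are common mitigations; a reproducer, as requested in the reply below, would let the maintainers confirm.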