Training on GPU becomes non-deterministic #11250
-
I usually set seed_everything() at the beginning of my script, but this does not always solve the problem. When I train models on CPU, training is deterministic; when I switch to GPU, it becomes non-deterministic. With a simple model, like a one-layer LSTM, it is deterministic on both CPU and GPU. But with a more complicated model like LSTM-FCN, it is deterministic on CPU but not on GPU. Can I get any help on debugging? I got my LSTM-FCN model from here (https://github.com/timeseriesAI/tsai/blob/main/tsai/models/RNN_FCN.py), and the LSTM model I tested was simply an nn.LSTM.
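For context, the plain-PyTorch knobs that control GPU determinism (independent of Lightning) look roughly like the sketch below; `seed_all` is a hypothetical helper name, but the `torch` calls in it are standard PyTorch APIs:

```python
import os
import random

import numpy as np
import torch


def seed_all(seed: int = 42) -> None:
    """Seed every RNG that can affect training (CPU and GPU) and
    force PyTorch to prefer deterministic kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # also seeds all CUDA devices
    # Make cuDNN pick deterministic convolution/RNN implementations
    # instead of auto-tuning the fastest (possibly non-deterministic) one.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Required by some CUDA ops (e.g. cuBLAS) in deterministic mode.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    # warn_only=True logs ops without a deterministic implementation
    # instead of raising, which is handy while debugging.
    torch.use_deterministic_algorithms(True, warn_only=True)


# Two identically seeded runs now start from identical random state.
seed_all(123)
a = torch.randn(3, 3)
seed_all(123)
b = torch.randn(3, 3)
assert torch.equal(a, b)
```

Seeding alone only fixes the initial state; the `cudnn` and `use_deterministic_algorithms` settings are what pin down the GPU kernels themselves.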
Replies: 2 comments 2 replies
-
Maybe setting Trainer(deterministic=True) might help?
-
Wow, it worked. This is amazing. I thought seed_everything() had done everything PyTorch Lightning could do. Is there any documentation for the Trainer(deterministic=True) you mentioned? I want to take a look.