DynUnet inconsistency #5048
Unanswered · AAttarpour asked this question in Q&A · Replies: 1 comment, 1 reply
-
Hi @yiheng-wang-nv, could you please share some best practices for this question? Thanks in advance.
-
Hello MONAI team,
Following the earlier discussion #4851, the DynUNet problem was solved: in monai-weekly version 0.10.dev2235, the DynUNet model now includes dropout layers. However, I have run into another problem with this new model. I use 3D inputs of size 128x128x128. With the previous DynUNet in monai-weekly version 0.9.dev2214, I could use a batch size of 24 on our GPU server. Here is how I defined the model:
However, with the same inputs, model, and GPU server, I cannot use a batch size larger than 3 in this new version. If I increase it to 24 (what I had before), I get seemingly random errors, such as:
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution
or
KeyError: Caught KeyError in DataLoader worker process 1.
This difference in batch size is not logical. Is there something wrong with my virtual environment, or is it something wrong with the model?
It should be noted that with both MONAI versions I use torch 1.10; updating torch to version 1.12 did not solve the problem. I would really appreciate it if someone could help me.
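For reference, a rough back-of-the-envelope calculation (standard-library Python only; all shapes and head counts are illustrative assumptions) shows how quickly activation memory grows at this input size, and why a change in the network's internals, for example extra deep-supervision output heads, can shrink the feasible batch size:

```python
# Illustrative activation-memory estimate for a 128^3 batch. The real GPU
# footprint depends on every intermediate feature map, so these numbers are
# only a lower bound for reasoning about scale.
from functools import reduce
from operator import mul

def tensor_mib(shape, bytes_per_elem=4):
    """Memory of one float32 tensor of the given shape, in MiB."""
    return reduce(mul, shape, 1) * bytes_per_elem / 2**20

# A single-channel 128^3 volume at batch size 24:
inputs = tensor_mib((24, 1, 128, 128, 128))  # 192 MiB for the input batch alone

# If the new model returns extra supervision heads (assume 3 heads of shape
# (24, num_classes, 128, 128, 128) with num_classes=3), those alone add:
extra_heads = 3 * tensor_mib((24, 3, 128, 128, 128))  # 1728 MiB

print(f"input batch: {inputs:.0f} MiB, extra supervision heads: {extra_heads:.0f} MiB")
```

Comparing `torch.cuda.max_memory_allocated()` after one forward/backward pass on each MONAI version would show where the extra memory actually goes.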