Swin UNETR reaches memory limit in Tesla T4 #4837
Replies: 1 comment
-
Hi @fengling0410, if the input patch size is 96x96x96 or 128x128x128, it should work on 16 GB GPUs with a batch size of 1. The Swin transformer is larger than the vanilla vision transformer, so SwinUNETR uses 8 layers in total as the encoder instead of 12. In our observation, the Swin transformer also has much higher GFLOPs, which increases the complexity.
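As a quick sanity check, a minimal sketch along these lines (assuming a single-channel input, 14 output classes, and the `feature_size=48` setting used in the MONAI tutorials; adjust to your actual configuration) can report the peak training memory for one 96x96x96 patch with batch size 1:

```python
import torch
from monai.networks.nets import SwinUNETR

device = torch.device("cuda")

# Assumed settings: single-channel input, 14 output classes, feature_size=48.
model = SwinUNETR(
    img_size=(96, 96, 96),
    in_channels=1,
    out_channels=14,
    feature_size=48,
).to(device)

x = torch.randn(1, 1, 96, 96, 96, device=device)  # batch size 1, one 96^3 patch
torch.cuda.reset_peak_memory_stats(device)

# Forward + backward, since the question is about training memory, not inference.
out = model(x)
out.sum().backward()

peak_gib = torch.cuda.max_memory_allocated(device) / 1024 ** 3
print(f"Peak GPU memory for one training step: {peak_gib:.2f} GiB")
```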
-
Hi all, I'm training a Swin UNETR model with the MONAI package these days, but training runs out of GPU memory when I do distributed training on four 16 GB Tesla T4 GPUs with a batch size of 1 and a swin batch size of 1. When I switch the Swin UNETR model to UNETR with all other settings kept the same, UNETR trains fine and occupies only about half the memory required by Swin UNETR. However, as far as I understand, the UNETR model (no. params: 92,784,046) is much larger than Swin UNETR (no. params: 62,187,296), and the computation should be less complex in Swin UNETR since its attention has linear complexity. I'm wondering whether there is something wrong with my implementation, or whether Swin UNETR naturally occupies much more memory during training than UNETR. May I ask for some advice on this? Thank you in advance :)
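For reference, a small sketch like the following (assuming 1 input channel, 14 output classes, 96x96x96 patches, and the default/tutorial hyperparameters for both networks; not the exact settings from this thread) compares the parameter counts of the two models:

```python
from monai.networks.nets import SwinUNETR, UNETR

# Assumed settings: 1 input channel, 14 output classes, 96^3 patches.
swin_unetr = SwinUNETR(
    img_size=(96, 96, 96),
    in_channels=1,
    out_channels=14,
    feature_size=48,
)
unetr = UNETR(
    in_channels=1,
    out_channels=14,
    img_size=(96, 96, 96),
    feature_size=16,
    hidden_size=768,
    mlp_dim=3072,
    num_heads=12,
)

def count_params(model):
    # Total number of parameters, trainable or not.
    return sum(p.numel() for p in model.parameters())

print(f"SwinUNETR parameters: {count_params(swin_unetr):,}")
print(f"UNETR parameters:     {count_params(unetr):,}")
```

Note that parameter count alone does not determine training memory; activations stored for the backward pass usually dominate, which is where the window-attention feature maps of Swin UNETR can cost more than UNETR's single-resolution ViT encoder.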