rot_rand in SWIN transformer pre-training #4727
Hi all, I'm confused about the rot_rand part of the Swin Transformer pre-training while trying to reproduce the results. The rot_rand code is as follows:
As far as I understand, x_s has the shape of (batch_size, no_channels, H, W, D), and in |
Replies: 1 comment 3 replies
Hi Fengling0410,
Thanks for your message. The axes tuple passed to rot90 specifies the plane in which the rotation happens. You are right: if your data is in [B, C, H, W, D], then after extracting a single sample from the batch the tensor is [C, H, W, D], so dimensions [1, 2] are H and W, and rotating in that plane is a rotation about the z (D) axis. In our experience, rotation prediction is one of the easier pretext tasks. You can enable all 9 directions; it should work better.
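To make the indexing concrete, here is a minimal sketch. It is not the rot_rand code from the repository, and it uses NumPy's `np.rot90` as a stand-in on the assumption that it follows the same plane convention as the tensor `rot90`. The array name `batch` and its sizes are made up for illustration.

```python
import numpy as np

# Hypothetical batch shaped [B, C, H, W, D]; the sizes are arbitrary.
batch = np.arange(2 * 1 * 2 * 3 * 4).reshape(2, 1, 2, 3, 4)

# Indexing one sample drops the batch dimension: [C, H, W, D].
sample = batch[0]

# axes=(1, 2) selects the H-W plane, so every depth slice is rotated
# by 90 degrees while the D axis itself is left in place.
rotated = np.rot90(sample, k=1, axes=(1, 2))

print(sample.shape)   # (1, 2, 3, 4)
print(rotated.shape)  # (1, 3, 2, 4): H and W have swapped

# Each depth slice matches an ordinary 2D rotation of the same slice.
for d in range(sample.shape[-1]):
    assert np.array_equal(rotated[0, :, :, d], np.rot90(sample[0, :, :, d]))
```

The same call with axes=(1, 3) or axes=(2, 3) would rotate in the H-D or W-D plane instead.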
Thanks.