Why does the self-supervised pre-training tutorial create two cropped views of the input image? #6659
ArjunNarayanan asked this question in Q&A · Unanswered
Replies: 1 comment · 1 reply

- **Question:**

  Hi,

  I'm taking a look at the self-supervised pre-training tutorial and was wondering about the data augmentation in the transform pipeline. Specifically, the tutorial samples two cropped views of the input image. What is the purpose of this? It seems to effectively double the batch size, with every pair of consecutive images in the batch drawn from the same input image. Thanks for any insights you can share!

- **Reply:**

  This argument can be changed; if GPU memory allows, it can be set to a larger number of samples or to 1, but keep in mind that contrastive learning may be affected by the number of samples and the batch size. Thank you.
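For context, the two-crop design the question asks about is the standard setup for contrastive self-supervised learning: the two random crops of one image form a positive pair, while crops from other images in the batch act as negatives, which is why a batch of N images yields 2N samples. Below is a minimal illustrative sketch in PyTorch, not the tutorial's actual code; the helper names `two_crops` and `nt_xent` are hypothetical:

```python
import torch
import torch.nn.functional as F


def two_crops(image: torch.Tensor, crop: int):
    """Sample two independent random crops of a (C, H, W) image."""
    _, h, w = image.shape
    views = []
    for _ in range(2):
        top = torch.randint(0, h - crop + 1, (1,)).item()
        left = torch.randint(0, w - crop + 1, (1,)).item()
        views.append(image[:, top:top + crop, left:left + crop])
    return views[0], views[1]


def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5):
    """Contrastive (NT-Xent) loss: each crop's positive is the other crop of
    the same image; the remaining 2N-2 embeddings in the batch are negatives."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, D) unit vectors
    sim = z @ z.t() / temperature                 # pairwise cosine similarities
    n = z1.shape[0]
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim = sim.masked_fill(mask, float("-inf"))    # exclude self-similarity
    # positive of row i is row i+n (and vice versa)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)


# A batch of N images becomes 2N crops — the "doubled batch" the question notes.
batch = torch.randn(4, 1, 64, 64)
v1, v2 = zip(*(two_crops(img, 32) for img in batch))
v1, v2 = torch.stack(v1), torch.stack(v2)  # both (4, 1, 32, 32)
```

This also shows why the reply warns that the number of samples and the batch size interact: the pool of negatives per positive pair grows with the effective (doubled) batch, so shrinking either changes the contrastive objective.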