Some question about the EfficientAD PDN structure #1541
-
Hello, I have some questions about the EfficientAD PDN structure. As I understand it, if the input is 3×256×256, the output is 384×64×64. How, then, is the 33×33×3 input patch represented in the PDN? In the code I only see simple nn.Conv2d operations.
Replies: 3 comments
-
Hello, please take a look at Figure 2 in the EfficientAD paper. The 33×33×3 input patches become 1×1×384 patch descriptors, so the output contains results for many overlapping patches. In the code this is represented by nn.Conv2d followed by nn.AvgPool2d.
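To make the "many overlapping patches" point concrete, here is a rough sketch of the output-size arithmetic. The layer list (kernel, stride, padding) is my assumption of the small PDN with padding enabled, based on the paper's architecture table; it is not copied from the repository, so please check it against the actual code.

```python
def out_size(n, layers):
    """Spatial size after a stack of conv/pool layers (square input assumed)."""
    for kernel, stride, padding in layers:
        # Standard output-size formula for conv and pooling layers.
        n = (n + 2 * padding - kernel) // stride + 1
    return n

# (kernel, stride, padding) per layer. ASSUMED values, verify against the code.
PDN_SMALL_PADDED = [
    (4, 1, 3),  # conv1
    (2, 2, 1),  # avgpool1
    (4, 1, 3),  # conv2
    (2, 2, 1),  # avgpool2
    (3, 1, 1),  # conv3
    (4, 1, 0),  # conv4
]

side = out_size(256, PDN_SMALL_PADDED)
print(side, side * side)  # grid side length and number of patch descriptors
```

Under these assumptions a 256×256 input gives a 64×64 grid, so the network produces one 384-dimensional descriptor per grid position in a single forward pass.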
-
Thank you very much for your reply. I understand what you mean, and I have also looked at Figure 2 in the paper. But in the code, the input image to the PDN is 256x256. How does that become 33x33? I only see nn.Conv2d and nn.AvgPool2d in the code and no other operations.
-
Hello. The 33x33 patches are implicit in the way convolution works: each output position only depends on a 33x33 region of the input. You can check this page for more info on the receptive field of CNNs.
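As a quick sanity check, the receptive field can be computed by hand from the kernel sizes and strides. This is only a sketch: the (kernel, stride) list below is my assumption of the small PDN layout (conv 4×4, avg-pool 2×2 stride 2, conv 4×4, avg-pool 2×2 stride 2, conv 3×3, conv 4×4), so compare it with the layers in the code.

```python
def receptive_field(layers):
    """Return (receptive field, output stride in input pixels) for a layer stack."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        # Each layer widens the field by (kernel - 1) steps of the current jump.
        rf += (kernel - 1) * jump
        # Strided layers increase the distance between neighbouring outputs.
        jump *= stride
    return rf, jump

# (kernel, stride) per layer. ASSUMED layout of the small PDN.
PDN_SMALL = [(4, 1), (2, 2), (4, 1), (2, 2), (3, 1), (4, 1)]

rf, jump = receptive_field(PDN_SMALL)
print(rf, jump)
```

With this layer list the result is a 33-pixel receptive field and a jump of 4, meaning neighbouring output positions correspond to 33x33 patches shifted by 4 pixels. Padding does not change the receptive field, only how many output positions there are.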