After training with augmentation, inference on the same image multiple times gives different results #1891
-
I've been doing some experiments with training a PaDiM model on the hazelnut category of the MVTec dataset. Until now I had trained without any augmentations. I now wanted to learn how to use an augmented dataset, so I tried to add the same augmentations as in the original paper. After training, running inference on the same image multiple times gives different results (see the attached PaDiM_inference.mp4). This is my training script:
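A minimal sketch along these lines (anomalib v1-style API; the dataset path and the exact augmentations below are placeholders, not the original script):

```python
# Hedged sketch of a training script with augmentations (anomalib v1-style API).
# The dataset path and the chosen augmentations are assumptions.
from torchvision.transforms import v2 as transforms

from anomalib.data import MVTec
from anomalib.engine import Engine
from anomalib.models import Padim

# Augmenting train transform: resize, then take a random 256x256 crop.
train_transform = transforms.Compose([
    transforms.Resize(size=(292, 292)),
    transforms.RandomCrop(size=(256, 256)),
])

datamodule = MVTec(
    root="./datasets/MVTec",  # placeholder dataset location
    category="hazelnut",
    train_transform=train_transform,
)

engine = Engine()
engine.fit(model=Padim(), datamodule=datamodule)
```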
The model has been converted to PyTorch with the following script:
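A sketch of what such an export step could look like (the checkpoint path is a placeholder, and `Engine.export` usage here is an assumption based on anomalib's v1 deploy API):

```python
# Hedged sketch of exporting the trained model to a plain PyTorch artifact.
from anomalib.deploy import ExportType
from anomalib.engine import Engine
from anomalib.models import Padim

model = Padim.load_from_checkpoint("path/to/model.ckpt")  # placeholder path
Engine().export(model=model, export_type=ExportType.TORCH)
```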
This is the script used for inference:
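A sketch of such an inference script, assuming anomalib's `TorchInferencer` (both file paths are placeholders):

```python
# Hedged sketch of inference with the exported PyTorch model.
from anomalib.deploy import TorchInferencer

inferencer = TorchInferencer(path="path/to/exported/model.pt")
prediction = inferencer.predict(image="path/to/test_image.png")
print(prediction.pred_score, prediction.pred_label)
```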
Replies: 3 comments
-
@samet-akcay Do you have any comment on this?
-
@djdameln FYI
-
Hi, thanks for pointing this out.

The short answer is: you need to pass an `eval_transform` to the datamodule before training, which defines how the images should be transformed during inference. In line with the training transforms in your example, the following would be sufficient:

```python
transforms.Compose([
    transforms.Resize(size=(292, 292)),
    transforms.CenterCrop(size=(256, 256)),
])
```

Note that it would be better to also normalize the input images (both train and eval) to ImageNet statistics, because Padim's pre-trained backbone was trained on ImageNet:

```python
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
```

When only the `train_transform` is specified, Anomalib will re-use the train transform during inference, which explains the varying results you are seeing: the random augmentations are then applied to the inference images as well. The motivation for re-using the train transforms during inference is that there is no other way of ensuring that inference runs with the same input shape and normalization statistics as training. The alternatives, applying no transforms at all or applying the default model-specific transforms, would both break inference due to an input shape mismatch.

These modules were changed very recently and we are still working on the documentation for this. Reading your post, I realize that the transform behaviour could be confusing to users, so we'll discuss internally whether any changes are needed.
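For completeness, a minimal sketch of how the two transforms could be wired into the datamodule, assuming the v1-style `MVTec` datamodule and placeholder paths:

```python
# Sketch of the suggested fix: an augmenting train_transform plus a
# deterministic eval_transform with the same output shape, both normalized
# to ImageNet statistics. Datamodule arguments and paths are assumptions.
from torchvision.transforms import v2 as transforms
from anomalib.data import MVTec

imagenet_norm = transforms.Normalize(
    mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
)

train_transform = transforms.Compose([
    transforms.Resize(size=(292, 292)),
    transforms.RandomCrop(size=(256, 256)),  # random: augmentation for training
    imagenet_norm,
])

eval_transform = transforms.Compose([
    transforms.Resize(size=(292, 292)),
    transforms.CenterCrop(size=(256, 256)),  # deterministic: same output shape
    imagenet_norm,
])

datamodule = MVTec(
    root="./datasets/MVTec",  # placeholder path
    category="hazelnut",
    train_transform=train_transform,
    eval_transform=eval_transform,
)
```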