Replies: 2 comments 5 replies
-
PaDiM uses a fixed number of feature embeddings (on Wide ResNet-50 it's 550) for calculating the anomalies, while PatchCore uses a proportion of all embeddings seen during training (10%). If you have many images, the number of embeddings in PatchCore will be significantly larger than in PaDiM, resulting in higher training and especially validation times.
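To make the difference concrete, here is a rough back-of-the-envelope sketch of how each memory bank scales with the dataset. The 550 dimensions and 10% coreset ratio come from the discussion above; the 28x28 patch grid per image is an assumed example value, and both function names are hypothetical:

```python
def padim_bank_size(n_images, patches_per_image=28 * 28):
    # PaDiM fits one Gaussian (mean + covariance) per patch position,
    # so the bank size does not grow with the number of training images.
    return patches_per_image

def patchcore_bank_size(n_images, patches_per_image=28 * 28, coreset_ratio=0.10):
    # PatchCore keeps a fixed fraction of ALL patch embeddings seen during
    # training, so the bank grows linearly with the dataset size.
    return int(n_images * patches_per_image * coreset_ratio)

for n in (100, 1000, 10000):
    print(n, padim_bank_size(n), patchcore_bank_size(n))
```

With 10,000 training images, PatchCore would store roughly a thousand times more embeddings than PaDiM's per-position statistics under these assumptions, which is consistent with the slower validation you observed.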
-
Thank you for your response! I now have a better understanding of the models, but I still have a few questions regarding some concepts in PaDiM and PatchCore:
Thank you for your time and for sharing your expertise!
-
Hi everyone,
I am currently using PatchCore and PaDiM on my own dataset and have a couple of questions about their behavior:
Training Time Differences
When training both algorithms, I noticed that PaDiM takes approximately 8 minutes, whereas PatchCore takes around 2 hours and 30 minutes. Is this difference primarily due to the method used for reducing the memory bank? (For example, PaDiM seems to reduce the embedding dimensions by random selection, whereas PatchCore uses coreset subsampling.)
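The cost asymmetry between the two reduction strategies can be sketched as follows. Random dimension selection (PaDiM-style) is a single draw, while greedy k-center coreset selection (PatchCore-style) scans every embedding on each iteration. This is a minimal illustrative sketch, not the anomalib implementation; the function names are hypothetical:

```python
import numpy as np

def random_dim_select(features, k=550, seed=0):
    # PaDiM-style reduction: pick k feature dimensions once, uniformly at
    # random. Essentially free compared to the rest of training.
    rng = np.random.default_rng(seed)
    idx = rng.choice(features.shape[1], size=k, replace=False)
    return features[:, idx]

def kcenter_greedy(embeddings, ratio=0.10, seed=0):
    # PatchCore-style coreset: repeatedly add the embedding farthest from the
    # current coreset. Each of the ~N*ratio iterations computes distances to
    # all N embeddings, and this loop dominates training time for large N.
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    k = max(1, int(n * ratio))
    selected = [int(rng.integers(n))]
    min_dist = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return embeddings[selected]
```

So yes: beyond the bank-size difference, the greedy selection loop itself scales with the number of training patches, which is a plausible source of the 8-minute vs 2.5-hour gap.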
Validation/Test Behavior with Abnormal Data
As I understand it, training relies only on normal images, while both normal and abnormal images are used for validation and testing.
To explore this, I tested PaDiM with two different datamodules, created using the same training dataset but different abnormal datasets:
- Dataset_1: A square of 5 "dead" pixels (black ones) added to the images.
- Dataset_2: A square of 1 "dead" pixel added to the images.
The model trained with Dataset_1 performs better on its respective validation/test set. However, when I tested this model on Dataset_2, its performance was worse than the model trained with Dataset_2.
This seems counterintuitive since the training is done only on normal data, which is identical for both models. Could this behavior be due to differences in thresholds between the models, resulting in the observed performance discrepancies?
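The threshold explanation is plausible: if the threshold is chosen adaptively on the validation set (e.g. by maximizing F1, which I believe is anomalib's default behavior, though you should verify your configuration), then the two models end up with different thresholds even though their learned statistics are identical. A minimal sketch of such adaptive thresholding, with a hypothetical function name:

```python
import numpy as np

def best_f1_threshold(scores, labels):
    # Sweep every candidate threshold and keep the one maximizing F1 on the
    # validation set. A model validated on large (5-pixel) defects will pick
    # a higher threshold than one validated on subtle (1-pixel) defects,
    # even when the underlying anomaly scores are computed identically.
    best_t, best_f1 = None, -1.0
    for t in np.unique(scores):
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t
```

Under this view, the model "trained" with Dataset_1 is not actually a different model; it just carries a threshold calibrated for easier anomalies, which hurts it on the subtler Dataset_2. Comparing raw anomaly scores (or AUROC, which is threshold-free) across the two setups would confirm this.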
Thank you in advance for your insights!