A unified comparison framework for zero- and few-shot industrial image anomaly detection that enables systematic evaluation of state-of-the-art models across multiple industrial datasets. The project implements and compares two leading approaches: AnomalyDINO (few-shot anomaly detection built on the DINOv2 foundation model) and MuSc (zero-shot anomaly detection via mutual scoring of unlabeled images). It gives researchers and practitioners a standardized benchmark for industrial scenarios where labeled anomaly data is scarce or unavailable.
- Multiple Model Support: Implementations of AnomalyDINO and MuSc anomaly detection models
- Multiple Dataset Support: MVTec AD, MVTec LOCO AD, BTAD, and ViSA datasets
- Flexible Configuration: Hydra-based configuration system for easy experimentation
- MLflow Integration: Comprehensive experiment tracking and model management
- Various Backbones: Support for DINOv2, CLIP, and other vision transformer backbones
- Few-Shot Learning: Configurable few-shot learning scenarios (0, 1, 2, 4, 8, 16, full shots)
- Comprehensive Metrics: Image-level (AUROC, F1-Max, AP) and pixel-level (AUROC, F1-Max, AUPRO) evaluation metrics per dataset
- Visualization Tools: Built-in visualization utilities for results analysis
- Python 3.10+
- PyTorch with CUDA support
- FAISS (for efficient similarity search)
- MLflow (for experiment tracking)
- Hydra (for configuration management)
git clone https://github.com/your-username/industrial-image-anomaly-detection.git
cd industrial-image-anomaly-detection
conda env update --prefix ./.conda --file environment.yaml --prune
conda activate ./.conda

Download the required datasets:
- MVTec AD: Download from MVTec AD Website
- MVTec LOCO AD: Download from MVTec LOCO AD Website
- BTAD: Download from BTAD Repository
- ViSA: Download from ViSA Website
Update the dataset paths in the configuration files under conf/dataset/.
Run the main script with default configuration:
python main.py

You can override any configuration parameter:
# Change model and dataset
python main.py model=musc dataset=mvtec_ad
# Modify few-shot settings
python main.py shots=4 seed=42
# Enable/disable MLflow tracking
python main.py mlflow_enable=false

Available models:
- anomalydino: AnomalyDINO model with DINOv2 backbone
- musc: MuSc model with CLIP backbone

Available datasets:
- mvtec_ad: MVTec Anomaly Detection dataset
- mvtec_loco_ad: MVTec LOCO AD dataset (logical and structural)
- btad: BTAD dataset
- visa: ViSA dataset

Few-shot parameters:
- shots: Number of reference images (0, 1, 2, 4, 8, 16, or "full")
- seed: Random seed for reproducibility
- sampler_type: Sampling strategy ("musc" for random, "anomalydino" for sequential)
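To make the interaction between shots, seed, and sampler_type concrete, here is a minimal sketch of how reference images could be chosen for one run. The function `select_reference_images` is hypothetical and not taken from the repository, and reading "sequence" as "the first k images in sorted order" is an assumption.

```python
import random
from typing import List, Union


def select_reference_images(
    image_paths: List[str],
    shots: Union[int, str],
    seed: int = 0,
    sampler_type: str = "anomalydino",
) -> List[str]:
    """Pick the reference (normal) images for one few-shot run.

    Hypothetical helper: only the option names mirror the README.
    """
    if shots == "full":
        return list(image_paths)  # use every available normal image
    if not isinstance(shots, int) or shots < 0:
        raise ValueError(f"shots must be a non-negative int or 'full', got {shots!r}")
    if shots == 0:
        return []  # zero-shot: no reference images at all

    ordered = sorted(image_paths)
    k = min(shots, len(ordered))
    if sampler_type == "musc":
        # Random sampling; a local RNG keeps the draw reproducible per seed.
        return random.Random(seed).sample(ordered, k)
    if sampler_type == "anomalydino":
        # Sequential sampling (assumed meaning): first k images in sorted order.
        return ordered[:k]
    raise ValueError(f"unknown sampler_type: {sampler_type!r}")
```

With the random sampler, the same seed always yields the same reference set, which is what makes few-shot results comparable across runs.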
| Dataset | Categories | Image Types | Anomaly Types |
|---|---|---|---|
| MVTec AD | 15 categories | Industrial objects/textures | Defects, damages |
| MVTec LOCO AD | 5 categories | Industrial objects | Logical/structural anomalies |
| BTAD | 3 categories | Industrial products | Surface defects |
| ViSA | 12 categories | Industrial objects | Various anomalies |
AnomalyDINO:
- Backbone: DINOv2 Vision Transformer
- Method: Feature extraction + k-NN similarity search
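The "feature extraction + k-NN similarity search" idea can be sketched in a few lines. This is an illustration only, not the repository's code: it uses brute-force search in pure Python where the project lists FAISS, and the function names, the cosine distance, and the top-1% aggregation are assumptions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; zero-norm vectors count as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def patch_scores(test_patches: List[Vector], memory_bank: List[Vector]) -> List[float]:
    """Distance from each test patch to its nearest reference patch (k = 1)."""
    return [min(cosine_distance(p, m) for m in memory_bank) for p in test_patches]


def image_score(scores: List[float], top_fraction: float = 0.01) -> float:
    """Mean of the highest patch scores (top 1% is an assumed default)."""
    k = max(1, int(len(scores) * top_fraction))
    return sum(sorted(scores, reverse=True)[:k]) / k
```

Here `memory_bank` stands for the patch features of the few reference images. The per-patch scores double as a coarse anomaly map, and the aggregated value is the image-level score.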
MuSc:
- Backbone: CLIP Vision Transformer
- Method: Multi-scale patch feature aggregation with mutual scoring of unlabeled test images
- Components: LNAMD, MSM, RsCIN, MSM+
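As a rough illustration of the mutual scoring mechanism (MSM) among the components above: MuSc needs no reference images, because each unlabeled test image is scored against the other test images. The sketch below is a simplified reading of that idea, not the repository's implementation. LNAMD and RsCIN are left out, and the function name and the `keep_fraction` default are assumptions.

```python
import math
from typing import List, Sequence

Patch = Sequence[float]


def l2(a: Patch, b: Patch) -> float:
    """Euclidean distance between two patch feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mutual_scores(images: List[List[Patch]], keep_fraction: float = 0.3) -> List[List[float]]:
    """Score every patch of every image against all other unlabeled images."""
    if len(images) < 2:
        raise ValueError("mutual scoring needs at least two images")
    all_scores = []
    for i, patches in enumerate(images):
        others = [img for j, img in enumerate(images) if j != i]
        image_scores = []
        for p in patches:
            # Nearest-patch distance to each other image, smallest first.
            dists = sorted(min(l2(p, q) for q in other) for other in others)
            k = max(1, int(len(dists) * keep_fraction))
            # Averaging only the smallest distances keeps the score low
            # whenever some other images contain a similar patch.
            image_scores.append(sum(dists[:k]) / k)
        all_scores.append(image_scores)
    return all_scores
```

A patch that looks normal finds a close match in at least some other images and scores low, while a rare defect has no close match anywhere and scores high.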
├── conf/                 # Hydra configuration files
│   ├── config.yaml       # Main configuration
│   ├── dataset/          # Dataset configurations
│   └── model/            # Model configurations
├── datasets/             # Dataset implementations
├── metrics/              # Metrics implementations for each dataset
├── models/               # Model implementations
│   ├── anomalydino/      # AnomalyDINO implementation
│   ├── musc/             # MuSc implementation
│   └── backbone/         # Backbone implementations
├── notebooks/            # Jupyter notebooks for exploration and image creation
├── utils/                # Utility functions
└── main.py               # Main training/evaluation script
The project integrates with MLflow for comprehensive experiment tracking:
- Start MLflow server:
  mlflow server --host 0.0.0.0 --port 5000
- Access MLflow UI: Open http://localhost:5000 in your browser
- Configuration: Enable/disable MLflow tracking in conf/config.yaml:
  mlflow_enable: true
  mlflow_run_name: "experiment_name"

The project includes visualization tools for:
- Sample images with anomaly masks
- Model predictions vs ground truth
- Feature maps and attention visualizations
- Quantitative results plots
Enable visualization in the configuration:
visualize: true
num_samples: 5

The project includes multiple metrics implementations for each supported dataset or model in the metrics/ directory.
Each metrics file implements a compute_metrics() function that takes ground truth and prediction arrays and returns comprehensive evaluation metrics for both image-level and pixel-level anomaly detection performance.
To add a new dataset:
- Create a new dataset class in datasets/
- Implement the required dataset interface
- Add a configuration file in conf/dataset/
- Create the corresponding metrics implementation in metrics/
To add metrics for a new dataset:
- Create a metrics file in metrics/ following the pattern metrics_<dataset_name>.py
- Implement the compute_metrics(gt_sp, pr_sp, gt_px, pr_px) function
- Include metrics like: AUROC, F1-Max, AP (image-level) and AUROC, F1-Max, AUPRO (pixel-level)
- Reference existing implementations: metrics/anomalydino.py, metrics/musc.py, metrics/mvtec_ad.py
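For orientation, a stripped-down compute_metrics could look like the sketch below. It is a pure-Python illustration under several assumptions: gt_sp and pr_sp are per-image labels (0/1) and scores, gt_px and pr_px are lists of 2D masks and score maps, and the result keys are made up for this example. AP and AUPRO are omitted for brevity, and the existing files in metrics/ remain the reference for the real behaviour.

```python
from typing import Dict, List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average rank."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both normal and anomalous samples")
    rank_sum, i = 0.0, 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def f1_max(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Best F1 over all thresholds of the form `score >= t`."""
    order = sorted(zip(scores, labels), reverse=True)
    n_pos = sum(labels)
    best, tp = 0.0, 0
    for k, (score, lbl) in enumerate(order, start=1):
        tp += lbl
        if k < len(order) and order[k][0] == score:
            continue  # evaluate only at the end of a run of tied scores
        if tp:
            precision, recall = tp / k, tp / n_pos
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def _flatten(maps: List[List[List[float]]]) -> List[float]:
    """Concatenate all pixels of all 2D maps into one flat list."""
    return [v for m in maps for row in m for v in row]


def compute_metrics(gt_sp, pr_sp, gt_px, pr_px) -> Dict[str, float]:
    """Image-level and pixel-level AUROC and F1-Max (key names are hypothetical)."""
    gt_flat, pr_flat = _flatten(gt_px), _flatten(pr_px)
    return {
        "image_auroc": auroc(gt_sp, pr_sp),
        "image_f1_max": f1_max(gt_sp, pr_sp),
        "pixel_auroc": auroc(gt_flat, pr_flat),
        "pixel_f1_max": f1_max(gt_flat, pr_flat),
    }
```

In practice a library implementation such as scikit-learn's is likely the better choice for AUROC and AP; the hand-written versions here only keep the example free of dependencies.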
To add a new model:
- Create a new model class in models/your_model/
- Implement the required interface methods
- Add a configuration file in conf/model/
- Update the main script imports
To add a new backbone:
- Create a new backbone class in models/backbone/ inheriting from BaseBackbone
- Implement the required abstract methods:
  - load_pretrained_model(): Load the pretrained weights for the backbone
  - extract_features(images): Extract features from input images
- Consider creating model-specific variants (e.g., YourBackboneMuSc, YourBackboneAnomalyDINO)
- Register the backbone in backbone_factory.py in the appropriate factory functions
- Update model configurations to use the new backbone
- Test compatibility with existing models (AnomalyDINO, MuSc)
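The backbone steps above can be sketched as follows. Only the names BaseBackbone, load_pretrained_model, and extract_features come from this README. The constructor, the toy `MeanPixelBackbone`, and the registry helpers are hypothetical stand-ins, so the actual base class and the factory functions in backbone_factory.py may look different.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Sequence


class BaseBackbone(ABC):
    """Stand-in for the project's base class; the real one may differ."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.model = None  # populated by load_pretrained_model()

    @abstractmethod
    def load_pretrained_model(self) -> None:
        """Load the pretrained weights for the backbone."""

    @abstractmethod
    def extract_features(self, images: Sequence[Sequence[float]]) -> List[List[float]]:
        """Extract features from input images."""


class MeanPixelBackbone(BaseBackbone):
    """Toy backbone: one feature per image, the mean pixel value."""

    def load_pretrained_model(self) -> None:
        # A real backbone would load ViT weights here.
        self.model = lambda img: [sum(img) / len(img)]

    def extract_features(self, images):
        if self.model is None:
            self.load_pretrained_model()
        return [self.model(img) for img in images]


# Hypothetical registry standing in for the factory functions.
BACKBONES: Dict[str, Callable[[], BaseBackbone]] = {}


def register_backbone(name: str, factory: Callable[[], BaseBackbone]) -> None:
    """Make a backbone constructible by name."""
    BACKBONES[name] = factory


def create_backbone(name: str) -> BaseBackbone:
    """Build a registered backbone, failing loudly on unknown names."""
    if name not in BACKBONES:
        raise KeyError(f"unknown backbone {name!r}; known: {sorted(BACKBONES)}")
    return BACKBONES[name]()


register_backbone("mean_pixel", lambda: MeanPixelBackbone("mean_pixel"))
```

Because the abstract methods are enforced by ABC, a subclass that forgets one of them fails at instantiation time rather than in the middle of an evaluation run.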
This project is licensed under the MIT License - see the LICENSE file for details.
This project incorporates code from several research works:
- AnomalyDINO - Licensed under Apache 2.0
- MuSc - Licensed under MIT
- DINOv2 - Licensed under Apache 2.0
- OpenCLIP - Various licenses
- MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
- MVTec LOCO AD: A Dataset for Logical Constraints in Anomaly Detection
- BTAD: A Benchmark for Industrial Anomaly Detection
- ViSA: A Large-Scale Dataset for Visual Anomaly Detection
- AnomalyDINO: Boosting Patch-based Few-shot Anomaly Detection with DINOv2
- MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images