deeplearning-LI/CVGSSL

CVGSSL

The paper "CLIP-Vision Guided Few-Shot Metal Surface Defect Recognition" has been published in IEEE Transactions on Industrial Informatics.

🧠 CVGSSL Project

Deep-learning-based metal surface defect recognition (MSDR) is limited by the scarcity of expert-labeled data. In this study, we propose a CLIP-Vision Guided Self-Supervised Learning (CVGSSL) framework that learns representations from unlabeled data and then completes MSDR using only few-shot labeled data. The framework first generates rich and diverse representations through multiple CLIP-Vs to ensure effective SSL pre-training, then uses an MLP-Adapter to distill this knowledge and adapt the representations to recognition tasks. In addition, we construct a self-constrained loss to address the ambiguity between intra-class and inter-class distances, which otherwise leaves representations in an equivocal decision margin. After label-free pre-training with CVGSSL, the downstream model is fine-tuned for 1-shot to 4-shot defect recognition tasks.

🚀 Getting Started

⚠️ Note: Please download the CLIP checkpoint manually (e.g., RN50.pt) and place it in the ckp/ folder. Also create the required folders, such as weight/ and results/, beforehand; they are not generated automatically.
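The setup in the note can be scripted. The following is a minimal sketch, not part of the repository: the helper name `prepare_workspace` is hypothetical, and it assumes the folders live directly under the project root and that the checkpoint is named RN50.pt as in the example above.

```python
from pathlib import Path

# Folder names come from the note above; placing them under the
# project root is an assumption.
REQUIRED_DIRS = ("ckp", "weight", "results")


def prepare_workspace(root=".", checkpoint="RN50.pt"):
    """Create the expected folders and report whether the CLIP checkpoint exists.

    Hypothetical helper: it does not download anything, it only checks
    for ckp/<checkpoint> after creating the folders.
    """
    root = Path(root)
    for name in REQUIRED_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    ckpt_path = root / "ckp" / checkpoint
    return ckpt_path, ckpt_path.is_file()
```

Calling `prepare_workspace()` from the repository root returns the expected checkpoint path and a flag indicating whether the file is already there, so a missing checkpoint can be caught before training starts.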

1. Install Dependencies

pip install torch torchvision timm numpy
pip install git+https://github.com/openai/CLIP.git
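To confirm that the installation succeeded before launching a run, a quick check along these lines may help. It is a sketch under one assumption: the module names match the pip package names, with the CLIP repository installing a module called `clip`. The function name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names for the packages installed above. The CLIP repository
# installs a module named "clip" (assumed here).
MODULES = ("torch", "torchvision", "timm", "numpy", "clip")


def missing_modules(names=MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from `missing_modules()` suggests all dependencies are importable; any names it returns still need to be installed.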

2. Run the Training Script

python train.py 

3. Run the Finetune Script

python finetune_lincls.py

📊 Logging & Output

Training logs are saved automatically under logs/. Each run logs the per-epoch loss and accuracy for the training and evaluation phases (the evaluation metrics are reported as Val):

[Run 1] Epoch [1/100] Train Loss: 1.2593, Acc: 63.42% | Val Loss: 0.9341, Acc: 78.01%
...
[Run 1] Best Val Acc: 81.53%
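Log lines in this format can be parsed for plotting or for comparing runs. The sketch below assumes every epoch line follows exactly the layout of the sample above; the names `parse_epoch_line` and `best_epoch` are hypothetical and not part of the repository.

```python
import re

# Pattern written against the sample epoch line shown above (assumed stable).
EPOCH_RE = re.compile(
    r"\[Run (?P<run>\d+)\] Epoch \[(?P<epoch>\d+)/(?P<total>\d+)\] "
    r"Train Loss: (?P<train_loss>[\d.]+), Acc: (?P<train_acc>[\d.]+)% \| "
    r"Val Loss: (?P<val_loss>[\d.]+), Acc: (?P<val_acc>[\d.]+)%"
)


def parse_epoch_line(line):
    """Return a dict of metrics for one epoch line, or None if it does not match."""
    match = EPOCH_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "run": int(fields["run"]),
        "epoch": int(fields["epoch"]),
        "total_epochs": int(fields["total"]),
        "train_loss": float(fields["train_loss"]),
        "train_acc": float(fields["train_acc"]),
        "val_loss": float(fields["val_loss"]),
        "val_acc": float(fields["val_acc"]),
    }


def best_epoch(lines):
    """Return the parsed record with the highest validation accuracy, or None."""
    records = [r for r in map(parse_epoch_line, lines) if r is not None]
    return max(records, key=lambda r: r["val_acc"]) if records else None
```

Lines that do not match, such as the `Best Val Acc` summary, are skipped, so a whole log file can be passed to `best_epoch` line by line.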

📂 Dataset Format

Expected directory structure:

/data_root/
├── train/
│   ├── class1/
│   │   ├── img1.jpg
│   │   └── ...
│   ├── class2/
│   └── ...
└── test/
    ├── class1/
    ├── class2/
    └── ...
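This layout uses one subfolder per class, which is the same convention that torchvision's `ImageFolder` reads. For few-shot experiments, a k-shot subset can be drawn from train/ with the standard library alone. The sketch below is an illustration, not the repository's own sampling code: the function name `sample_k_shot`, the accepted file extensions, and the fixed seed are all assumptions.

```python
import random
from pathlib import Path

# Accepted image extensions (assumed; adjust to the dataset at hand).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def sample_k_shot(split_dir, k, seed=0):
    """Pick k images per class from a split laid out as <split_dir>/<class>/<image>.

    Returns a dict mapping each class name to a list of k image paths.
    Raises ValueError if a class holds fewer than k images.
    """
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    subset = {}
    class_dirs = sorted(p for p in Path(split_dir).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        images = sorted(
            f for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTS
        )
        if len(images) < k:
            raise ValueError(f"class {class_dir.name!r} has fewer than {k} images")
        subset[class_dir.name] = rng.sample(images, k)
    return subset
```

For example, `sample_k_shot("/data_root/train", k=4)` would return four image paths per class; repeating the call with the same seed yields the same subset, which keeps 1-shot to 4-shot comparisons repeatable.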

✅ Features

  • Support for ViT and ResNet backbones
  • Easy integration of pretrained models (e.g., CLIP, MoCo)
  • Few-shot training and evaluation
  • Separate training modes: linear probing and full finetuning
  • Configurable optimizer, learning rate, and more

📚 Dataset Sources

The datasets used in this project are derived from well-established benchmarks and studies in the field of industrial surface defect detection:

  1. A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects

  2. X-SDD: A new benchmark for hot rolled steel strip surface defects detection

  3. Deep metallic surface defect detection: The new benchmark and detection network

📜 Acknowledgements

This repository is inspired by and partially built upon:
