kaggle-vinbigdata-chest-xray-abnormalities-detection

Code for the 76th place solution in the Kaggle VinBigData Chest X-ray Abnormalities Detection competition.

Code Overview
Solution Summary
Read Solution Summary in other languages: English, 한국어

Code Overview

Prepare dataset

My solution uses a 1024px resized dataset. You have two options:

Download the original dataset (~192 GB) and create the resized dataset yourself.
Or download the ready-made public Kaggle dataset (3.59 GB).

Option 1. - Download original dataset and create 1024px resized dataset

Download the original competition data from the Kaggle competition page, then extract the zip file.

kaggle competitions download -c vinbigdata-chest-xray-abnormalities-detection
unzip vinbigdata-chest-xray-abnormalities-detection.zip -d dataset

Create the 1024px resized dataset. It will be saved to the "dataset-jpg" folder. This takes several hours.

python prepare_data.py --dataset_dir=dataset

Option 2. - Download kaggle public dataset already made

Alternatively, you can use the public Kaggle dataset VinBigData 1024 JPG Dataset.

kaggle datasets download -d sunghyunjun/vinbigdata-1024-jpg-dataset
unzip vinbigdata-1024-jpg-dataset.zip -d dataset-jpg

Train

Train classifier

python train.py \
--mode=classification \
--model_name="tf_efficientnet_b5" \
--default_root_dir=./checkpoint_path \
--gpus=1 \
--clf_image_size=600 \
--batch_size=12 \
--num_workers=4 \
--fold_splits=5 \
--fold_index=0 \
--max_epochs=50 \
--init_lr=0.75e-4 \
--weight_decay=1e-4 \
--precision=16

Train detector for 14-class

python train.py \
--mode=detection \
--model_name="tf_efficientdet_d5" \
--default_root_dir=./checkpoint_path \
--gpus=1 \
--detector_image_size=896 \
--batch_size=3 \
--num_workers=2 \
--fold_splits=5 \
--fold_index=0 \
--max_epochs=50 \
--init_lr=3e-4 \
--weight_decay=1e-3 \
--detector_bbox_filter=nms_v2 \
--detector_valid_bbox_filter \
--progress_bar_refresh_rate=30 \
--precision=16

Train detector for 15-class

python train.py \
--mode=detection_all \
--model_name="tf_efficientdet_d4" \
--default_root_dir=./checkpoint_path \
--gpus=1 \
--detector_image_size=896 \
--batch_size=4 \
--num_workers=2 \
--fold_splits=5 \
--fold_index=0 \
--max_epochs=50 \
--init_lr=4e-4 \
--weight_decay=1e-3 \
--detector_bbox_filter=nms_v2 \
--detector_valid_bbox_filter \
--precision=16

You can also pass PyTorch Lightning Trainer flags, for example:

--resume_from_checkpoint=./checkpoint_path/saved_checkpoint.ckpt

You can also use the Neptune logger.

--neptune_logger \
--neptune_project="your_project_name" \
--experiment_name="your_exp_name" \

There are also some flags for experiments.

--debug : Debug mode. Runs the script for only a few iterations.

--aspect_ratios_expand : Set config.aspect_ratios of efficientdet as follows.

config.aspect_ratios = [
    (1.0, 1.0),
    (1.4, 0.7),
    (0.7, 1.4),
    (1.8, 0.6),
    (0.6, 1.8),
]
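To make the effect of these pairs concrete, the following is a minimal sketch. It assumes each pair is a (width, height) multiplier applied to a square base anchor; the helper name `anchor_shapes` and the base size of 100 are hypothetical and chosen only for illustration.

```python
def anchor_shapes(base_size, aspect_ratios):
    """Return (width, height) for each aspect-ratio pair.

    Assumption: each pair is a (width, height) multiplier of a square base anchor.
    """
    return [(base_size * w, base_size * h) for w, h in aspect_ratios]


aspect_ratios = [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4), (1.8, 0.6), (0.6, 1.8)]
shapes = anchor_shapes(100, aspect_ratios)

for (rw, rh), (w, h) in zip(aspect_ratios, shapes):
    # Print the resulting box shape and its width-to-height ratio.
    print(f"ratio=({rw}, {rh}) -> {w:.0f} x {h:.0f}, w/h = {w / h:.2f}")
```

Under this assumption, the last two pairs add boxes with roughly 3:1 and 1:3 width-to-height ratios on top of the 1:1, 2:1, and 1:2 shapes.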

--clf_dataset : Select the dataset for the classifier, "vbd" or "concat". If you choose "concat", you must also set the --dataset_nih_dir flag.

--dataset_nih_dir : Path to the NIH dataset. You can download the NIH Chest X-rays 600 JPG Dataset as follows.

kaggle datasets download -d sunghyunjun/nih-chest-xrays-600-jpg-dataset
unzip -q nih-chest-xrays-600-jpg-dataset.zip -d dataset-nih

--evaluator_alt : Use ZFTurbo's mAP as an additional evaluator. ZFTurbo/Mean-Average-Precision-for-Boxes

--freeze_batch_norm

--group_norm

--accumulate_grad_batches

--downconv : Add a downconv module. With downconv, EfficientDet's input image size is half of --detector_image_size.

If you need a full-size JPG dataset, see the public Kaggle dataset VinBigData Original Image Dataset.

To experiment with downconv, run the following.

kaggle datasets download -d awsaf49/vinbigdata-original-image-dataset
unzip -q vinbigdata-original-image-dataset.zip -d dataset-jpg

mv dataset-jpg/vinbigdata/* dataset-jpg
rmdir dataset-jpg/vinbigdata

python train.py \
--mode=detection \
--model_name="tf_efficientdet_d1" \
--default_root_dir=./checkpoint_path \
--gpus=1 \
--detector_image_size=1280 \
--downconv \
--batch_size=4 \
--num_workers=2 \
--fold_splits=5 \
--fold_index=0 \
--max_epochs=50 \
--init_lr=4e-4 \
--weight_decay=1e-3 \
--detector_bbox_filter=nms_v2 \
--detector_valid_bbox_filter \
--precision=16

Submission Notebook

91th place VBD inference CLF DET and DET ALL

Solution Summary

The final result is an ensemble of a two-stage approach and a one-stage approach. The two-stage approach consists of a 2-class classifier and a 14-class detector. The one-stage approach is a 15-class detector.

Tools

Colab Pro, Tesla V100 16GB single GPU
GCS
Pytorch Lightning
Neptune
Kaggle API

Validation

StratifiedKFold was used to split the dataset into 5 folds.

The classifier of the two-stage approach was trained using all normal and abnormal images, and only abnormal images were used for the detector.

The one-stage detector was trained using all images, both normal and abnormal.
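The split and the per-model training subsets can be sketched as follows. This is a simplified round-robin stand-in for scikit-learn's StratifiedKFold, not the repository's code. It assumes the folds are stratified on a normal/abnormal image label, which is not stated above, and the label list is made up.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits=5):
    """Assign a fold index to each sample so every fold has a similar label mix.

    Simplified round-robin assignment without shuffling (an illustration only).
    """
    folds = [0] * len(labels)
    seen = defaultdict(int)  # how many samples of each label were placed so far
    for i, label in enumerate(labels):
        folds[i] = seen[label] % n_splits
        seen[label] += 1
    return folds


# Hypothetical image-level labels: 1 = abnormal, 0 = normal.
is_abnormal = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
folds = stratified_folds(is_abnormal, n_splits=5)
fold_index = 0  # the held-out validation fold

train_ids = [i for i, f in enumerate(folds) if f != fold_index]
classifier_train = train_ids                                 # normal + abnormal
detector14_train = [i for i in train_ids if is_abnormal[i]]  # abnormal only
detector15_train = train_ids                                 # normal + abnormal

print(len(classifier_train), len(detector14_train), len(detector15_train))
```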

Fusing BBoxes

Overlapping bboxes in both the train set and the validation set were fused using NMS. I tested torchvision's batched_nms and ZFTurbo's nms and wbf.
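As an illustration of fusing overlapping labels, here is a minimal greedy, class-wise NMS in plain Python. It is a sketch and not the torchvision or ZFTurbo implementation. The IoU threshold of 0.5, the equal scores of 1.0 for ground-truth boxes, and the sample boxes are all assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_per_class(boxes, scores, labels, iou_thr=0.5):
    """Greedy class-wise NMS. Returns the indices of the kept boxes."""
    # Descending score; ties are broken by original index so the order is fixed.
    order = sorted(range(len(boxes)), key=lambda i: (-scores[i], i))
    keep = []
    for i in order:
        # Keep box i unless a kept box of the same class overlaps it too much.
        if all(labels[i] != labels[k] or iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# Hypothetical annotations: boxes 0 and 1 mark nearly the same class-0 finding.
boxes = [(100, 100, 300, 300), (110, 105, 310, 295), (600, 600, 700, 700), (105, 100, 305, 300)]
labels = [0, 0, 3, 7]
scores = [1.0] * len(boxes)  # assumption: labels carry no confidence, so all scores tie

print(nms_per_class(boxes, scores, labels))  # [0, 2, 3]
```

Box 1 is suppressed by box 0 (same class, IoU about 0.86), while box 3 survives because it belongs to a different class.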

14-class Efficientdet d4 896px 30 epochs without classifier, local cv on positive image only

method	cv(mAP@iou=0.4)	public LB	private LB
torchvision batched_nms	0.4317	0.155	0.168
ZFTurbo nms	0.4419	0.164	0.181
ZFTurbo wbf	0.4158	0.157	0.185

I did not think the local CV scores were directly comparable, since each fusing method also changes the validation labels, and the LB was also somewhat difficult to use as a criterion for judgement.

After considering how the test set was labeled, I judged that NMS was closer to that labeling method, so I decided to use NMS.

[Discussion] nms > weighted box fusion?

torchvision's nms vs ZFTurbo's nms

torchvision's nms has a problem with the order of results when bbox scores are equal.

[Stackoverflow] Pytorch argsort ordered, with duplicate elements in the tensor

torchvision.ops.nms

If multiple boxes have the exact same score and satisfy the IoU criterion with respect to a reference box, the selected box is not guaranteed to be the same between CPU and GPU. This is similar to the behavior of argsort in PyTorch when repeated values are present.

This does not matter if you convert the labels once up front and then save and load them as csv, pickle, etc. But if you convert the labels again in a new environment, the bboxes may not stay consistent.

Because this would interfere with experiment flexibility and consistency during training, I decided to use ZFTurbo's nms, which uses numpy.argsort.
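For comparison, the "convert once, then save and load" alternative mentioned above could look like the sketch below. This is not what the repository does. The helper `load_or_build_labels`, the placeholder `fuse_labels`, the column names, and the file name are hypothetical.

```python
import csv
import os
import tempfile

FIELDS = ["image_id", "class_id", "x_min", "y_min", "x_max", "y_max"]


def load_or_build_labels(cache_path, build_fn):
    """Return fused label rows, building and caching them on first use."""
    if not os.path.exists(cache_path):
        # First run: fuse the labels once and write them to disk.
        with open(cache_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(build_fn())
    # Every run reads the same file, so all environments see identical boxes.
    with open(cache_path, newline="") as f:
        return [
            {k: (v if k == "image_id" else int(v)) for k, v in row.items()}
            for row in csv.DictReader(f)
        ]


def fuse_labels():
    # Placeholder for the real, possibly environment-dependent fusing step.
    return [{"image_id": "img_a", "class_id": 0, "x_min": 100, "y_min": 100, "x_max": 300, "y_max": 300}]


cache_path = os.path.join(tempfile.mkdtemp(), "fused_labels.csv")
first = load_or_build_labels(cache_path, fuse_labels)   # builds and caches
second = load_or_build_labels(cache_path, fuse_labels)  # reads the cached file
print(first == second)
```

The cost of this approach is that any change to the fusing settings requires rebuilding the cache.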

[GitHub] ZFTurbo/Weighted boxes fusion

Model training

All models were trained on Colab Pro's V100 16GB single GPU.

AdamW
CosineAnnealingLR
epochs = 50
checkpoint selection : highest mAP among the 3 checkpoints with the lowest val_loss
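The checkpoint selection rule can be sketched as below. The function name `select_checkpoint`, the dictionary keys, and the metric values are made up for illustration and are not taken from the training code.

```python
def select_checkpoint(checkpoints, top_k=3):
    """Pick the checkpoint with the highest mAP among the top_k lowest val_loss."""
    by_loss = sorted(checkpoints, key=lambda c: c["val_loss"])[:top_k]
    return max(by_loss, key=lambda c: c["mAP"])


# Hypothetical per-epoch metrics; file names and numbers are invented.
checkpoints = [
    {"path": "epoch=31.ckpt", "val_loss": 0.412, "mAP": 0.441},
    {"path": "epoch=38.ckpt", "val_loss": 0.405, "mAP": 0.452},
    {"path": "epoch=44.ckpt", "val_loss": 0.409, "mAP": 0.458},
    {"path": "epoch=49.ckpt", "val_loss": 0.431, "mAP": 0.463},
]

best = select_checkpoint(checkpoints)
print(best["path"])  # epoch=44.ckpt
```

In this example the last epoch has the highest mAP overall, but it is skipped because its val_loss is not among the three lowest.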

Two-stage approach

2-class classifier : EfficientNet, Resnet200d, total 15 models

model	image size(px)	folds	batch size	init lr	weight decay	val_acc	auc
b5	600	5 of 5	12	7.5e-5	1.0e-4	0.9601	0.9930
b6	528	5 of 5	12	7.5e-5	1.0e-3	0.9553	0.9927
resnet200d	600	3 of 5	12	7.5e-5	1.0e-4	0.9541	0.9934
b5	456	single	16	1.0e-4	1.0e-4	0.9557	0.9927
b5	1024	single	4	2.5e-5	1.0e-4	0.9577	0.9936

14-class detector : EfficientDet with 2-class classifier, total 18 models, local cv on positive image only

model	image size(px)	folds	batch size	init lr	weight decay	cv(mAP@iou=0.4)	public LB	private LB
d3	1024	single	3	3e-4	1e-3	0.4545	0.209	0.250
d4	896	5 of 5	4	4e-4	1e-4	0.4541	0.218	0.250
d4	896	single	4	4e-4	1e-3	0.4606	0.257	0.247
d4	1024	single	3	3e-4	1e-3	0.4545	0.228	0.249
d5	768	5 of 5	4	4e-4	1e-3	0.4472	0.225	0.253
d5	896	4 of 5	3	3e-4	1e-3	0.4522	0.214	0.250
d5	1024	single	2	2e-4	1e-3	0.4462	0.214	0.232

One-stage approach

15-class detector, total 2 models, local cv on positive image only

model	image size(px)	folds	batch size	init lr	weight decay	cv(mAP@iou=0.4)	public LB	private LB
d4	896	2 of 5	4	4e-4	1e-3	0.4546	0.230	0.246

With a batch size below 4, the mAP was poor. Larger image sizes gave better mAP, but further training at larger sizes was not possible.

I tried Freeze BatchNorm, accumulate grad batches, and GroupNorm, but none of them gave better results.

I also tested Downconv. The AP of small objects such as Calcification increased, but the AP of ILD and large objects decreased, so the overall mAP did not improve.

[RANZCR CLiP] 11th Place Solution - Utilizing High resolution, Annotations, and Unlabeled data

Augmentation

Resize, scale, and crop were configured following the method in the EfficientDet paper.

CLAHE, Equalize, InvertImg, HueSaturationValue, RandomGamma, and ShiftScaleRotate did not work.

import albumentations as A
import cv2
from albumentations.pytorch import ToTensorV2

# self.resize_height and self.resize_width are attributes of the enclosing class.
A.Compose(
[
    # Resize to the target size, then jitter the scale, pad, and crop back.
    A.Resize(height=self.resize_height, width=self.resize_width),
    A.RandomScale(scale_limit=(-0.9, 1.0), p=1.0),
    A.PadIfNeeded(
        min_height=self.resize_height,
        min_width=self.resize_width,
        border_mode=cv2.BORDER_CONSTANT,
        value=0,
        p=1.0,
    ),
    A.RandomCrop(height=self.resize_height, width=self.resize_width, p=1.0),
    # Photometric augmentations.
    A.RandomBrightnessContrast(p=0.8),
    A.ChannelDropout(p=0.5),
    A.OneOf(
        [
            A.MotionBlur(p=0.5),
            A.MedianBlur(p=0.5),
            A.GaussianBlur(p=0.5),
            A.GaussNoise(p=0.5),
        ],
        p=0.5,
    ),
    A.HorizontalFlip(p=0.5),
    A.Normalize(),
    ToTensorV2(),
],
)
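In albumentations, a tuple scale_limit=(low, high) for RandomScale samples the scale factor from (1 + low, 1 + high), so (-0.9, 1.0) corresponds to a factor between 0.1 and 2.0. The sketch below only traces the image side length through the resize, scale, pad, and crop steps. It assumes a square target of 896px, uses approximate rounding, and the helper `jittered_size` is hypothetical.

```python
import random


def jittered_size(target=896, scale_limit=(-0.9, 1.0), rng=random):
    """Trace the image side length through resize -> random scale -> pad -> crop."""
    scale = 1.0 + rng.uniform(*scale_limit)  # scale factor in [0.1, 2.0]
    scaled = max(1, round(target * scale))   # side length after RandomScale (approximate)
    padded = max(scaled, target)             # PadIfNeeded only enlarges small images
    cropped = target                         # RandomCrop returns the target size
    return scale, scaled, padded, cropped


rng = random.Random(0)  # fixed seed so the trace is repeatable
for _ in range(3):
    scale, scaled, padded, cropped = jittered_size(rng=rng)
    print(f"scale={scale:.2f} scaled={scaled} padded={padded} cropped={cropped}")
```

Whatever scale is drawn, the output returns to the target size: small images are zero-padded before the crop, and large images are cropped to a random window.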

Post processing

In the two-stage approach, it was difficult to find an appropriate threshold for the classifier. The threshold was determined to be a 60~70% normal case. If an image was judged normal, all of the detector's detections for that image were excluded.

In the one-stage approach, detections were likewise excluded if the image was judged normal based on the threshold.
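One possible reading of this filtering step is sketched below. It assumes the threshold is compared against a per-image probability of being normal, which is an interpretation and not confirmed above. The function name, the data layout, and the sample values are hypothetical.

```python
def filter_by_normal_prob(detections, normal_prob, normal_thr=0.7):
    """Drop all detections of images whose normal probability reaches the threshold.

    detections: {image_id: [(class_id, score, x_min, y_min, x_max, y_max), ...]}
    normal_prob: {image_id: probability that the image is normal}
    """
    return {
        image_id: ([] if normal_prob[image_id] >= normal_thr else boxes)
        for image_id, boxes in detections.items()
    }


# Hypothetical predictions for two images.
detections = {
    "img_a": [(0, 0.90, 10, 20, 200, 220)],
    "img_b": [(3, 0.40, 50, 60, 150, 160)],
}
normal_prob = {"img_a": 0.20, "img_b": 0.85}

filtered = filter_by_normal_prob(detections, normal_prob, normal_thr=0.7)
print(filtered)
```

Here img_b is treated as normal, so its detection is removed, while img_a keeps its box.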

Blending

The results of the two-stage approach and the one-stage approach were blended using NMS. The final results are as follows.

two-stage approach, 15 classifier + 18 detector

normal thr	public LB	private LB	note
0.70	0.219	0.246	final submission
0.65	0.217	0.255
0.60	0.215	0.256

one-stage approach, 2 detector

normal thr	public LB	private LB
0.10	0.198	0.245
0.00	0.198	0.245

two-stage + one-stage

thr	public LB	private LB	note
0.70-0.30	0.224	0.253	final submission
0.65-0.30	0.222	0.259
0.60-0.30	0.220	0.258

What did not work

Freeze BatchNorm

accumulate grad batches

GroupNorm, group per channel = 8

GroupNorm, pretrained backbone with GroupNorm

Downconv

NIH dataset concat classifier

change aspect ratio
