Silver medal solution (Public LB: 0.595, Private LB: 0.523)

Overview

Data

Augmentation

  • Resize, Rotate, RandomRotate90, HorizontalFlip, RandomBrightnessContrast, Normalize

Model design

  • Backbone: ResNet50 pretrained on ImageNet
  • Head: 2 linear layers with batch normalization and dropout

Loss

  • Binary Cross Entropy Loss
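As a rough illustration of the loss named above, here is a minimal NumPy sketch of binary cross entropy computed from raw logits. It assumes a multi-label setup in which every class is an independent yes/no decision; the function name `bce_with_logits` and the array shapes are illustrative and are not taken from the repository.

```python
import numpy as np


def bce_with_logits(logits, targets):
    """Mean binary cross entropy over all (sample, class) pairs.

    Uses the numerically stable form
        max(x, 0) - x * y + log(1 + exp(-|x|))
    which equals -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))].
    """
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    loss = (
        np.maximum(logits, 0.0)
        - logits * targets
        + np.log1p(np.exp(-np.abs(logits)))
    )
    return float(loss.mean())


# Hypothetical batch: 2 samples, 3 classes (shapes chosen for illustration only).
example_logits = np.array([[2.0, -1.0, 0.0], [-3.0, 4.0, 0.5]])
example_targets = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
example_loss = bce_with_logits(example_logits, example_targets)
```

In a PyTorch training loop the same quantity would normally come from a built-in loss such as `torch.nn.BCEWithLogitsLoss`; the NumPy version is only meant to make the formula explicit.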

Training

  • 5-fold CV
  • Optimizer: Adam
  • Different learning rates for different layers
  • Head fine-tuning with frozen backbone (1 epoch)
  • Scheduler: Cyclical Learning Rates
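The cyclical learning rate schedule mentioned above can be sketched in plain Python. This is the basic triangular policy, in which the rate climbs linearly from a lower bound to an upper bound and back again; whether the repository uses this exact variant is an assumption, and `base_lr`, `max_lr`, and `step_size` below are hypothetical values rather than the ones used in training.

```python
import math


def triangular_lr(step, base_lr, max_lr, step_size):
    """Learning rate at a given training step under the triangular policy.

    One full cycle lasts 2 * step_size steps: the rate rises linearly from
    base_lr to max_lr during the first half and falls back during the second.
    """
    cycle = math.floor(1 + step / (2 * step_size))
    # Distance from the peak of the current cycle, scaled to [0, 1].
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Hypothetical settings, for illustration only.
schedule = [
    triangular_lr(step, base_lr=1e-4, max_lr=1e-3, step_size=100)
    for step in range(0, 401, 50)
]
```

With these settings the rate starts at `base_lr`, peaks at step 100, returns to `base_lr` at step 200, and then repeats. When different layers use different learning rates, one option is to apply the same schedule shape to each parameter group with its own bounds.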

Stage 1:

  • Image size: 256
  • Batch size: 128
  • Epochs: 16

Stage 2:

  • Image size: 512
  • Batch size: 32
  • Epochs: 6

Prediction

  • TTA: 8
  • TTA augmentation: Resize, Rotate, RandomRotate90, HorizontalFlip, Normalize
  • Aggregation: mean of the TTA predictions
  • Threshold: 0.2
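The prediction steps above (several augmented views, mean of the predictions, then a 0.2 threshold) can be sketched as follows. Note one deliberate simplification: instead of the random transforms listed, this sketch uses the eight deterministic combinations of 90-degree rotations and horizontal flips. The names `tta_views`, `predict_with_tta`, and `predict_fn` are hypothetical; `predict_fn` stands in for any model that maps an image to per-class probabilities.

```python
import numpy as np


def tta_views(image):
    """Return 8 views of an (H, W, C) image: 4 rotations, each plain and flipped."""
    views = []
    for k in range(4):
        rotated = np.rot90(image, k, axes=(0, 1))
        views.append(rotated)
        views.append(np.flip(rotated, axis=1))  # horizontal flip
    return views


def predict_with_tta(predict_fn, image, threshold=0.2):
    """Average per-class probabilities over all views, then threshold them."""
    probs = np.stack([predict_fn(view) for view in tta_views(image)])
    mean_probs = probs.mean(axis=0)
    return mean_probs, mean_probs >= threshold


# Usage with a stand-in model that ignores its input (illustration only).
def dummy_predict(view):
    return np.array([0.10, 0.30, 0.25])


image = np.arange(2 * 2 * 3, dtype=np.float32).reshape(2, 2, 3)
mean_probs, labels = predict_with_tta(dummy_predict, image)
```

The same averaging step extends naturally to the fold models: the per-fold probabilities can be averaged before the threshold is applied.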

Result

  • Training takes ~35 hours on a Tesla V100
  • Public LB: 0.595
  • Private LB: 0.523

Observations

  • Mixed precision works poorly
  • External data helps a lot
  • BCE Loss with oversampling is much better than Focal Loss
  • ResNet50 outperforms ResNet18 and ResNet34
  • 5 folds improve score by 0.024
  • TTA helps too

Installation

First, clone the repository

git clone https://github.com/rebryk/kaggle.git
cd kaggle/human-protein

Second, install requirements

pip install -r requirements.txt

Third, install apex

git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext

Finally, update the config files in the configs folder to match your setup.

External data

Use scripts/external_data.py and scripts/convert_data.py to download and convert external data.

Training

# Stage 1
cp configs/train256.py config.py
python train.py

# Stage 2
cp configs/train512.py config.py
python train.py

Prediction

cp configs/test.py config.py
python test.py

Submissions are saved in the submissions folder.