- Channels: RGB
- Oversampling
- External data: http://v18.proteinatlas.org
- Resize, Rotate, RandomRotate90, HorizontalFlip, RandomBrightnessContrast, Normalize
- Backbone: Resnet50 pretrained on ImageNet
- Head: 2 linear layers with batch normalization and dropout
- Binary Cross Entropy Loss
- 5-fold CV
- Optimizer: Adam
- Different learning rates for different layers
- Head fine-tuning with frozen backbone (1 epoch)
- Scheduler: Cyclical Learning Rates
Stage 1:
- Image size: 256
- Batch size: 128
- Epochs: 16
Stage 2:
- Image size: 512
- Batch size: 32
- Epochs: 6
- TTA: 8
- TTA augmentation: Resize, Rotate, RandomRotate90, HorizontalFlip, Normalize
- The mean of the TTA predictions is used
- Threshold: 0.2
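The inference bullets above reduce to a simple averaging-and-thresholding step. A minimal NumPy sketch (the array shapes and the random probabilities are hypothetical; in the real pipeline they would be sigmoid outputs from the model):

```python
import numpy as np

# Hypothetical sigmoid outputs for 8 TTA variants of each image:
# shape (n_tta, n_images, n_classes)
rng = np.random.default_rng(0)
tta_probs = rng.random((8, 4, 28))

# The mean of the 8 TTA predictions gives the final per-class probability
mean_probs = tta_probs.mean(axis=0)   # shape (4, 28)

# Threshold: 0.2 — a class is predicted when its mean probability exceeds it
predicted = mean_probs > 0.2          # boolean matrix (4, 28)
```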
- Training takes ~35 hours on a Tesla V100
- Public LB: 0.595
- Private LB: 0.523
- Mixed precision works poorly
- External data helps a lot
- BCE Loss with oversampling is much better than Focal Loss
- Resnet50 outperforms Resnet18 and Resnet34
- 5 folds improve the score by 0.024
- TTA helps too
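One way to implement the oversampling mentioned above is to weight each image by its rarest class, so that samples containing rare proteins are drawn more often. This weighting scheme is an assumption for illustration, not necessarily the one used in the repository:

```python
import numpy as np

# Toy multi-label target matrix: rows are samples, columns are classes
labels = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],   # the only sample with the rare third class
])

# Inverse class frequency: rarer classes get larger weights
class_freq = labels.sum(axis=0)       # occurrences per class
class_weight = 1.0 / class_freq

# A sample's weight is driven by its rarest class
sample_weight = (labels * class_weight).max(axis=1)
```

In PyTorch, `sample_weight` would be passed to `torch.utils.data.WeightedRandomSampler` so the DataLoader oversamples the rare-class images during training.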
First, clone the repository:
git clone https://github.com/rebryk/kaggle.git
cd kaggle/human-protein
Second, install the requirements:
pip install -r requirements.txt
Third, install apex:
git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext
And last but not least, update config files in the configs folder to match your preferences!
Use scripts/external_data.py and scripts/convert_data.py to download and convert external data.
# Stage 1
cp configs/train256.py config.py
python train.py
# Stage 2
cp configs/train512.py config.py
python train.py
cp configs/test.py config.py
python test.py
Submissions are saved in the submissions folder.