- Python 3.7.3
- PyTorch 1.5.1
- torchvision 0.6.1
- ml_collections
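Given the pinned versions above, a matching `requirements.txt` would look roughly like this (the exact file ships with the repository; this is only an illustration):

```
torch==1.5.1
torchvision==0.6.1
ml_collections
```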
Get pretrained models at this link: ViT-B_16, ViT-B_32...

```bash
wget https://storage.googleapis.com/vit_models/imagenet21k/{MODEL_NAME}.npz
```

In the paper, we use data from 5 publicly available datasets:
Please download them from the official websites and put them in the corresponding folders.
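For example, with CUB-200-2011 (the dataset used in the training command below), the layout under the data root would be along these lines (a sketch only; check the repository's dataset loader for the exact structure it expects):

```
data_root/
└── CUB_200_2011/   # folder name matching the --dataset flag
    ├── images/     # images from the official release
    └── ...         # split/annotation files from the official release
```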
Install dependencies with the following command:

```bash
pip3 install -r requirements.txt
```

To train CoAT on the CUB-200-2011 dataset with 4 GPUs in FP-16 mode for 10000 steps, run:
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 nohup python3 -m torch.distributed.launch --nproc_per_node=4 train_coattention.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --data_root {DATA_ROOT} --pretrained_dir ViT-B_16.npz --fp16 --name sample_run > test.log &
```
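The flags in the command above map onto a command-line interface roughly like the following sketch; the flag names are taken from the command itself, but the defaults, types, and help strings here are assumptions rather than the repository's actual parser:

```python
import argparse

def build_parser():
    # Sketch of the flags train_coattention.py is invoked with above;
    # defaults and help text are illustrative assumptions.
    p = argparse.ArgumentParser(description="CoAT training (sketch)")
    p.add_argument("--dataset", default="CUB_200_2011",
                   help="dataset folder name under --data_root")
    p.add_argument("--split", default="overlap")
    p.add_argument("--num_steps", type=int, default=10000)
    p.add_argument("--data_root", default="./data")
    p.add_argument("--pretrained_dir", default="ViT-B_16.npz",
                   help="path to the pretrained ViT .npz checkpoint")
    p.add_argument("--fp16", action="store_true",
                   help="enable mixed-precision training")
    p.add_argument("--name", default="sample_run",
                   help="name of this run, used for logs/checkpoints")
    p.add_argument("--local_rank", type=int, default=0,
                   help="set automatically by torch.distributed.launch")
    return p

args = build_parser().parse_args(
    ["--dataset", "CUB_200_2011", "--split", "overlap",
     "--num_steps", "10000", "--fp16", "--name", "sample_run"]
)
print(args.dataset, args.num_steps, args.fp16)  # → CUB_200_2011 10000 True
```

Note that `torch.distributed.launch` spawns one process per GPU and passes `--local_rank` to each, which is why the script must accept that flag.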
### Acknowledgement
Many thanks to [ViT-pytorch](https://github.com/jeonsworld/ViT-pytorch) for the PyTorch reimplementation of [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929).