Commit 1fc341a

Update XCiT doc (#109)
* Update README.md
* Update Classification_Models_Guide.md
1 parent e5d30db commit 1fc341a

2 files changed: +33 -29 lines changed
configs/xcit/README.md

Lines changed: 18 additions & 15 deletions
@@ -8,6 +8,8 @@ Following tremendous success in natural language processing, transformers have r
 
 ## Getting Started
 
+An [AI Studio](https://aistudio.baidu.com/aistudio/index) project about XCiT has been published, and you can click [here](https://aistudio.baidu.com/aistudio/projectdetail/3449604) to open the project and run commands of training and evaluation directly.
+
 #### Train with single gpu
 ```bash
 python tools/train.py -c configs/xcit/${XCIT_ARCH}.yaml
@@ -25,7 +27,7 @@ python tools/train.py -c configs/xcit/${XCIT_ARCH}.yaml --load ${XCIT_WEGHT_FILE
 
 #### Knowledge distillation
 
-For knowledge distillation, you only need to replace `${XCIT_ARCH}.yaml` to corresponding distillation config file, `${XCIT_ARCH}_dist.yaml`, at above commands. We provide pretrained weights of Teacher model `RegNetY_160`, which can be downloaded [here](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/regnety_160.pdparams).
+For knowledge distillation, you only need to replace `${XCIT_ARCH}.yaml` to corresponding distillation config file, `${XCIT_ARCH}_dist.yaml`, at above commands. We provide pretrained weights of Teacher model `RegNetY_160`, which can be downloaded [here](https://passl.bj.bcebos.com/vision_transformers/xcit/regnety_160.pdparams).
 
 Checkpoints saved in distillation training include both Teacher's and Student's weights. You can extract the weights of Student by following command.
 ```bash
@@ -38,20 +40,21 @@ python tools/extract_weight.py ${DISTILLATION_WEIGHTS_FILE} --prefix Student --r
 The results are evaluated on ImageNet2012 validation set
 | Arch | Weight | Top-1 Acc | Top-5 Acc | Crop ratio | # Params |
 | ------------------ | ------------------------------------------------------------ | --------- | --------- | ---------- | -------- |
-| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
-| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
-| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
-| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
-| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
-| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
-| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
-| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
-| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
-| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
-| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
-| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
-| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
-| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |
+| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
+| xcit_nano_12_p8_224_dist | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224_dist.pdparams) | 77.28 | 93.25 | 1.0 | 3.05M |
+| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
+| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
+| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
+| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
+| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
+| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
+| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
+| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
+| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
+| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
+| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
+| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
+| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |
 
 
 ## Usage

docs/Classification_Models_Guide.md

Lines changed: 15 additions & 14 deletions
@@ -42,20 +42,21 @@ PASSL provides developers with a number of implementations of Transformer classi
 | beit_large_p16_512 | [ft 22k to 1k](https://passl.bj.bcebos.com/vision_transformers/beit/beit_large_p16_512_ft.pdparams) | 88.60 | 98.66 | 1.0 | 304M |
 | mlp_mixer_b16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/mlp_mixer/mlp-mixer_b16_224.pdparams) | 76.60 | 92.23 | 0.875 | 60.0M |
 | mlp_mixer_l16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/mlp_mixer/mlp-mixer_l16_224.pdparams) | 72.06 | 87.67 | 0.875 | 208.2M |
-| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
-| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
-| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
-| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
-| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
-| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
-| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
-| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
-| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
-| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
-| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
-| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
-| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
-| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/pvt_v2/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |
+| xcit_nano_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224.pdparams) | 73.90 | 92.13 | 1.0 | 3.05M |
+| xcit_nano_12_p8_224_dist | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p8_224_dist.pdparams) | 77.28 | 93.25 | 1.0 | 3.05M |
+| xcit_tiny_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p8_224.pdparams) | 79.68 | 95.04 | 1.0 | 6.71M |
+| xcit_tiny_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p8_224.pdparams) | 81.87 | 95.97 | 1.0 | 12.11M |
+| xcit_small_12_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p8_224.pdparams) | 83.36 | 96.51 | 1.0 | 26.21M |
+| xcit_small_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p8_224.pdparams) | 83.82 | 96.65 | 1.0 | 47.63M |
+| xcit_medium_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p8_224.pdparams ) | 83.73 | 96.39 | 1.0 | 84.32M |
+| xcit_large_24_p8_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p8_224.pdparams) | 84.42 | 96.65 | 1.0 | 188.93M |
+| xcit_nano_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_nano_12_p16_224.pdparams) | 70.01 | 89.82 | 1.0 | 3.05M |
+| xcit_tiny_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_12_p16_224.pdparams) | 77.15 | 93.72 | 1.0 | 6.72M |
+| xcit_tiny_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_tiny_24_p16_224.pdparams) | 79.42 | 94.86 | 1.0 | 12.12M |
+| xcit_small_12_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_12_p16_224.pdparams) | 81.89 | 95.83 | 1.0 | 26.25M |
+| xcit_small_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_small_24_p16_224.pdparams) | 82.51 | 95.97 | 1.0 | 47.67M |
+| xcit_medium_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_medium_24_p16_224.pdparams) | 82.67 | 95.91 | 1.0 | 84.40M |
+| xcit_large_24_p16_224 | [pretrain 1k](https://passl.bj.bcebos.com/vision_transformers/xcit/xcit_large_24_p16_224.pdparams) | 82.89 | 95.89 | 1.0 | 189.10M |
 
 The above metrics were tested on the ImageNet 2012 dataset.
 
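Both files change in the same way: every XCiT weight link, plus the `RegNetY_160` teacher link, moves from the `pvt_v2/` directory to `xcit/`. The sketch below shows one way a script holding stale links might be updated. The helper names `xcit_weight_url` and `migrate_url` are hypothetical and are not part of PASSL; the URL layout is inferred only from the links in this diff, and the code does not check that a given file exists on the server.

```python
# Base prefix copied from the updated links in this commit.
WEIGHTS_BASE = "https://passl.bj.bcebos.com/vision_transformers/xcit"

# Teacher checkpoint that this commit also moves to the xcit/ directory.
TEACHER_FILE = "regnety_160.pdparams"


def xcit_weight_url(arch: str, distilled: bool = False) -> str:
    """Hypothetical helper: build a checkpoint URL for an arch name.

    Only xcit_nano_12_p8_224 is listed with a `_dist` checkpoint in the
    tables above; other distilled URLs are not confirmed by the source.
    """
    suffix = "_dist" if distilled else ""
    return f"{WEIGHTS_BASE}/{arch}{suffix}.pdparams"


def migrate_url(old_url: str) -> str:
    """Hypothetical helper: rewrite a stale pvt_v2/ link to xcit/.

    Links to files other than XCiT weights and the RegNetY_160 teacher
    are returned unchanged.
    """
    filename = old_url.rsplit("/", 1)[-1]
    moved = filename.startswith("xcit_") or filename == TEACHER_FILE
    if "/vision_transformers/pvt_v2/" in old_url and moved:
        return f"{WEIGHTS_BASE}/{filename}"
    return old_url
```

For example, `migrate_url` applied to the old `pvt_v2/xcit_tiny_12_p8_224.pdparams` link yields the corresponding `xcit/` link shown in the added rows, while a `beit/` or `mlp_mixer/` link passes through untouched.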