Trained weights of 83M_1x8_384: [here]().
Pretrain:

``` shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --standalone --nproc_per_node=8 \
    main_pretrain.py \
    --batch_size 256 \
    --blr 1.5e-4 \
    --warmup_epochs 20 \
    --epochs 200 \
    --model spikmae_12_512 \
    --mask_ratio 0.50 \
    --data_path ../imagenet1-k
```
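Note that `--blr` is a *base* learning rate, not the absolute one: in MAE-style training scripts the absolute rate is scaled by the effective batch size (per-GPU batch × GPUs × gradient-accumulation steps). This scaling rule is an assumption about this repo's script, following the MAE reference convention; a minimal sketch:

``` python
def absolute_lr(blr: float, batch_size: int, num_gpus: int, accum_iter: int = 1) -> float:
    """Scale the base LR by the effective batch size (MAE convention, assumed here)."""
    eff_batch = batch_size * num_gpus * accum_iter
    return blr * eff_batch / 256

# For the pretrain command above: 256 per GPU x 8 GPUs = 2048 effective.
print(absolute_lr(1.5e-4, 256, 8))  # 1.5e-4 * 2048 / 256 = 0.0012
```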

Finetune:

``` shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --standalone --nproc_per_node=8 \
    main_finetune.py \
    --batch_size 128 \
    --blr 6e-4 \
    --warmup_epochs 10 \
    --layer_decay 0.75 \
    --finetune ../pretrain_checkpoint.pth \
    --epochs 150 \
    --drop_path 0.1 \
    --model spikformer_12_768 \
    --data_path ../imagenet1-k \
    --output_dir ../outputs/test \
    --log_dir ../outputs/test \
    --reprob 0.25 \
    --mixup 0.8 \
    --cutmix 1.0 \
    --dist_eval
```
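`--layer_decay 0.75` applies layer-wise learning-rate decay: earlier transformer blocks receive exponentially smaller learning rates than later ones. A sketch of the per-layer multiplier, assuming the BEiT/MAE convention (`scale = decay ** (num_layers + 1 - layer_id)`, with the patch embedding at id 0 and the head at id `num_layers + 1`) — check the script's `param_groups_lrd` equivalent for the exact rule:

``` python
def layer_lr_scales(num_layers: int, decay: float) -> list[float]:
    """Per-layer LR multipliers: deeper layers keep more of the base LR."""
    # id 0 = patch embedding, ids 1..num_layers = blocks, id num_layers + 1 = head.
    return [decay ** (num_layers + 1 - i) for i in range(num_layers + 2)]

scales = layer_lr_scales(12, 0.75)
# Embedding gets ~0.0238x the base LR, the last block 0.75x, the head 1.0x.
print(f"embedding: {scales[0]:.4f}, last block: {scales[-2]:.4f}, head: {scales[-1]:.4f}")
```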
Distillation:

``` shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --standalone --nproc_per_node=8 \
    main_finetune.py \
    --batch_size 196 \
    --blr 1e-3 \
    --warmup_epochs 5 \
    --epochs 100 \
    --drop_path 0.1 \
    --finetune finetune_checkpoint.pth \
    --model spikformer12_512 \
    --data_path ../imagenet1-k \
    --output_dir ./outputs/.. \
    --log_dir ./outputs/.. \
    --dist_eval \
    --time_steps 1 \
    --kd \
    --input_size 224 \
    --teacher_model caformer_b36_in21ft1k \
    --reprob 0.25 \
    --mixup 0.5 \
    --cutmix 1.0 \
    --distillation_type hard
```
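`--distillation_type hard` suggests DeiT-style hard distillation: the student is also trained on the teacher's argmax prediction as if it were a ground-truth label, rather than on the teacher's soft probabilities. A stdlib-only sketch of the combined objective (the 50/50 weighting and the exact mixing rule are assumptions; check the script's distillation loss for the actual split):

``` python
import math

def cross_entropy(logits, target):
    """CE of one example's logits against an integer class label."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def hard_distill_loss(student_logits, teacher_logits, label, alpha=0.5):
    """DeiT-style hard distillation: mix the true label with the teacher's argmax."""
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return (1 - alpha) * cross_entropy(student_logits, label) \
        + alpha * cross_entropy(student_logits, teacher_label)

loss = hard_distill_loss([2.0, 0.5, -1.0], [0.1, 3.0, 0.2], label=0)
```

When the teacher agrees with the ground truth, the two terms coincide and the loss reduces to plain cross-entropy; the distillation term only pulls the student elsewhere where teacher and label disagree.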

### Data Preparation