When reproducing the experiment of pretraining on mag240m and evaluating on arxiv, we found that the Contrastive baseline gives similar performance to Prodigy when the auxiliary loss is applied (using `-attr 1000`) and the number of training steps is increased to 50,010.
The complete training command we used is:

```
python run_single_experiment.py --dataset mag240m --root /datasets --original_features True --input_dim 768 --emb_dim 256 -ds_cap 50010 -val_cap 100 -test_cap 100 --epochs 1 -ckpt_step 1000 -layers S2,U,A -lr 5e-4 -way 30 -shot 3 -qry 4 -eval_step 500 -task cls_nm_sb -bs 1 -aug ND0.5,NZ0.5 -aug_test True -attr 1000 --device 0 --prefix MAG_Contrastive
```

The evaluation command is:
```
python run_single_experiment.py --dataset arxiv --root /datasets --emb_dim 256 --input_dim 768 -ds_cap 510 -val_cap 510 -test_cap 500 -eval_step 100 -epochs 1 --layers S2,U,A -way 3 -shot 3 -qry 3 -lr 1e-5 -bert roberta-base-nli-stsb-mean-tokens -pretrained state_dict_49000.ckpt --eval_only True --train_cap 10 --device 0
```

The confusing results on the test accuracy over the arxiv dataset are below; each row varies only `-way` (a sketch of the sweep is shown after the table).
| way | Contrastive | Prodigy |
|---|---|---|
| 3 | 74.92 | 73.09 |
| 5 | 63.81 | 61.52 |
| 10 | 49.77 | 46.74 |
| 20 | 37.62 | 34.41 |
| 40 | 27.85 | 25.13 |
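For completeness, the rows above come from re-running the evaluation command with different `-way` values. The loop below is only a minimal sketch of that sweep; every other flag, including the `state_dict_49000.ckpt` checkpoint, is identical to the evaluation command shown earlier.

```
# Illustrative sweep over the way settings from the table above.
# Only -way changes; all other flags match the evaluation command.
for WAY in 3 5 10 20 40; do
  python run_single_experiment.py --dataset arxiv --root /datasets \
    --emb_dim 256 --input_dim 768 -ds_cap 510 -val_cap 510 -test_cap 500 \
    -eval_step 100 -epochs 1 --layers S2,U,A -way "$WAY" -shot 3 -qry 3 \
    -lr 1e-5 -bert roberta-base-nli-stsb-mean-tokens \
    -pretrained state_dict_49000.ckpt --eval_only True --train_cap 10 --device 0
done
```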
The checkpoint of the contrastive model obtained from pretraining on mag240m is attached. Could you please clarify whether there is anything wrong with the setup of our experiments? Thank you!