When reproducing the experiment of pretraining on mag240m and evaluating on arxiv, we found that the Contrastive baseline gives similar performance to Prodigy when the auxiliary loss is applied (using `-attr 1000`) and the number of training steps is increased to 50,010.
The complete training command we used is:

```
python run_single_experiment.py --dataset mag240m --root /datasets --original_features True --input_dim 768 --emb_dim 256 -ds_cap 50010 -val_cap 100 -test_cap 100 --epochs 1 -ckpt_step 1000 -layers S2,U,A -lr 5e-4 -way 30 -shot 3 -qry 4 -eval_step 500 -task cls_nm_sb -bs 1 -aug ND0.5,NZ0.5 -aug_test True -attr 1000 --device 0 --prefix MAG_Contrastive
```

The evaluation command is:
```
python run_single_experiment.py --dataset arxiv --root /datasets --emb_dim 256 --input_dim 768 -ds_cap 510 -val_cap 510 -test_cap 500 -eval_step 100 -epochs 1 --layers S2,U,A -way 3 -shot 3 -qry 3 -lr 1e-5 -bert roberta-base-nli-stsb-mean-tokens -pretrained state_dict_49000.ckpt --eval_only True --train_cap 10 --device 0
```

The confusing results on the test accuracy over the arxiv dataset are below; each row varies only `-way` (a sketch of the sweep is shown after the table).
| way | Contrastive | Prodigy |
|---|---|---|
| 3 | 74.92 | 73.09 |
| 5 | 63.81 | 61.52 |
| 10 | 49.77 | 46.74 |
| 20 | 37.62 | 34.41 |
| 40 | 27.85 | 25.13 |
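For completeness, the rows above come from re-running the evaluation command with different `-way` values. The loop below is only a minimal sketch of that sweep; every other flag, including the `state_dict_49000.ckpt` checkpoint, is identical to the evaluation command shown earlier.

```
# Illustrative sweep over the way settings from the table above.
# Only -way changes; all other flags match the evaluation command.
for WAY in 3 5 10 20 40; do
  python run_single_experiment.py --dataset arxiv --root /datasets \
    --emb_dim 256 --input_dim 768 -ds_cap 510 -val_cap 510 -test_cap 500 \
    -eval_step 100 -epochs 1 --layers S2,U,A -way "$WAY" -shot 3 -qry 3 \
    -lr 1e-5 -bert roberta-base-nli-stsb-mean-tokens \
    -pretrained state_dict_49000.ckpt --eval_only True --train_cap 10 --device 0
done
```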
The checkpoint of the contrastive model obtained from pretraining on mag240m is attached. Could you please clarify whether there is anything wrong with the setup of our experiments? Thank you!