First of all, thank you for the great work! I ran into a few issues while trying to reproduce the results from the tutorial.

I first followed the tutorial to reproduce the Lynx ACC on the MSCOCO_ITM task, i.e., Table 18 in the paper. I used the following command:
```shell
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 run_eval.py \
    --model lynx --model_name models/interfaces/lynx/configs/LYNX.yaml \
    --dataset_name MSCOCO --output_dir output/lynx/MSCOCO/test_generation/ \
    --per_gpu_eval_batch_size 4 --formulation SingleChoice \
    --infer_method generation --do_eval --half_evaluation --dataset_duplication 1 \
    --in_context_sample --option_mark upper \
    --dataset_config build/configs/ImageTextMatching_val.yaml \
    --offline_hf
```
With `generation` as the inference method, the results I got were rather strange:

```
2023-11-01 16:00:35,236 ReForm-Eval Evaluation INFO: the evalueted SingleChoice result: 0.0
2023-11-01 16:00:35,236 ReForm-Eval Evaluation INFO: the format hit rate is 0.0
```
If I use `likelihood` as the inference method instead, the results still differ from those in the paper:

```
2023-11-01 15:39:14,806 ReForm-Eval Evaluation INFO: the evalueted SingleChoice result: 0.5183333333333333
2023-11-01 15:39:14,806 ReForm-Eval Evaluation INFO: the format hit rate is 1.0
```
I'm at a loss to understand this, and I hope you can help point out where the problem may be.