Question
Hello,
I tried to reproduce the reported results of LLaVA-v1.5-7B on the MM-VET dataset, but the performance I obtained was much lower.
- I loaded the official checkpoint and ran the script `scripts/v1_5/eval/mmvet.sh`, then submitted the generated answers to the official evaluation platform on Hugging Face, but only got a total score of 27.8.
- I also directly uploaded the provided `llava-v1.5-7b.json` file from the official `eval.zip` package (under `mm-vet/results/`), but the total score was still only 27.9.
This is much lower than the value of 31.1 reported in the paper.
Here I also provide the full evaluation results I obtained:
| | rec | ocr | know | gen | spat | math | total | std | runs |
|---|---|---|---|---|---|---|---|---|---|
| llava-v1.5-7b | 33.2 | 17.8 | 14.4 | 15.4 | 22.1 | 7.7 | 27.9 | 0.0 | [27.9] |
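For context, the `total`, `std`, and `runs` columns appear to be aggregates over repeated evaluation runs (std is 0.0 here because only a single run is present). A minimal sketch of how such an aggregate could be computed; the variable names are my own, not from the MM-Vet codebase:

```python
import statistics

# Scores from each evaluation run; here a single run, as in the table above.
runs = [27.9]

# Mean over runs gives the reported total.
total = sum(runs) / len(runs)

# Population standard deviation; 0.0 when there is only one run.
std = statistics.pstdev(runs)

print(total, std)  # → 27.9 0.0
```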
Could you please let me know where the problem might be?
Thanks in advance for the clarification!