Clarification on Vidore Test Set: Are answer annotations used during training/evaluation?

Hi, I have a quick question about the Vidore test set.

When checking the preprocessing code (e.g., for vidore_arxivqa), I see that the query is the question and the candidate is constructed as prompt + image. The answer text for each question never seems to be used in either training or evaluation.
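To make concrete what I mean, here is a minimal sketch of how I understand the preprocessing. It is not the repository's actual code: the field names (`query`, `image`, `answer`), the `DOC_PROMPT` string, and the `build_example` helper are all assumptions used only for illustration.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical candidate-side prompt; the real preprocessing may use a different string.
DOC_PROMPT = "Represent the given image."


@dataclass
class RetrievalExample:
    query_text: str   # the question
    cand_text: str    # the prompt placed next to the image
    cand_image: Any   # the document page / figure image


def build_example(row: dict) -> RetrievalExample:
    """Build one (query, candidate) pair from a dataset row.

    Assumed row fields: 'query', 'image', 'answer'.
    Note that 'answer' is never read here, which is the behavior I am asking about.
    """
    return RetrievalExample(
        query_text=row["query"],
        cand_text=DOC_PROMPT,
        cand_image=row["image"],
    )


# Made-up row, for illustration only.
row = {"query": "Which curve converges fastest?", "image": "figure_0.png", "answer": "B"}
example = build_example(row)
```

In this sketch the pair carries only the question, the prompt, and the image, so the answer annotation has no path into either the loss or the retrieval metric.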

I just want to double-check:
	•	Is it expected that the answer field is not used at all?
	•	If so, does the model learn only from (question → image) relevance, without using the actual answer?

I want to confirm whether this is the intended design. Thanks!

Clarification on Vidore Test Set: Are answer annotations used during training/evaluation? #179

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Clarification on Vidore Test Set: Are answer annotations used during training/evaluation? #179

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions