Description
Hello authors of ReflectiVA, thank you for the excellent open-source project!
After reviewing the code and paper, I am a bit confused about the re-rank part.
It seems to me that the released evaluation code simply follows the original ReflectiVA pipeline, where all sections from the initially retrieved top-k entries that are assigned the [REL] token are used as extra input to answer the KB-VQA questions.
For the re-rank experiments, I believe the reranker model comes from the EchoSight project, as indicated in the paper. However, I am not sure how exactly you re-ranked the sections.
In the native EchoSight pipeline, the score is a weighted sum of two parts: (a) the initial retrieval similarity score (the cosine similarity between the query image and the image features of entries in the KB, computed with EVA-CLIP-8B), and (b) the section score assessed by the trained multi-modal reranker. In the original setting, the weights are 1:1. I would appreciate it if you could elaborate on the details here. My impression is that you only use part (b), the section score, since in your released code `rag_evaluation/encyclopedic/release_retrieval.py` there is no reference to the initial retrieval similarity score (and in your implementation, only E-VQA uses image-to-image retrieval).
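For concreteness, here is a minimal sketch of the 1:1 weighted-sum fusion as I understand it from EchoSight; the function and parameter names are my own, not from either codebase:

```python
import numpy as np

def fuse_section_scores(retrieval_sims, reranker_scores,
                        w_retrieval=1.0, w_rerank=1.0):
    """Hypothetical sketch of EchoSight-style score fusion.

    final = w_retrieval * (a) initial retrieval cosine similarity
          + w_rerank    * (b) multi-modal reranker section score
    With w_retrieval == w_rerank this is the 1:1 setting; setting
    w_retrieval = 0 would reproduce using only part (b).
    Returns the fused scores and the section indices sorted best-first.
    """
    retrieval_sims = np.asarray(retrieval_sims, dtype=float)
    reranker_scores = np.asarray(reranker_scores, dtype=float)
    fused = w_retrieval * retrieval_sims + w_rerank * reranker_scores
    order = np.argsort(-fused)  # descending order of fused score
    return fused, order
```

My question is essentially whether your re-rank ablation corresponds to `w_retrieval = 0` in a sketch like this, or whether the initial similarity is folded in somewhere I missed.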
It would be great if you could answer this question or release the code for the ablation study. Thank you!