Hi,
Thank you for sharing your work and releasing the code.
While reading the arXiv version, I noticed a clear error in the reported results: Table 1 and Table 8 contain several rows with exactly the same numbers (e.g., Random / InstructBLIP / VCD, among others), even though the two tables are described as showing results for the 7B and 13B models, respectively.
I would suggest correcting the affected table(s) so that the reported performance numbers are accurate.