Thanks for your amazing work on the zero-shot captioning task. As shown in Table 1 of this paper, ZeroCap's performance on COCO is as follows:
However, this seems to differ from the performance reported in the ZeroCap paper, which is as follows:
Was ZeroCap evaluated under different settings in this paper, which would explain the difference?
I would greatly appreciate your response.
Thank you.