Hi, thanks for the great work!
While reading the paper and repo, I was especially interested in Figure 8, which shows results with “inference-time scaling strategies”. Could you please clarify the following details?
- On which dataset were the results in Figure 8 obtained?
  • Emu test set or MagicBrush test set?
- How many samples (images) were evaluated for the inference-time scaling strategies?
This information will greatly help me reproduce and compare the results under the correct settings.
Thanks again for your excellent work and for taking the time to answer!
Best regards,