Description
Dear Authors,
Thank you for your excellent work and for making the code publicly available. I have a question regarding the evaluation metrics reported in Table 1 of the paper:
Are the accuracy numbers shown in Table 1 based on:
- Single-step prediction (each prediction starts from the ground-truth state, so no autoregressive error accumulates), or
- Multi-step rollout (each prediction is fed back as the input for the next step)?
If it is the latter (multi-step), could you please specify:
- How many steps were used for the evaluation?
- Whether the rollout is conditioned on ground-truth data at any point, or is fully autoregressive?
This distinction would be very helpful for properly comparing with other methods and for reproduction purposes.
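To make the distinction concrete, here is a minimal sketch of the two protocols as I understand them. Everything in it is an assumption for illustration: the names `single_step_error` and `rollout_error`, the toy `model`, and the use of mean absolute error are hypothetical and not taken from your repository or paper.

```python
import numpy as np

def single_step_error(model, traj):
    """Mean absolute error when every prediction starts from the ground-truth state."""
    preds = np.array([model(x) for x in traj[:-1]])
    return float(np.mean(np.abs(preds - traj[1:])))

def rollout_error(model, traj, n_steps):
    """Mean absolute error when each prediction is fed back as the next input."""
    state = traj[0]  # only the initial state comes from ground truth
    errs = []
    for t in range(1, n_steps + 1):
        state = model(state)  # fully autoregressive: no ground-truth conditioning
        errs.append(np.mean(np.abs(state - traj[t])))
    return float(np.mean(errs))

# Toy setup (hypothetical): true dynamics x_{t+1} = 0.9 * x_t,
# while the imperfect model predicts 0.95 * x_t.
traj = 0.9 ** np.arange(11)
model = lambda x: 0.95 * x

print("single-step:", single_step_error(model, traj))
print("10-step rollout:", rollout_error(model, traj, n_steps=10))
```

In this toy case the rollout error is larger than the single-step error because the per-step bias compounds, which is why knowing the protocol and the number of steps matters for a fair comparison.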
Thank you for your time and consideration!