Thanks for sharing this interesting project!
How can I obtain the ground-truth eef_position_delta and eef_rotation_delta for each step of the DROID dataset?
I tried running the IDM on the original 15 fps DROID videos, but the actions it predicts do not match the original ground-truth actions. Could you please provide some guidance?
Thank you again!