Hi, thank you for the great work!
I have a question about how position_ids are handled in Dream-VLA.
In this code (Dream-VLX/vla/dreamvla/modeling_dreamvl.py, line 1769 at commit ebc1281):

    position_ids = attention_mask.cumsum(dim=1)

position_ids seem to start from 1 (since they are computed from attention_mask.cumsum(...)), whereas in the default Dream model they usually start from 0.
Would this cause any inconsistency or mismatch in positional encoding between Dream-VLA and the base Dream model?
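
To make the question concrete, here is a minimal sketch, assuming PyTorch and a made-up attention mask (the tensor values and the variable names `one_based` and `zero_based` are mine, not from the repository). It compares the cumsum as quoted above with the `cumsum - 1` pattern that is common in Hugging Face-style code; whether the base Dream model uses exactly that pattern is an assumption on my part.

```python
import torch

# Hypothetical batch: row 0 has no padding, row 1 is left-padded by two tokens.
attention_mask = torch.tensor([[1, 1, 1, 1],
                               [0, 0, 1, 1]])

# What the quoted line computes: the first real token gets position 1.
one_based = attention_mask.cumsum(dim=1)

# A common zero-based variant: shift by one, then give padded slots a dummy position.
zero_based = (attention_mask.cumsum(dim=1) - 1).masked_fill(attention_mask == 0, 1)

print(one_based.tolist())   # [[1, 2, 3, 4], [0, 0, 1, 2]]
print(zero_based.tolist())  # [[0, 1, 2, 3], [1, 1, 0, 1]]
```

In this sketch every real token's position differs by exactly 1 between the two variants, which is the offset I am asking about.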