Hello, I noticed that the previous actions in the planning or navigation data are pure text. So is the evaluation the same?
Hi,
I noticed that in the planning or navigation datasets, the history of previous actions is represented purely as text.
In this setting, is the evaluation conducted in the same way as for other modalities, or are there any modality-specific differences we should be aware of?
Thanks in advance.