I am really inspired by your work. Thank you for sharing it.
My question is: why is the text encoder frozen during training?
When I fine-tune the VISTA model on another dataset such as M-BEIR, the results with the text encoder unfrozen are better than those with it frozen.
I am curious about your reason for freezing the text encoder.
Thank you.