As I understand it (please correct me if I'm wrong), LitGPT does not currently support multimodality: the Pretrain, Finetune, Chat, and Evals workflows do not accept inputs from other modalities. I would like to help add this support if it is on the plan of improvements for LitGPT.