Description
Hi all, any time I try batch inference with InternVL Flash, I get this error:
RuntimeError: Tensors must have same number of dimensions: got 2 and 1
Looking into it further, modeling_internvl_chat.py doesn't seem to be written to work with a batch size greater than 1. Is this by design?
When flash_mode is used, this function is called: https://huggingface.co/OpenGVLab/InternVL3_5-2B-Flash/blob/main/modeling_internvl_chat.py#L562
It then uses self.get_image_num_per_sample(input_ids) / 256.
That method in turn tries to reduce down to 1 dimension here:
https://huggingface.co/OpenGVLab/InternVL3_5-2B-Flash/blob/main/modeling_internvl_chat.py#L284
So when there is more than 1 dimension, I get the error above. I'm unsure whether flash_mode is meant to support batches, and if not, why.
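
To illustrate what I would expect for batched input, here is a rough pure-Python sketch of a per-sample count that keeps the batch dimension instead of flattening it. This is not the code from modeling_internvl_chat.py. IMG_CONTEXT_ID is a made-up token id, image_num_per_sample is a hypothetical helper, and the 256 tokens per image is only inferred from the "/ 256" above.

```python
from typing import List

# Hypothetical values for illustration only; not taken from the model repo.
IMG_CONTEXT_ID = 9999      # made-up id standing in for the image context token
TOKENS_PER_IMAGE = 256     # assumed from the "/ 256" in the call described above


def image_num_per_sample(input_ids: List[List[int]]) -> List[int]:
    """Count images per row of a batched (2-D) input_ids, one result per sample."""
    counts = []
    for row in input_ids:
        # Count image context tokens in this sample only.
        n_ctx = sum(1 for tok in row if tok == IMG_CONTEXT_ID)
        if n_ctx % TOKENS_PER_IMAGE != 0:
            raise ValueError("image token count is not a multiple of 256")
        counts.append(n_ctx // TOKENS_PER_IMAGE)
    return counts


# Two samples with one and two images respectively (padding omitted for brevity).
batch = [
    [1, 2] + [IMG_CONTEXT_ID] * 256 + [3],
    [1] + [IMG_CONTEXT_ID] * 512 + [2, 3],
]
print(image_num_per_sample(batch))  # [1, 2]
```

If flash_mode really only supports one sample at a time, the same idea suggests a workaround: call the model once per sample with a batch size of 1 and collect the results.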