
InternVL3_5 Flash models: Flash mode doesn't support batches #1239

@Sibajar

Description


Hi all, any time I try batch inference with InternVL Flash I get the error:

RuntimeError: Tensors must have same number of dimensions: got 2 and 1

Looking further into it, the way modeling_internvl_chat.py is written, it doesn't seem like the code will work with a batch size greater than 1. Is this by design?

When flash_mode is used, this func is called: https://huggingface.co/OpenGVLab/InternVL3_5-2B-Flash/blob/main/modeling_internvl_chat.py#L562

which then computes self.get_image_num_per_sample(input_ids) / 256.

That method in turn assumes it can reduce input_ids down to 1 dimension here:
https://huggingface.co/OpenGVLab/InternVL3_5-2B-Flash/blob/main/modeling_internvl_chat.py#L284

So whenever input_ids has more than one dimension (i.e., a batch), I get the error above. Is flash_mode meant to support batched inference, and if not, why?
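For what it's worth, the error message itself is the one PyTorch raises when torch.cat is given tensors of mismatched rank. A minimal sketch of that failure mode (this is not the actual InternVL code, just an illustration of the shape mismatch described above):

```python
import torch

# A 2-D batched tensor, shape (batch, seq_len), as produced by a
# tokenizer with padding for batch inference.
batched = torch.zeros(2, 8, dtype=torch.long)

# A 1-D tensor built under a batch-size-1 assumption, e.g. after an
# internal path flattens or indexes input_ids down to one dimension.
per_sample = torch.zeros(8, dtype=torch.long)

try:
    # Concatenating a 2-D tensor with a 1-D tensor is what triggers
    # the RuntimeError quoted in this issue.
    torch.cat([batched, per_sample], dim=0)
except RuntimeError as e:
    print(e)  # "Tensors must have same number of dimensions: got 2 and 1"
```

So the crash is consistent with a code path that only ever expects a single unbatched sequence.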
