Improve pre-processed image size for QwenVL #16842

@ngxson

Description

Extracted the discussion from:

In order for the bbox to be correct, I'm thinking about:

  • Implement the correct max_pixels / min_pixels from the original config
  • Pad the right/bottom edges of the image if upscaling is required (to a multiple of 2*patch_size). This guarantees that the x/y coordinates stay unchanged, since the original pixels keep their positions relative to the top-left origin. But we also need to check whether the original implementation actually uses this strategy.
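
A minimal sketch of the padding idea above, in plain Python for illustration only. The helper names (`round_up`, `padded_size`, `pad_bottom_right`, `within_pixel_budget`) are hypothetical and not part of any existing code base, the image is modeled as a list of rows, and the values of `patch_size`, `min_pixels` and `max_pixels` are assumed to come from the original model config. Whether the original implementation pads rather than resizes is still unverified, as noted above.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple (value itself if already aligned)."""
    return ((value + multiple - 1) // multiple) * multiple


def padded_size(width: int, height: int, patch_size: int) -> tuple[int, int]:
    """Target size after aligning both dimensions to 2 * patch_size."""
    align = 2 * patch_size
    return round_up(width, align), round_up(height, align)


def pad_bottom_right(pixels: list[list[int]], patch_size: int, fill: int = 0) -> list[list[int]]:
    """Pad only on the right and bottom, so existing pixels keep their (x, y)."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    new_w, new_h = padded_size(width, height, patch_size)
    # Extend every existing row to the right with the fill value.
    padded = [row + [fill] * (new_w - width) for row in pixels]
    # Append fully filled rows at the bottom.
    padded += [[fill] * new_w for _ in range(new_h - height)]
    return padded


def within_pixel_budget(width: int, height: int, min_pixels: int, max_pixels: int) -> bool:
    """Check the padded size against the (assumed) config limits."""
    return min_pixels <= width * height <= max_pixels
```

Because nothing is added on the left or top, a bbox predicted on the padded image can be used on the original image without any offset or rescale, as long as no resize was applied beforehand.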

CC @broadbit-hu @theo77186
