Conversation

@dsocek commented on Sep 17, 2025

What does this PR do?

Implements the HPU hardware media pipeline for multi-modal inference in vLLM for the InternVL model.

This feature can be enabled with:

VLLM_USE_MEDIA_PIPELINE=1

and also requires

DECODER_MAX_RESIZE=5376

where 5376 (max image output resolution) = 448 (patch size) * 12 (max patches).
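For reference, a minimal offline-inference sketch (not part of this PR) showing how these variables might be set before constructing the engine; the model name, image path, and prompt template below are illustrative placeholders.

```python
# Sketch only: enable the HPU media pipeline via environment variables
# before the vLLM engine is created. Model, image, and prompt are placeholders.
import os

os.environ["VLLM_USE_MEDIA_PIPELINE"] = "1"
os.environ["DECODER_MAX_RESIZE"] = "5376"  # 448 (patch size) * 12 (max patches)

from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(model="OpenGVLab/InternVL2-8B", trust_remote_code=True)

image = Image.open("example.jpg")  # placeholder input image
outputs = llm.generate(
    {
        "prompt": "<image>\nDescribe this image.",
        "multi_modal_data": {"image": image},
    },
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```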

@dsocek force-pushed the add_mediapipe_for_internvl branch from 89c4524 to 79bd609 on September 18, 2025 00:16