
[Model] Add Hunyuan Image3 AR Support #759

Open
usberkeley wants to merge 18 commits into vllm-project:main from usberkeley:hunyuan-image3

Conversation


@usberkeley usberkeley commented Jan 13, 2026

Purpose

This PR adds support for the Hunyuan Image3 model to vLLM-Omni. Hunyuan Image3 is a multimodal image generation model developed by Tencent, supporting text-to-image generation tasks.

Test Plan

  1. Text input test
  • GPU: 8 x L40S (48GB)
  • TP: 8

Note: The default configuration in hunyuan_image_3_moe.yaml is tensor_parallel_size: 8.

from vllm_omni.entrypoints.omni import Omni

if __name__ == "__main__":
    omni = Omni(model="tencent/HunyuanImage-3.0")
    prompts = [
    {
        "prompt": "<|im_start|>system\nYou are Qwen.<|im_end|>\n<|im_start|>user\nExplain the system architecture for a scalable audio generation pipeline. Answer in 15 words.<|im_end|>\n<|im_start|>assistant\n",
        "modalities": ["text"]
    }
    ]
    omni_outputs = omni.generate(prompts)
    print(omni_outputs[0].request_output[0].outputs[0].text)
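For reference, the defaults mentioned above suggest the relevant portion of hunyuan_image_3_moe.yaml looks roughly like the sketch below. Only stage_args, stage_id, stage_type, and tensor_parallel_size: 8 are stated in this thread; the exact nesting and any other key names are assumptions:

```yaml
# Sketch of the stage config (verified setup per the note above: 8 x L40S-48G).
stage_args:
- stage_id: 0
  stage_type: llm           # use the llm stage type to launch OmniLLM
  tensor_parallel_size: 8   # default TP degree; matches the 8-GPU test setup
```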
  2. Multimodal input test
    TODO

Test Result

TODO


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

@usberkeley usberkeley force-pushed the hunyuan-image3 branch 2 times, most recently from ed4d687 to bb011f2 Compare January 14, 2026 09:09
@usberkeley usberkeley marked this pull request as ready for review January 15, 2026 03:21

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bb011f27c7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@hsliuustc0106
Collaborator

Please paste your test example command.

@usberkeley
Author

usberkeley commented Jan 15, 2026

Please paste your test example command.

Hi @hsliuustc0106

  1. Text input test
  • GPU: 8 x L40S (48GB)
  • TP: 8

Note: The default configuration in hunyuan_image_3_moe.yaml is tensor_parallel_size: 8.

from vllm_omni.entrypoints.omni import Omni

if __name__ == "__main__":
    omni = Omni(model="tencent/HunyuanImage-3.0")
    prompts = [
    {
        "prompt": "<|im_start|>system\nYou are Qwen.<|im_end|>\n<|im_start|>user\nExplain the system architecture for a scalable audio generation pipeline. Answer in 15 words.<|im_end|>\n<|im_start|>assistant\n",
        "modalities": ["text"]
    }
    ]
    omni_outputs = omni.generate(prompts)
    print(omni_outputs[0].request_output[0].outputs[0].text)

Contributor

Copilot AI left a comment


Pull request overview

Adds initial vLLM-Omni autoregressive (AR) integration for Tencent’s Hunyuan Image3 model, including model registration and a default stage config.

Changes:

  • Updates AR GPU runner postprocessing to use a shared multimodal-output extraction helper.
  • Registers HunyuanImage3ForCausalMM in the Omni model registry.
  • Introduces a new Hunyuan Image3 model implementation + utilities and a new stage config YAML.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 12 comments.

Show a summary per file

  • vllm_omni/worker/gpu_ar_model_runner.py: Switches to extract_multimodal_outputs for postprocessing model outputs.
  • vllm_omni/model_executor/stage_configs/hunyuan_image_3_moe.yaml: Adds a default stage config for running Hunyuan Image3 with the AR worker/scheduler.
  • vllm_omni/model_executor/models/registry.py: Registers the Hunyuan Image3 model architecture for lazy loading.
  • vllm_omni/model_executor/models/hunyuan_image3_0/hunyuan_image3_0_utils.py: Adds Hunyuan-specific RoPE2D + image KV cache helper utilities.
  • vllm_omni/model_executor/models/hunyuan_image3_0/hunyuan_image3_0.py: Adds the main Hunyuan Image3 model implementation (decoder, attention, MoE, weight loading).
  • vllm_omni/model_executor/models/hunyuan_image3_0/__init__.py: Exposes the new model class for import.
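The RoPE2D helper in hunyuan_image3_0_utils.py is not shown in this thread. As background, a common way to implement 2D rotary embeddings for image tokens is to apply standard 1D RoPE separately to the two halves of the head dimension, one half keyed by the token's row index and the other by its column index. A minimal NumPy sketch of that idea (not the PR's actual implementation):

```python
import numpy as np

def rope_1d(x, pos, base=10000.0):
    """Rotate feature pairs of x by position-dependent angles (standard 1D RoPE).

    x: (..., seq, dim) with even dim; pos: (seq,) integer positions.
    """
    dim = x.shape[-1]
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))   # (dim/2,)
    angles = pos[:, None] * inv_freq[None, :]                 # (seq, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def rope_2d(x, pos_h, pos_w):
    """Apply 1D RoPE to each half of the head dim, using row/column indices."""
    half = x.shape[-1] // 2
    return np.concatenate(
        [rope_1d(x[..., :half], pos_h), rope_1d(x[..., half:], pos_w)], axis=-1
    )

# For a 2x2 patch grid flattened row-major to seq=4:
# pos_h = [0, 0, 1, 1], pos_w = [0, 1, 0, 1]
```

Since rotation is norm-preserving, attention logits depend only on relative row/column offsets between query and key tokens, which is the property the 2D variant carries over from 1D RoPE.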


@david6666666
Collaborator

@usberkeley
Author

Any updates? Tencent has just released https://huggingface.co/tencent/HunyuanImage-3.0-Instruct and https://huggingface.co/tencent/HunyuanImage-3.0-Instruct-Distil

Got it. We are working on the image encoder and will follow up on the new release.

@usberkeley
Author

Hi @princepride

When you have a moment, please review this code. Thanks!

@princepride
Collaborator

@usberkeley Can you rebase your code first? We have changed some code in ar_model_runner.

@usberkeley usberkeley marked this pull request as draft February 2, 2026 10:14
@usberkeley usberkeley force-pushed the hunyuan-image3 branch 3 times, most recently from 71570e7 to b8d58b5 Compare February 4, 2026 03:17
@usberkeley usberkeley marked this pull request as ready for review February 4, 2026 03:19

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8d58b560e


@princepride
Collaborator

@usberkeley pre-commit failed, PTAL

Collaborator

@princepride princepride left a comment


@usberkeley Good job! Just a little advice.

@princepride
Collaborator

@usberkeley I think that, as an AR model, it should have text as the output.

@usberkeley usberkeley marked this pull request as draft February 4, 2026 09:48
@hsliuustc0106 hsliuustc0106 requested a review from Copilot February 5, 2026 15:31
@princepride
Collaborator

[image] @usberkeley I used transformers to execute your prompt.

"hunyuan_image3",
"hunyuan_image3",
"HunyuanImage3ForConditionalGeneration",
),
Collaborator


This PR adds 3057 lines of new model code with ZERO test coverage. Add tests to verify: (1) model loads correctly, (2) forward pass produces expected output shapes, (3) memory usage is reasonable, (4) integration with vllm-omni pipeline works. Without tests, we cannot validate correctness or prevent regressions.

# The following config has been verified on 8x L40S-48G GPU.
stage_args:
- stage_id: 0
  stage_type: llm # Use llm stage type to launch OmniLLM
Collaborator


This config file has no schema validation or documentation. Add comments explaining each parameter's purpose, valid ranges, and default values. Consider adding a schema validator to catch configuration errors early.
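One way to act on this suggestion is a small programmatic check run when the stage config is loaded. In the sketch below, only stage_id and stage_type come from the PR's YAML snippet; the helper's name and the set of allowed stage types are assumptions for illustration:

```python
# Minimal sanity check for a parsed stage_args list (as loaded from the YAML).
# "stage_id" and "stage_type" appear in the PR snippet; the allowed stage-type
# set and this function's name are hypothetical.
def validate_stage_args(stage_args: list) -> None:
    if not stage_args:
        raise ValueError("stage_args must contain at least one stage")
    allowed_types = {"llm"}  # extend as more stage types are added
    for i, stage in enumerate(stage_args):
        if stage.get("stage_id") != i:
            raise ValueError(
                f"stage {i}: expected stage_id {i}, got {stage.get('stage_id')}"
            )
        if stage.get("stage_type") not in allowed_types:
            raise ValueError(
                f"stage {i}: unknown stage_type {stage.get('stage_type')!r}"
            )
```

Failing fast with a clear message at load time is cheaper than letting a typo surface as a worker-startup error deep in the pipeline.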

@princepride
Collaborator

@vllm-omni-reviewer

usberkeley and others added 12 commits February 27, 2026 22:42
Signed-off-by: Bradley <bradley.b.pitt@gmail.com> (11 commits)
Signed-off-by: princepride <wangzhipeng628@gmail.com>
@usberkeley
Author

@vllm-omni-reviewer

Contributor

@lishunyang12 lishunyang12 left a comment


Left a couple more comments on the latest revision. The mRoPE addition looks solid. Main remaining concern is the load_weights indentation bug and dead code.

Removed the load_sharded_safetensors function that manually loads sharded safetensors files.

Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Removed unused imports for cleaner code.

Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
@hsliuustc0106
Collaborator

@vllm-omni-reviewer

This guy doesn't seem to be working any more; I'll take a look tomorrow at why.

Signed-off-by: Bradley <bradley.b.pitt@gmail.com> (4 commits)
Collaborator


Please remove it; we don't need it.


6 participants