Commit 8e586d1

Fix: Ensure stage_id is correctly passed to OmniEngineArgs
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
1 parent: bb011f2

File tree: 2 files changed (+4, −3 lines)


vllm_omni/model_executor/models/hunyuan_image3_0/hunyuan_image3_0.py

Lines changed: 1 addition & 0 deletions

@@ -1056,6 +1056,7 @@ def forward_block(
     def compute_logits(
         self,
         hidden_states: torch.Tensor,
+        sampling_metadata: SamplingMetadata | None = None,
     ) -> torch.Tensor | None:
         logits = self.logits_processor(self.lm_head, hidden_states)
         return logits
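The added `sampling_metadata` parameter is optional and unused in the body, which is a common compatibility pattern: older call sites that pass only `hidden_states` keep working, while newer runners that forward a `SamplingMetadata` object no longer raise `TypeError`. A minimal, self-contained sketch of the idea (the `TinyModel` class and stub `SamplingMetadata` here are illustrative stand-ins, not code from this repository):

```python
from typing import Optional


class SamplingMetadata:
    """Stand-in for vLLM's SamplingMetadata; real fields omitted."""


class TinyModel:
    def compute_logits(
        self,
        hidden_states: list[float],
        sampling_metadata: Optional[SamplingMetadata] = None,  # accepted but unused
    ) -> list[float]:
        # A real model would run its logits processor here; we pass through.
        return hidden_states


model = TinyModel()
# Old call sites that pass only hidden_states still work...
assert model.compute_logits([0.1, 0.2]) == [0.1, 0.2]
# ...and callers that forward sampling metadata no longer raise TypeError.
assert model.compute_logits([0.3], SamplingMetadata()) == [0.3]
```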

vllm_omni/model_executor/stage_configs/hunyuan_image_3_moe.yaml

Lines changed: 3 additions & 3 deletions

@@ -5,8 +5,8 @@ stage_args:
   - stage_id: 0
     stage_type: llm # Use llm stage type to launch OmniLLM
     runtime:
-      process: true # Run this stage in a separate process
-      devices: "0" # Visible devices for this stage (CUDA_VISIBLE_DEVICES/torch.cuda.set_device)
+      process: true # Run this stage in a separate process
+      devices: "0,1,2,3,4,5,6,7" # Visible devices for this stage (CUDA_VISIBLE_DEVICES/torch.cuda.set_device)
     max_batch_size: 1
     engine_args:
       model_stage: AR
@@ -19,7 +19,7 @@ stage_args:
       engine_output_type: latent
       enable_prefix_caching: false
       max_num_batched_tokens: 32768
-      tensor_parallel_size: 5
+      tensor_parallel_size: 8
       pipeline_parallel_size: 1
       is_comprehension: true
       final_output: true
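The yaml change brings the stage's visible devices and its parallelism degree into agreement: eight devices for `tensor_parallel_size: 8`, rather than one device for a tensor-parallel size of 5. A hedged sanity-check sketch of that invariant (the `check_stage` helper is hypothetical, not part of vllm_omni; it assumes the common rule that each tensor-parallel × pipeline-parallel rank needs its own visible device):

```python
def check_stage(devices: str, tensor_parallel_size: int,
                pipeline_parallel_size: int) -> bool:
    """Return True if the comma-separated devices string provides exactly
    one visible device per parallel rank (a typical, assumed requirement)."""
    n_devices = len(devices.split(","))
    return n_devices == tensor_parallel_size * pipeline_parallel_size


# Old config: one visible device but tensor_parallel_size 5 -> inconsistent.
assert not check_stage("0", 5, 1)
# New config: eight visible devices and tensor_parallel_size 8 -> consistent.
assert check_stage("0,1,2,3,4,5,6,7", 8, 1)
```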

0 commit comments