Commit dd56e93

[3/N][Refactor][Qwen3-Next] Refactor model structure and fix bug by vllm #25400 (#3142)
### What this PR does / why we need it?
Refactor the model structure in qwen3_next.py to reduce the number of lines of code.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
```
from vllm import LLM, SamplingParams


def main():
    prompts = [
        "The future of AI is",
    ]
    # Create a sampling params object.
    sampling_params = SamplingParams(max_tokens=100,
                                     temperature=0.6,
                                     top_k=40,
                                     top_p=0.95)
    # Create an LLM.
    llm = LLM(
        model="Qwen/Qwen3-Next-80B-A3B-Instruct",
        tensor_parallel_size=4,
        enforce_eager=True,
        trust_remote_code=True,
        max_model_len=256,
        gpu_memory_utilization=0.7,
        block_size=64,
    )
    # Generate texts from the prompts.
    outputs = llm.generate(prompts, sampling_params)
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")


if __name__ == "__main__":
    main()
```

- vLLM version: v0.10.2
- vLLM main: vllm-project/vllm@releases/v0.11.0

---------

Signed-off-by: Icey <[email protected]>
1 parent 4ff422c commit dd56e93

2 files changed: +27 -363 lines changed
vllm_ascend/models/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -53,4 +53,4 @@ def register_model():
     )
     ModelRegistry.register_model(
         "Qwen3NextForCausalLM",
-        "vllm_ascend.models.qwen3_next:Qwen3NextForCausalLM")
+        "vllm_ascend.models.qwen3_next:CustomQwen3NextForCausalLM")
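
The registration in this diff maps an architecture name to a `"module:ClassName"` string, so changing the class name after the colon is enough to switch implementations. As a rough illustration of how such a string could be resolved lazily, here is a minimal sketch. `LazyRegistry` and its methods are hypothetical names invented for this example, not vLLM's actual `ModelRegistry` implementation, and a standard-library class stands in for the model class.

```python
import importlib


class LazyRegistry:
    """Hypothetical stand-in for a model registry (not vLLM's real code)."""

    def __init__(self):
        # Maps an architecture name to a "module:ClassName" target string.
        self._entries = {}

    def register_model(self, arch, target):
        # Store only the string; the module is not imported at this point.
        if target.count(":") != 1:
            raise ValueError(f"expected 'module:ClassName', got {target!r}")
        self._entries[arch] = target

    def resolve(self, arch):
        # Import the module on first use and look up the class by name.
        module_name, class_name = self._entries[arch].split(":")
        module = importlib.import_module(module_name)
        return getattr(module, class_name)


registry = LazyRegistry()
# First registration points the architecture at one class ...
registry.register_model("DemoArch", "collections:Counter")
# ... and re-registering under the same name swaps the class, which is
# analogous to the one-line change in the diff.
registry.register_model("DemoArch", "collections:OrderedDict")
resolved = registry.resolve("DemoArch")
```

Under this assumption, a misspelled class name would only surface as an `AttributeError` when the architecture is first resolved, not at registration time, which is why the class rename in `qwen3_next.py` and the string here have to change together.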
