
Qwen3 demo batch size support for non-mirage baseline#403

Open
dcw02 wants to merge 4 commits into mirage-project:mpk from dcw02:feature/qwen3_batch_size

Conversation

@dcw02
Contributor

@dcw02 dcw02 commented Jul 15, 2025

Description of changes:

This PR adds batch size > 1 support to the non-mirage baseline in the Qwen3 demo.
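For context, here is a minimal illustrative sketch (not the PR's actual diff) of the core bookkeeping a transformers-style baseline needs for batch size > 1: prompts of unequal length must be left-padded so the last real token of every prompt lines up for decoding, with an attention mask marking which positions are padding. The `left_pad` helper and `PAD_ID` value below are hypothetical names chosen for illustration.

```python
PAD_ID = 0  # hypothetical pad token id; real code would use the tokenizer's pad_token_id

def left_pad(batch, pad_id=PAD_ID):
    """Left-pad a list of token-id lists to equal length.

    Returns (input_ids, attention_mask), where mask entries are
    0 for padding positions and 1 for real tokens.
    """
    max_len = max(len(seq) for seq in batch)
    input_ids, attention_mask = [], []
    for seq in batch:
        pad = max_len - len(seq)
        input_ids.append([pad_id] * pad + seq)
        attention_mask.append([0] * pad + [1] * len(seq))
    return input_ids, attention_mask

# Two prompts of different lengths padded into one batch:
ids, mask = left_pad([[11, 12, 13], [21]])
# ids  == [[11, 12, 13], [0, 0, 21]]
# mask == [[1, 1, 1], [0, 0, 1]]
```

Left padding (rather than right padding) is the usual choice for batched decoding, since generation appends after the last position of each row.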


@dcw02
Contributor Author

dcw02 commented Jul 15, 2025

I haven't looked into it, but the mirage path of the Qwen3 demo no longer generates/outputs tokens since commit 22a0bdf.

@NorthmanPKU
Collaborator

Hi @dcw02, can you share the context to reproduce the no-generation problem? Thanks

@dcw02
Contributor Author

dcw02 commented Jul 16, 2025

> Hi @dcw02, can you share the context to reproduce the no-generation problem? Thanks

@NorthmanPKU Here is a repro script using Modal:

import modal

app = modal.App("mirage-repro")

# Build an image with CUDA 12.9 + Python 3.12, clone the mpk branch of
# mirage, pin it to the commit under test, and install it editable.
image = (
    modal.Image.from_registry("nvidia/cuda:12.9.1-cudnn-devel-ubuntu24.04", add_python="3.12")
    .apt_install("git", "libopenmpi-dev")
    .pip_install("torch==2.7.1", "mpi4py==4.1.0", "transformers==4.52.4")
    .run_commands("git clone --recursive --branch mpk https://www.github.com/mirage-project/mirage /mirage")
    .env({"MIRAGE_HOME": "/mirage", "PMIX_MCA_gds": "hash"})
    .run_commands("cd /mirage && git checkout 22a0bdf")
    .run_commands("uv pip install --system -e /mirage -v")
)

# Cache Hugging Face model downloads across runs.
hf_cache_vol = modal.Volume.from_name("huggingface-cache", create_if_missing=True)

@app.function(image=image, gpu="L40S", volumes={"/root/.cache/huggingface": hf_cache_vol})
def test():
    import subprocess
    # Run the Qwen3 demo on the mirage path.
    subprocess.run("python /mirage/demo/qwen3/demo.py --use-mirage", check=True, shell=True)

The output should look something like:

Finished Launch Persistent Kernel
system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
user
import numpy as np
                import matplotlib.pyplot as plt

                # Calculate the average
                average_throughput = np.mean(tokens_per_sec_arr)
                print(f"Average Throughput: {average_throughput} tokens/sec")

                # Plotting the histogram
                plt.hist(tokens_per_sec_arr, bins=20, color='blue', edgecolor='black', alpha=0.7)
                plt.title('Histogram of Throughput Values')
                plt.xlabel('Tokens per Second')
                plt.ylabel('Frequency')
                plt.axvline(average_throughput, color='red', linestyle='dashed', linewidth=1)
                plt.text(average_throughput*0.9, max(plt.ylim())*0.9, f'Average: {average_throughput:.2f}', color = 'red')
                plt.show()
                
Can you please change x axis to start from 0
assistant
<think>
Prompt length 212, generate length 0, per-token latency inf ms

I think only L40S/sm_89 is broken (my devbox just happened to have an L40S); I tested it and it works on A100, H100, and H200. If you want to make small edits, you can also shell into the Modal container with modal shell main.py::test.

@dcw02
Contributor Author

dcw02 commented Jul 17, 2025

@NorthmanPKU I fixed the no generation problem in #412

@sheng-di

sheng-di commented Nov 1, 2025

Where is the CUDA version of this code with mirage? How should it support the batch size parameter? Or is it already supported, or is there a related PR?
