
Conversation

xin3he (Contributor) commented Sep 28, 2025

Batch size was not considered before; quantizing Llama-3.3 70B uses only one card and hits OOM.
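For rough context, a back-of-the-envelope estimate (the parameter count, dtype, batch size, and sequence length below are my own assumptions, not measurements from this PR) shows why a single card cannot hold the model and why the activation footprint grows with batch size:

```python
# Back-of-the-envelope memory math (illustrative only).
num_params = 70e9        # approximate parameter count of Llama-3.3-70B
bytes_per_elem = 2       # bf16
weight_gib = num_params * bytes_per_elem / 1024**3
print(f"weights alone: ~{weight_gib:.0f} GiB")  # ~130 GiB, far beyond one typical card

# Activation memory for a single hidden-state tensor scales with batch size:
batch_size, seq_len, hidden = 8, 2048, 8192     # assumed calibration settings
act_gib = batch_size * seq_len * hidden * bytes_per_elem / 1024**3
print(f"one bf16 hidden-state tensor: ~{act_gib:.2f} GiB")  # ~0.25 GiB per tensor
```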

wenhuach21 (Contributor)

The solution is incorrect.

xin3he (Contributor, Author) commented Sep 28, 2025

Okay, I can share the reproduction steps with you so that we can find a correct solution. @wenhuach21
auto-round --model /models/Llama-3.3-70B-Instruct/ --scheme "MXFP4" --device_map 4,5,6

xin3he (Contributor, Author) commented Sep 28, 2025

BTW, considering only the weight size is not correct; for example, the large in_features of down_proj requires more memory to hold its input activation. A rough estimate is sketched after the module printout below.

(mlp): LlamaMLP(
  (gate_proj): Linear(in_features=8192, out_features=28672, bias=False)
  (up_proj): Linear(in_features=8192, out_features=28672, bias=False)
  (down_proj): Linear(in_features=28672, out_features=8192, bias=False)
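To make this concrete, here is a minimal sketch of a per-Linear estimate that counts the cached input activation as well as the weight; the function name estimate_linear_mem_bytes and the batch_size=8, seq_len=2048, bf16 settings are my assumptions for illustration, not AutoRound's actual device-mapping logic:

```python
def estimate_linear_mem_bytes(in_features: int, out_features: int,
                              batch_size: int, seq_len: int,
                              bytes_per_elem: int = 2) -> int:
    """Rough memory needed to tune one Linear: its weight plus the cached
    input activation (illustrative only)."""
    weight = in_features * out_features * bytes_per_elem
    cached_input = batch_size * seq_len * in_features * bytes_per_elem
    return weight + cached_input

# Shapes taken from the LlamaMLP printout above.
shapes = {"gate_proj": (8192, 28672),
          "up_proj": (8192, 28672),
          "down_proj": (28672, 8192)}
for name, (fin, fout) in shapes.items():
    gib = estimate_linear_mem_bytes(fin, fout, batch_size=8, seq_len=2048) / 1024**3
    print(f"{name}: ~{gib:.2f} GiB")
# gate_proj/up_proj: ~0.69 GiB each; down_proj: ~1.31 GiB despite the same
# weight size, because its in_features is 3.5x larger.
```

With the activation term included, a device map based on weight size alone can underestimate what a card actually needs once the calibration batch size grows.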

xin3he added the WIP label Sep 29, 2025
xin3he marked this pull request as draft September 29, 2025 06:18