Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B #13662

Triggered via pull request May 8, 2025 15:32
Status Success
Total duration 1h 18m 7s

server.yml

on: pull_request
Matrix: server