CUDA: Optimize rms_norm_f32 kernel and its fused variants, giving 1-6% perf E2E #30328

Triggered via pull request September 3, 2025 13:29
Status Success
Total duration 22m 35s

editorconfig.yml

on: pull_request
editorconfig 16s