Description
During piecewise CUDA graph warmup compilation for nvidia/Qwen3.5-397B-A17B-NVFP4 (4-GPU, FP4 quantization) on B200, a segmentation fault occurs in the Triton NVIDIA driver backend while executing the fused MoE experts kernel.
The crash happens at ~93% of the "Compiling num tokens" phase (69/74 iterations).
Error Stack Trace
Fatal Python error: Segmentation fault
Current thread (most recent call first):
File "triton/backends/nvidia/driver.py", line 668 in inner
File "triton/backends/nvidia/driver.py", line 712 in __call__
File "triton/runtime/jit.py", line 757 in run
File "triton_kernels/matmul_ogs.py", line 467 in matmul_ogs
File "sglang/srt/layers/moe/fused_moe_triton/triton_kernels_moe.py", line 306 in triton_kernel_fused_experts_with_bias
File "sglang/srt/layers/moe/moe_runner/triton_kernels.py", line 115 in run
File "sglang/srt/layers/moe/moe_runner/runner.py", line 117 in run
File "sglang/srt/layers/quantization/unquant.py", line 423 in forward_cuda
File "sglang/srt/layers/moe/fused_moe_triton/layer.py", line 1034 in run_moe_core
File "sglang/srt/layers/moe/fused_moe_triton/layer.py", line 1013 in forward_impl
File "sglang/srt/models/gpt_oss.py", line 269 in moe_impl
...
File "sglang/srt/model_executor/piecewise_cuda_graph_runner.py", line 406 in warmup_compile
File "sglang/srt/model_executor/piecewise_cuda_graph_runner.py", line 309 in __init__
File "sglang/srt/model_executor/model_runner.py", line 2450 in init_piecewise_cuda_graphs
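The "Fatal Python error: Segmentation fault" header followed by Python frames is the format produced by Python's standard `faulthandler` module. As a minimal sketch, assuming someone wants the same kind of trace from a standalone reproduction script, the handler can be enabled explicitly. The helper name `enable_crash_tracebacks` is hypothetical and is not part of sglang or Triton.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Hypothetical helper: make fatal signals (SIGSEGV, SIGABRT, ...) dump
    the Python traceback of every thread before the process dies."""
    # faulthandler needs a real file descriptor, so default to the original
    # stderr rather than a possibly replaced sys.stderr.
    target = stream if stream is not None else sys.__stderr__
    if not faulthandler.is_enabled():
        faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


enabled = enable_crash_tracebacks()
```

The same effect is available without code changes by setting `PYTHONFAULTHANDLER=1` or passing `-X faulthandler` to the interpreter.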
Environment
- GPU: NVIDIA B200
- Model: nvidia/Qwen3.5-397B-A17B-NVFP4 (FP4 quantization, TP=4)
- Attention backend: trtllm_mha
- CI job: stage-c-test-4-gpu-b200 (0) in PR Test run
Analysis
The segfault originates in the Triton NVIDIA driver backend (triton/backends/nvidia/driver.py, line 668), reached through `__call__` in the same file from `run` in triton/runtime/jit.py while launching the MoE matmul_ogs kernel. This appears to be a Triton + B200 (SM100) driver-level issue during CUDA graph warmup.
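One way to narrow down a crash like this is to run each candidate reproduction step in a child process, so a segfault kills only the child and the parent can record which step failed. The sketch below is illustrative only: `run_isolated` and `crashed_with_segfault` are hypothetical helpers, and the snippet passed in is a stand-in that raises SIGSEGV on itself, not the actual sglang warmup. It assumes a POSIX system, where a child killed by signal N reports return code -N.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0):
    """Run a Python snippet in a child interpreter with faulthandler on.

    Returns (returncode, stderr) so the caller can inspect the crash trace.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def crashed_with_segfault(returncode: int) -> bool:
    # POSIX convention: a negative return code -N means "killed by signal N".
    return returncode == -signal.SIGSEGV


# Stand-in for a real reproduction step: the child sends SIGSEGV to itself.
stand_in = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
returncode, stderr_text = run_isolated(stand_in)
segfaulted = crashed_with_segfault(returncode)
```

In a real investigation the stand-in string would be replaced by a script that performs a single warmup step, which is left out here because the exact invocation is not given in this report.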