-
|
Hi everyone, I am encountering a crash when running Sionna RT (v1.2.1) on an NVIDIA H100 NVL GPU inside a Docker container.
Issue: The script crashes at paths.cir() with the following error:
Here is a snippet that reproduces the crash:
import os
import tensorflow as tf
import mitsuba as mi
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'
try:
    mi.set_variant('cuda_ad_rgb')
    print(" Mitsuba variant set to: cuda_ad_rgb")
except Exception as e:
    print(f" Failed to set variant: {e}")
    exit()
from sionna.rt import load_scene, PlanarArray, Transmitter, Receiver, PathSolver
import sionna.rt.scenes
scenes_path = os.path.dirname(sionna.rt.scenes.__file__)
scene_path = os.path.join(scenes_path, 'simple_reflector', 'simple_reflector.xml')
scene = load_scene(scene_path)
scene.tx_array = PlanarArray(num_rows=1, num_cols=1, pattern='iso', polarization='V')
scene.rx_array = PlanarArray(num_rows=1, num_cols=1, pattern='iso', polarization='V')
scene.add(Transmitter(name='tx', position=[0, 0, 10]))
scene.add(Receiver(name='rx', position=[10, 0, 1.5]))
solver = PathSolver()
paths = solver(scene)
print(" Computing CIR... (Crash happens here)")
try:
    a, tau = paths.cir()
    print(" CIR Computed successfully!")
except Exception as e:
    print(f"\n CRASH DETECTED:\n{e}")

Has anyone experienced this specific Dr.Jit loop issue on Hopper GPUs? Is there a recommended Dr.Jit version or workaround for H100? Thanks. |
Replies: 3 comments 2 replies
-
|
Hello @natanzi, Please share the full error log so that we can better understand what is going wrong. |
-
|
Hello @merlinND, BR, |
-
|
@merlinND Thanks, the problem is solved. Just a quick question: I’m currently using one H100, but I also have access to other GPUs like H200 and H100. Is there any way to perform ray tracing or channel generation across multiple GPUs in parallel, or via some other approach? Best regards, |
It seems the issue is simply due to trying to use the cuda_ad_rgb variant instead of the cuda_ad_mono_polarized variant. I don't think it's related to H100.
Did you add the mi.set_variant() call for a specific reason? Sionna RT should manage the variants automatically (including falling back to LLVM if CUDA is not available).
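
The simplest fix suggested above is to delete the mi.set_variant() call and let Sionna RT pick the variant itself. If a script really needs to select the variant explicitly, something along these lines might work. This is only a sketch: the helper names pick_variant and set_sionna_compatible_variant are hypothetical, the LLVM fallback name llvm_ad_mono_polarized is an assumption based on the reply above, and mi.variants() only lists variants compiled into the Mitsuba build, so CUDA initialization can still fail at runtime.

```python
# Variant names Sionna RT is assumed to work with, in order of preference.
# "llvm_ad_mono_polarized" is an assumption; only the CUDA name appears in the thread.
PREFERRED_VARIANTS = ("cuda_ad_mono_polarized", "llvm_ad_mono_polarized")


def pick_variant(available, preferred=PREFERRED_VARIANTS):
    """Return the first preferred variant found in `available`, or None."""
    available = set(available)
    for name in preferred:
        if name in available:
            return name
    return None


def set_sionna_compatible_variant():
    """Hypothetical helper: explicitly select a mono-polarized Mitsuba variant."""
    # Imported lazily so pick_variant() stays usable without Mitsuba installed.
    import mitsuba as mi

    # mi.variants() lists the variants compiled into this build; it does not
    # guarantee that a CUDA device is actually usable at runtime.
    name = pick_variant(mi.variants())
    if name is None:
        raise RuntimeError(f"No mono-polarized variant available: {mi.variants()}")
    mi.set_variant(name)
    return name
```

In the reproduction script, calling set_sionna_compatible_variant() would take the place of the try/except block around mi.set_variant('cuda_ad_rgb'), before importing from sionna.rt.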