Issue Description
I'm experiencing unexpectedly high latency when running image segmentation using Isaac ROS DNN Inference on a Jetson Orin Nano 8GB. The TensorRT node appears to be the primary bottleneck, with processing delays averaging ~240-260ms.
Environment
Hardware: Jetson Orin Nano 8GB
Model: PeopleSemSegNet (deployable_quantized_vanilla_unet_onnx_v2.0)
Isaac ROS Version: 3.2
JetPack Version: 6.2
CUDA Version: 12.6
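For reference, these versions can be double-checked on the device itself; the commands below are standard JetPack tooling (nvcc may need the full /usr/local/cuda path if it is not already on PATH):

```bash
# Confirm L4T/JetPack, CUDA, and TensorRT versions on the Jetson
cat /etc/nv_tegra_release            # L4T release string
/usr/local/cuda/bin/nvcc --version   # CUDA toolkit version
dpkg -l | grep -i tensorrt           # installed TensorRT packages
```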
Command Used
```bash
ros2 launch isaac_ros_examples isaac_ros_examples.launch.py \
  launch_fragments:=zed_mono_rect,unet \
  engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/peoplesemsegnet/deployable_quantized_vanilla_unet_onnx_v2.0/1/model.plan \
  input_binding_names:=['input_1:0'] \
  output_binding_names:=['argmax_1'] \
  network_output_type:='argmax' \
  interface_specs_file:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_unet/zed2_quickstart_interface_specs.json
```
Performance Measurements
TensorRT Node Input (/tensor_sub)
```bash
ros2 topic delay /tensor_sub --window 100
```
- Average delay: ~240-260ms
- Min: 108ms
- Max: 348ms
- Std dev: ~40ms
TensorRT Node Output (/tensor_pub)
```bash
ros2 topic delay /tensor_pub --window 100
```
- Average delay: ~200-214ms
- Min: 123ms
- Max: 287ms
- Std dev: ~30ms
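For anyone reproducing this, the same stamp-delay measurement can be taken at each stage to localize where time accumulates. The topic names below assume the standard isaac_ros_unet quickstart graph; `ros2 topic list` will show the actual names in a given setup:

```bash
# Per-stage delay and throughput (quickstart topic names assumed)
ros2 topic list                                            # confirm actual topic names
ros2 topic delay /tensor_sub --window 100                  # encoder output / TensorRT input
ros2 topic delay /tensor_pub --window 100                  # TensorRT output
ros2 topic delay /unet/raw_segmentation_mask --window 100  # final segmentation mask
ros2 topic hz /tensor_pub --window 100                     # inference throughput
```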
Analysis
The latency measurements show that:
- Delay at the TensorRT node input (/tensor_sub) already averages 240-260ms
- The TensorRT node itself appears to add significant processing time (see the trtexec sketch below for isolating this)
- Processing times vary considerably (std dev ~40ms on the input topic)
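To separate raw engine execution time from ROS transport and pipeline overhead, the plan file can be benchmarked directly with trtexec. This is a generic TensorRT check rather than an Isaac ROS-specific step; the engine path matches the launch command above:

```bash
# Benchmark the engine outside the ROS graph; if GPU compute time here is
# only a few milliseconds, the latency is accumulating in the surrounding
# pipeline, not in inference itself
/usr/src/tensorrt/bin/trtexec \
  --loadEngine=${ISAAC_ROS_WS}/isaac_ros_assets/models/peoplesemsegnet/deployable_quantized_vanilla_unet_onnx_v2.0/1/model.plan \
  --iterations=100 \
  --avgRuns=10
```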
Expected Behavior
For a Jetson Orin Nano 8GB running a quantized UNet model, I would expect much lower latency, ideally in the 140-160ms range for real-time image-to-mask performance.
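One low-effort check worth doing first: part of the delay variance could come from dynamic CPU/GPU frequency scaling. Locking the clocks with the standard Jetson power tools rules this out (power mode IDs vary by device; `sudo nvpmodel -q` shows the current one):

```bash
# Pin clocks to eliminate DVFS-induced latency variance
sudo nvpmodel -m 0    # highest power mode on most Jetson configs; verify the ID first
sudo jetson_clocks    # lock CPU/GPU/EMC clocks at their maximum
```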
