@AditiThirdEye

Implements the RMSNormalization operator for the TensorRT ONNX parser, enabling deployment of modern transformer architectures (LLaMA, Mistral, etc.) that use RMSNorm instead of LayerNorm.

Implementation details:

  • Computes Y = (X / sqrt(mean(X^2) + epsilon)) * scale
  • Supports FP32, FP16, and BF16 data types
  • Handles axis attribute for normalization dimensions
  • Supports epsilon and stash_type attributes per ONNX spec
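
For reference, the formula and the axis handling listed above can be sketched in plain Python. This is not the importer code from this PR; it is a minimal illustration that assumes a flat row-major buffer, a `scale` whose length matches the normalized dimensions, and the ONNX default values `axis=-1` and `epsilon=1e-5`. The function name `rms_norm` and its parameters are hypothetical. `stash_type` is not modelled because Python floats are always double precision.

```python
import math


def rms_norm(x, shape, scale, axis=-1, epsilon=1e-5):
    """Reference RMSNormalization over a flat row-major buffer (illustrative only).

    x:     flat list of floats, length == prod(shape)
    shape: tensor dimensions
    scale: flat list of length prod(shape[axis:]) (simplifying assumption)
    """
    rank = len(shape)
    if axis < 0:
        axis += rank  # a negative axis counts from the last dimension
    # Dimensions [axis, rank) are normalized together. In row-major layout
    # they form contiguous chunks of `inner` elements.
    inner = math.prod(shape[axis:])
    assert len(x) == math.prod(shape), "x does not match shape"
    assert len(scale) == inner, "scale must cover the normalized dimensions"

    y = []
    for start in range(0, len(x), inner):
        chunk = x[start:start + inner]
        mean_sq = sum(v * v for v in chunk) / inner  # mean(X^2)
        rms = math.sqrt(mean_sq + epsilon)           # sqrt(mean(X^2) + epsilon)
        y.extend(v / rms * s for v, s in zip(chunk, scale))
    return y


# Normalize each row of a 2x2 tensor over its last axis.
out = rms_norm([3.0, 4.0, 1.0, 1.0], shape=[2, 2], scale=[1.0, 1.0])
```

With a unit `scale` and a small `epsilon`, the mean of the squared outputs in each normalized chunk is close to 1, which is a quick sanity check for any implementation of the operator.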

Changes:

  • onnxOpImporters.cpp: Add RMSNormalization importer using TensorRT primitive operations (ElementWise, Reduce, Unary)
  • onnxOpCheckers.cpp: Add empty checker for RMSNormalization
  • docs/operators.md: Add RMSNormalization to supported operators matrix
  • onnx_backend_test.py: Include RMSNormalization tests

Fixes onnx/onnx-tensorrt#4639 (via NVIDIA/TensorRT#4639)

Signed-off-by: Aditi_Pandey <54734131+AditiThirdEye@users.noreply.github.com>
@AditiThirdEye
Author

@kevinch-nv @yuanyao-nv Could you please review this PR when you have a chance?

This adds ONNX opset 23 RMSNormalization support, enabling deployment of modern LLM architectures (LLaMA, Mistral, etc.) that use RMSNorm.

Related issue: NVIDIA/TensorRT#4639

Thanks!

@yuanyao-nv
Collaborator

Thanks for your contribution. RMSNorm support will be available in the 10.15 release; please stay tuned.

@AditiThirdEye
Author

Thanks @yuanyao-nv! Just to clarify - will this PR be considered for the 10.15 release, or is there already an internal implementation in progress?
Happy to address any feedback if you'd like to use this PR, or close it if there's already work underway internally.

@yuanyao-nv
Collaborator

It's the latter. There is already an internal implementation ready to be released in 10.15.
