[Bug] [Relax][ONNX] NaN handling inconsistency in Relu, Sign, ReduceMax, ReduceMin vs ONNX Runtime #19572

@wuyii8941

Summary

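Several operators handle NaN differently from ONNX Runtime (ORT) when a model is imported through the Relax ONNX frontend. In each line below, the value after the arrow is TVM's result:

  1. Relu(NaN) → 0 (ORT: NaN)
  2. Sign(NaN) → 0 (ORT: NaN)
  3. ReduceMax/ReduceMin: the result depends on where the NaN sits in the reduced axis, and TVM disagrees with ORT in both positions:
    • ReduceMax([NaN, 1.0]) → 1.0 (ORT: NaN)
    • ReduceMax([2.0, NaN]) → NaN (ORT: 2.0)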
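
For orientation, the two common NaN conventions can be seen side by side in NumPy, which offers a propagating and an ignoring variant of max. This is only a neutral illustration of the two conventions, not a statement of what the ONNX spec, ORT, or TVM are required to do:

```python
import numpy as np

nan = np.float32(np.nan)
zero = np.float32(0.0)

# Propagating convention: any NaN operand yields NaN.
relu_propagating = np.maximum(nan, zero)
# Ignoring ("missing value") convention: NaN is skipped.
relu_ignoring = np.fmax(nan, zero)
# np.sign returns NaN for a NaN input.
sign_of_nan = np.sign(nan)

x = np.array([[np.nan, 1.0], [2.0, np.nan]], dtype=np.float32)
reduce_propagating = np.max(x, axis=1)   # NaN wins in every row
reduce_ignoring = np.nanmax(x, axis=1)   # NaN skipped in every row

print(relu_propagating, relu_ignoring, sign_of_nan)
print(reduce_propagating, reduce_ignoring)
```

Under either convention the two rows of `x` are treated the same way. The ReduceMax results listed above match neither convention for both rows at once.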

Related: #xxx (bug_019, reduce_max/min NaN CPU vs CUDA at Relax IR level)

Reproduction

import numpy as np
import onnx
from onnx import helper, TensorProto, numpy_helper
import onnxruntime as ort
import tvm
from tvm import relax
from tvm.relax.frontend.onnx import from_onnx

def run_tvm(model, inputs):
    model = onnx.shape_inference.infer_shapes(model)
    mod = from_onnx(model)
    pipeline = tvm.ir.transform.Sequential([relax.transform.LegalizeOps()])
    exe = tvm.relax.build(pipeline(mod), target="llvm")
    vm = tvm.relax.VirtualMachine(exe, device=tvm.cpu())
    tvm_ins = [tvm.runtime.tensor(v, device=tvm.cpu()) for v in inputs]
    return vm["main"](*tvm_ins).numpy()

# Relu
x = np.array([np.nan, 1.0, np.nan, -2.0], dtype=np.float32)
X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [4])
Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [4])
node = helper.make_node("Relu", ["X"], ["Y"])
graph = helper.make_graph([node], "test", [X], [Y])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])

sess = ort.InferenceSession(model.SerializeToString())
print("ORT Relu:", sess.run(None, {"X": x})[0])   # [nan  1. nan  0.]
print("TVM Relu:", run_tvm(model, [x]))             # [0.   1. 0.   0.]

# ReduceMax
x2 = np.array([[np.nan, 1.0], [2.0, np.nan]], dtype=np.float32)
X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [2, 2])
Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, None)
axes_init = numpy_helper.from_array(np.array([1], dtype=np.int64), "axes")
node = helper.make_node("ReduceMax", ["X", "axes"], ["Y"], keepdims=0)
graph = helper.make_graph([node], "test", [X], [Y], initializer=[axes_init])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])

sess = ort.InferenceSession(model.SerializeToString())
print("ORT ReduceMax:", sess.run(None, {"X": x2})[0])  # [nan  2.]
print("TVM ReduceMax:", run_tvm(model, [x2]))           # [ 1. nan]

Root cause

  • Relu: lowered to max(x, 0) with fmax semantics, so NaN is treated as a missing value and the result is 0
  • Sign: lowered to a comparison chain (x > 0 → 1, x < 0 → -1, else 0); every comparison with NaN is false, so NaN falls through to the default 0
  • ReduceMax/Min: lowered to a left fold with fmax/fmin, so whether NaN survives depends on its position in the fold order
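
The mechanisms above can be sketched in plain Python. This is a minimal model, not TVM's generated code: `select_max`, `fmax_like`, `fold`, and `sign_chain` are hypothetical helpers, and the assumption that the reduction's max is a comparison-and-select (`a if a > b else b`) starting from -inf is ours. We chose that form because it reproduces the TVM output reported above, whereas a strict fmax fold would ignore NaN in both positions.

```python
import math

def select_max(a, b):
    # Comparison-and-select max. Any comparison with NaN is False,
    # so the second operand is returned whenever either side is NaN.
    return a if a > b else b

def fmax_like(a, b):
    # fmax-style max: a NaN operand is treated as missing.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b

def fold(op, values, init=-math.inf):
    # Left fold over the reduced axis, starting from the identity of max.
    acc = init
    for v in values:
        acc = op(acc, v)
    return acc

def sign_chain(x):
    # Comparison chain with a default branch; NaN fails both tests.
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0

nan = math.nan
print(fold(select_max, [nan, 1.0]), fold(select_max, [2.0, nan]))
print(fold(fmax_like, [nan, 1.0]), fold(fmax_like, [2.0, nan]))
print(fmax_like(nan, 0.0), sign_chain(nan))
```

In this model, a leading NaN is overwritten by the next element while a trailing NaN is the last value selected, which is the position dependence described in the ReduceMax bullet.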

Note

We acknowledge that the ONNX spec does not normatively require NaN propagation for these operators. However, ONNX Runtime (the reference implementation) propagates NaN for Relu and Sign and, in the reproduction above, for ReduceMax when the NaN comes first, while TVM returns a different result in every case listed. This causes silent numerical divergence when migrating models between runtimes, so we report it as a behavioral inconsistency.
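
Because the divergence is silent, a migration test may want to compare NaN positions explicitly rather than only comparing finite values. A minimal sketch, where `nan_mismatch` is a hypothetical helper and the two arrays are the Relu outputs copied from the reproduction comments:

```python
import numpy as np

def nan_mismatch(reference, candidate):
    """Return the indices where exactly one of the two arrays holds NaN."""
    reference = np.asarray(reference)
    candidate = np.asarray(candidate)
    return np.argwhere(np.isnan(reference) != np.isnan(candidate))

# Relu outputs as reported in the reproduction above.
ort_relu = np.array([np.nan, 1.0, np.nan, 0.0], dtype=np.float32)
tvm_relu = np.array([0.0, 1.0, 0.0, 0.0], dtype=np.float32)

mismatch = nan_mismatch(ort_relu, tvm_relu)
print(mismatch.ravel())
```

An empty result means both runtimes agree on where NaN appears; the remaining finite values can then be compared with the usual tolerance check.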

Environment

  • TVM: 0.24.dev0, commit 0b0afd8 (2026-04-24)
  • Python: 3.11
  • OS: Linux

cc @KJlaccHoeUM9l @junrushao

Labels: needs-triage, type: bug
