[Relax] Fix flaky test_conv2d_offload by increasing float32 tolerance (#18455)

tlopex · web-flow · commit ebd169b547ac · 2025-11-18T10:42:01.000-05:00
The `test_conv2d_offload` test for float32 dtype was intermittently
failing in CI with errors like:
```
Mismatched elements: 17 / 524288 (0.00324%)
Max absolute difference: 0.02001762
Max relative difference: 3193.5
```
The test used `rtol=1e-2, atol=1e-2` (0.01), which may be too strict
for comparing the cuDNN output against the LLVM reference. An element
passes when `|out - ref| <= atol + rtol * |ref|`, so the observed max
absolute difference of ~0.02 exceeds the bound wherever the reference
value is small, and the test fails intermittently. This PR raises the
float32 tolerance from `1e-2` to `2.5e-2` (0.025) to accommodate the
observed numerical differences between the cuDNN and LLVM convolution
implementations.
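
A minimal sketch of the tolerance arithmetic, assuming
`tvm.testing.assert_allclose` applies the same element-wise criterion as
`np.testing.assert_allclose`. The helper `within_tolerance` and the
sample values are hypothetical and chosen only to mirror the magnitudes
in the CI log above; they are not taken from the test itself.

```python
import numpy as np


def within_tolerance(actual, desired, rtol, atol):
    """Element-wise NumPy-style check: |actual - desired| <= atol + rtol * |desired|."""
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    return np.abs(actual - desired) <= atol + rtol * np.abs(desired)


# Hypothetical reference values: one near zero, one of ordinary magnitude.
ref = np.array([6.0e-6, 1.5])
# Hypothetical outputs: the first is off by ~0.02, as in the CI log.
out = ref + np.array([0.02001762, 0.001])

# Old float32 tolerance: the near-zero element fails because the
# rtol term contributes almost nothing when |ref| is tiny.
old_ok = within_tolerance(out, ref, rtol=1e-2, atol=1e-2)

# New float32 tolerance: atol alone (0.025) now covers the ~0.02 error.
new_ok = within_tolerance(out, ref, rtol=2.5e-2, atol=2.5e-2)

print("old tolerance passes:", old_ok.tolist())
print("new tolerance passes:", new_ok.tolist())
```

With a reference value this close to zero, the absolute term dominates
the bound, which is why a small `atol` increase is enough to absorb the
error.
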
diff --git a/tests/python/relax/test_codegen_cudnn.py b/tests/python/relax/test_codegen_cudnn.py
@@ -197,7 +197,9 @@ def test_conv2d_offload(data_shape, weight_shape, dtype, with_bias, activation):
         # see https://github.com/apache/tvm/pull/18319
         tvm.testing.assert_allclose(out, ref, rtol=3e-1, atol=3e-1)
     else:
-        tvm.testing.assert_allclose(out, ref, rtol=1e-2, atol=1e-2)
+        # Increased tolerance to 2.5e-2 to prevent flaky test due to numerical
+        # differences between cuDNN and LLVM implementations
+        tvm.testing.assert_allclose(out, ref, rtol=2.5e-2, atol=2.5e-2)
 
 
 @pytest.mark.skip(reason="flaky test")