[hipDNN] Improve sample failure output with tensor diffs by Bingtagui404 · Pull Request #5591 · ROCm/rocm-libraries

Bingtagui404 · 2026-03-19T00:37:51Z

Description

When a CPU validation failure occurs in the hipDNN samples, the output only tells you which tensor failed — not how it failed:

CPU reference validation:
  y: failed

This PR adds a reusable tensor diff utility and wires it into all sample validation paths so failures automatically print detailed diagnostics:

CPU reference validation:
  y: failed
  Tensor diff for "y":
    Total elements: 65536
    Mismatched:     42 (0.06%)
    Max abs diff:   1.234567e-03 at [0, 2, 14, 7]
    Mean abs diff:  3.456789e-04
    Worst mismatches:
      [0, 2, 14, 7]: ref=0.543210, impl=0.544444, diff=1.234567e-03
      ...

Changes

New: test_sdk/.../utilities/TensorDiff.hpp — header-only utility providing:
- computeTensorDiff<T>() — element-wise comparison with summary statistics
- printTensorDiffSummary() — formatted output
- validateAndReport<T>() — drop-in replacement combining allClose() + status print + diff on failure
Modified: All 14 sample .cpp files to use validateAndReport<T>() instead of manual allClose() + cout pattern
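
For readers without the diff open, here is a minimal sketch of the kind of pass `computeTensorDiff<T>()` is described as doing. It is not the code from `TensorDiff.hpp`: the names `Mismatch`, `DiffSummary`, `unravel`, and `sketchTensorDiff` are hypothetical. The tolerance test (`atol + rtol * |ref|`), taking the mean over all elements, and the row-major index layout are assumptions as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; the real header may differ.
struct Mismatch {
    std::size_t flatIndex;
    double ref;
    double impl;
    double absDiff;
};

struct DiffSummary {
    std::size_t total = 0;
    std::size_t mismatched = 0;
    double maxAbsDiff = 0.0;
    std::size_t maxAbsDiffIndex = 0;
    double meanAbsDiff = 0.0;      // assumption: mean over all elements
    std::vector<Mismatch> worst;   // largest absDiff first
};

// Convert a flat row-major offset into per-dimension coordinates,
// e.g. for printing a location such as [0, 2, 14, 7].
inline std::vector<std::size_t> unravel(std::size_t flat,
                                        const std::vector<std::size_t>& shape) {
    std::vector<std::size_t> coords(shape.size());
    for (std::size_t d = shape.size(); d-- > 0;) {
        assert(shape[d] > 0);
        coords[d] = flat % shape[d];
        flat /= shape[d];
    }
    return coords;
}

template <typename T>
DiffSummary sketchTensorDiff(const std::vector<T>& ref,
                             const std::vector<T>& impl,
                             double atol, double rtol,
                             std::size_t maxMismatches) {
    assert(ref.size() == impl.size());  // the caller checks shapes first
    DiffSummary s;
    s.total = ref.size();
    double sum = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double r = static_cast<double>(ref[i]);
        const double v = static_cast<double>(impl[i]);
        const double d = std::fabs(r - v);
        sum += d;
        if (d > s.maxAbsDiff) {
            s.maxAbsDiff = d;
            s.maxAbsDiffIndex = i;
        }
        if (d > atol + rtol * std::fabs(r)) {  // assumed tolerance rule
            ++s.mismatched;
            s.worst.push_back({i, r, v, d});
        }
    }
    if (s.total > 0) {
        s.meanAbsDiff = sum / static_cast<double>(s.total);
    }
    // Keep only the maxMismatches largest differences, sorted descending.
    const std::size_t keep = std::min(maxMismatches, s.worst.size());
    const auto mid = s.worst.begin() + static_cast<std::ptrdiff_t>(keep);
    std::partial_sort(s.worst.begin(), mid, s.worst.end(),
                      [](const Mismatch& a, const Mismatch& b) {
                          return a.absDiff > b.absDiff;
                      });
    s.worst.erase(mid, s.worst.end());
    return s;
}
```

Collecting every mismatch and trimming afterwards keeps the sketch short. A bounded heap would avoid the extra memory when most of a large tensor mismatches.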

Safety

- Shape/element-count mismatches are detected before element-wise comparison to prevent out-of-bounds access
- computeTensorDiff runs single-threaded to avoid data races in summary accumulation
- maxMismatches == 0 is handled as "summary only" mode
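
The three guards above can be illustrated with a small standalone sketch. This is an assumption-laden example rather than the PR's implementation: `sketchSafeDiffReport`, its single absolute `tolerance` parameter, and the report format are all hypothetical. For brevity it lists the first mismatches found instead of the worst ones.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper; names and output format are illustrative only.
template <typename T>
std::string sketchSafeDiffReport(const std::vector<std::size_t>& refShape,
                                 const std::vector<T>& ref,
                                 const std::vector<std::size_t>& implShape,
                                 const std::vector<T>& impl,
                                 double tolerance,
                                 std::size_t maxMismatches) {
    std::ostringstream out;
    // Guard 1: never index either buffer unless shapes and sizes agree.
    if (refShape != implShape || ref.size() != impl.size()) {
        out << "shape mismatch: ref has " << ref.size()
            << " elements, impl has " << impl.size() << "\n";
        return out.str();
    }
    // Guard 2: a plain sequential loop, so the counter needs no locking.
    std::size_t mismatched = 0;
    std::ostringstream details;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double d = std::fabs(static_cast<double>(ref[i]) -
                                   static_cast<double>(impl[i]));
        if (d <= tolerance) {
            continue;
        }
        ++mismatched;
        // Guard 3: with maxMismatches == 0 no entries are listed,
        // which yields the "summary only" report.
        if (mismatched <= maxMismatches) {
            details << "  [" << i << "]: diff=" << d << "\n";
        }
    }
    assert(mismatched <= ref.size());
    out << "mismatched: " << mismatched << " of " << ref.size() << "\n"
        << details.str();
    return out.str();
}
```

Returning the report as a string instead of writing to `std::cout` is a choice made here so the behaviour is easy to assert on. The PR description says the real helper prints directly.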

Fixes #5547

Add a reusable TensorDiff utility to test_sdk that computes element-wise tensor comparisons and prints summary statistics (mismatch count, max/mean absolute error, worst mismatches) when CPU validation fails. Wire it into all 14 sample validation paths via a validateAndReport<T>() helper so failures automatically print the diff instead of just "failed". Shape mismatches are detected separately and reported without attempting element-wise comparison to avoid out-of-bounds access. Fixes ROCm#5547

Bingtagui404 requested a review from a team as a code owner March 19, 2026 00:37

github-actions bot added the project: hipdnn label Mar 19, 2026

assistant-librarian bot added the external contribution label Mar 19, 2026

Bingtagui404 wants to merge 1 commit into ROCm:develop from Bingtagui404:users/Bingtagui404/tensor-diff-utility
