SSIM implementation #1988
Replies: 1 comment
-
Thank you for highlighting this issue with the SSIM implementation, where small and large Gaussian kernel sigmas lead to padding and empty-slicing problems that result in NaN values. Your analysis of the padding calculations causing empty slices, and of the double removal of padding (valid convolution followed by manual cropping), sounds accurate and seems to be a legitimate concern, especially for small image stacks like those in medical imaging. This type of issue with edge handling and padding size checks is critical for stable SSIM computation.

I recommend first making sure you are using the latest version of TorchMetrics, as there have been recent fixes and improvements related to SSIM padding behavior and numerical stability in the last few releases (e.g., v1.7.4 in July 2025 includes some SSIM-related fixes).

If the problem persists in the latest release, it would be great to open a detailed issue or pull request with your proposed fix and test cases reflecting these corner cases (small z-dimension stacks, small and large sigma values). This will help the maintainers evaluate and integrate a solution more quickly. Thank you again for your detailed investigation and suggestions!
-
Hi all,
Is the SSIM implementation in src/torchmetrics/functional/image/ssim.py really stable?
For a Gaussian kernel and small sigmas, e.g. sigma=0.2, we can get a 2D kernel size of [1,1] or a 3D kernel size of [1,1,1], which leads to padding values of 0, see Line 127ff.
This in turn leads to degenerate tensors in Line 166ff
due to empty slicing (0:-0), and finally to NaN values in the return statement.
A simple solution for this would be to avoid the negative indexing, e.g. by modifying the respective line in the following manner
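To illustrate the idea of avoiding negative indexing, here is a minimal pure-Python sketch. It is not the TorchMetrics code: it uses a plain list in place of a tensor axis, and the helper names `crop_negative` and `crop_explicit` are hypothetical. The same slicing rule applies to tensors, where a mean over the resulting empty tensor is what would produce NaN.

```python
def crop_negative(values, pad):
    # Mirrors the problematic pattern: for pad == 0 this is values[0:-0],
    # which Python reads as values[0:0], i.e. an empty slice.
    return values[pad:-pad]


def crop_explicit(values, pad):
    # Computing the end index explicitly avoids the -0 pitfall.
    return values[pad:len(values) - pad]


row = [0.9, 0.8, 0.95, 0.85]  # stand-in for one axis of an SSIM map

empty = crop_negative(row, 0)   # nothing is left, although pad is 0
full = crop_explicit(row, 0)    # the whole row is kept
inner = crop_explicit(row, 1)   # one element removed on each side

print(empty, full, inner)
```

For pad > 0 both variants return the same slice, so the explicit form only changes behaviour in the degenerate pad == 0 case.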
Similarly, for large sigmas (i.e. large kernels), the removal of padding as implemented in Line 166ff can also lead to NaN values in the aforementioned setting. A sufficient condition that it works would be for example
This is due to the fact that the valid convolution in Line 149 eats up the padding, and afterwards a margin of size padding is removed again in Line 166ff. Is this the intended behaviour? To first order, this appears to be a complicated variant of directly applying a valid convolution to the unpadded tensor.
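The size bookkeeping described above can be sketched per axis as follows. This is only an illustration under stated assumptions: an odd kernel size with symmetric padding of (kernel_size - 1) // 2 per side, a stride-1 valid convolution, and a final crop of `pad` on each side. The function name `ssim_map_length` is hypothetical and not part of TorchMetrics.

```python
def ssim_map_length(n, kernel_size):
    """Length of one axis of the SSIM map after pad, valid conv, and crop."""
    pad = (kernel_size - 1) // 2             # assumed symmetric padding per side
    padded = n + 2 * pad                     # axis length after padding
    after_conv = padded - (kernel_size - 1)  # valid convolution, stride 1
    cropped = after_conv - 2 * pad           # margin of `pad` removed per side
    return max(cropped, 0)                   # 0 means an empty axis


# Kernel size 11 corresponds to a padding of 5 per side.
print(ssim_map_length(20, 11))  # 20 -> 30 -> 20 -> 10
print(ssim_map_length(8, 11))   # 8 -> 18 -> 8 -> empty
```

Under these assumptions the convolution restores the original length n, and the crop then reduces it to n - 2 * pad, which matches the length a valid convolution on the unpadded axis would give. The axis is non-empty only when n > 2 * pad.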
IMHO this is an important issue when applying this in medical imaging. E.g. in MRI we often encounter the case where the z dimension is small (a stack of a few MRI slices, each slice having a high resolution). In the current implementation, a relatively high fraction of slices may be removed for the final SSIM calculation.
For example: sigma=1.5, gaussian_kernel=True --> padding approx 5 --> 5 top and 5 bottom slices from the SSIM map will be ignored due to Line 166ff. When we have a stack of 20 slices, e.g. image tensor size 20x200x200, this feels limiting. Any thoughts?
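For the numbers in this example, a quick back-of-the-envelope calculation shows how much of the volume still contributes. It assumes a padding of exactly 5 is cropped on both sides of all three axes, which is an assumption for the 3D case and not taken from the library code.

```python
from math import prod

shape = (20, 200, 200)  # (z, y, x) image tensor size from the example
pad = 5                 # assumed margin cropped on each side of every axis

# Size of the SSIM map that remains after cropping.
kept = tuple(max(s - 2 * pad, 0) for s in shape)

z_fraction = kept[0] / shape[0]            # share of slices still used
voxel_fraction = prod(kept) / prod(shape)  # share of voxels still used

print(kept, z_fraction, voxel_fraction)
```

Under these assumptions only 10 of the 20 slices remain, and a bit less than half of all voxels enter the final SSIM value.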
Best, Matthias