SSIM implementation #1988

MatthiasLen · 2023-08-08T11:30:42Z

Hi all,

Is the SSIM implementation in src/torchmetrics/functional/image/ssim.py really stable?

With a Gaussian kernel and a small sigma, e.g. sigma=0.2, the kernel size can become [1, 1] in 2D or [1, 1, 1] in 3D, which leads to padding values of 0, see line 127 ff.

    pad_h = (gauss_kernel_size[0] - 1) // 2
    pad_w = (gauss_kernel_size[1] - 1) // 2

This in turn leads to degenerate tensors in line 166 ff.

    if is_3d:
        ssim_idx = ssim_idx_full_image[..., pad_h:-pad_h, pad_w:-pad_w, pad_d:-pad_d]
    else:
        ssim_idx = ssim_idx_full_image[..., pad_h:-pad_h, pad_w:-pad_w]

because the slice pad_h:-pad_h becomes the empty slice 0:-0, and the mean over the resulting empty tensor in the return statement is NaN

    return ssim_idx.reshape(ssim_idx.shape[0], -1).mean(-1)

A simple fix would be to avoid the negative indexing, e.g. by modifying the respective line as follows (shown for the 3D case)

        s = ssim_idx_full_image.shape
        ssim_idx = ssim_idx_full_image[..., pad_h:s[-3]-pad_h, pad_w:s[-2]-pad_w, pad_d:s[-1]-pad_d]
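The difference between the two indexing styles can be reproduced without the library. The following is a minimal sketch on a plain Python list; the helper names `crop_negative_end` and `crop_explicit_end` are made up for illustration and are not part of TorchMetrics. Tensor slicing follows the same rule: `-0` is just `0`, so `x[0:-0]` is `x[0:0]`.

```python
def crop_negative_end(values, pad):
    """Crop `pad` items from both ends using a negative end index."""
    return values[pad:-pad]


def crop_explicit_end(values, pad):
    """Crop `pad` items from both ends using an explicit end index."""
    return values[pad:len(values) - pad]


# Hypothetical row of an SSIM map.
row = [0.91, 0.87, 0.93, 0.89]

# pad == 0: the negative form collapses to row[0:0] and returns nothing,
# while the explicit form returns the full row.
empty = crop_negative_end(row, 0)
full = crop_explicit_end(row, 0)

# pad > 0: both forms agree.
inner_negative = crop_negative_end(row, 1)
inner_explicit = crop_explicit_end(row, 1)

print(empty, full, inner_negative, inner_explicit)
```

The explicit form only differs from the negative form when the padding is 0, so it should not change results for kernels that already work.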

Similarly, for large sigmas (i.e. large kernels), removing the padding as implemented in line 166 ff. can also lead to NaN values. A sufficient condition for it to work would be, for example,

    assert 2 * pad_h < target.shape[2]
    assert 2 * pad_w < target.shape[3]
    assert 2 * pad_d < target.shape[4]

This is because the valid convolution in line 149 consumes the padding, and afterwards a margin of the padding size is removed again in line 166 ff. Is this the intended behaviour? To first order, this appears to be a complicated variant of directly applying a valid convolution to the unpadded tensor.
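To make the pad, valid convolution, crop sequence concrete, here is a 1D sketch in plain Python. It is not the TorchMetrics code: `valid_conv1d` and `pad_conv_crop` are hypothetical helpers, the kernel is an arbitrary odd-sized example, and the padding is filled with a constant instead of whatever mode the library uses. Under these assumptions the cropped result does not depend on the padded values at all and matches a valid convolution of the unpadded signal, leaving `size - 2 * pad` elements.

```python
def valid_conv1d(signal, kernel):
    """'Valid' 1D correlation: keep only windows fully inside `signal`."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def pad_conv_crop(signal, kernel, fill=0.0):
    """Pad by (k - 1) // 2, apply a valid convolution, then crop the same margin."""
    pad = (len(kernel) - 1) // 2
    padded = [fill] * pad + list(signal) + [fill] * pad
    same_size = valid_conv1d(padded, kernel)  # same length as `signal`
    # Explicit end index, so pad == 0 does not yield an empty slice.
    return same_size[pad:len(same_size) - pad]


signal = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0]
kernel = [0.25, 0.5, 0.25]  # size 3, so pad == 1

direct = valid_conv1d(signal, kernel)
via_padding = pad_conv_crop(signal, kernel)
via_other_fill = pad_conv_crop(signal, kernel, fill=99.0)

# A signal with size <= 2 * pad leaves nothing after cropping.
too_short = pad_conv_crop([1.0, 2.0], kernel)

print(direct, via_padding, via_other_fill, too_short)
```

The last case is the one the proposed assertions guard against: when `2 * pad` is not smaller than the axis size, the cropped map is empty and a mean over it is undefined.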

IMHO this is an important issue when applying SSIM in medical imaging. E.g. in MRI we often encounter the case where the z dimension is small (a stack of a few MRI slices, each slice having a high resolution). In the current implementation, a relatively high fraction of slices may be excluded from the final SSIM calculation.

For example: sigma=1.5, gaussian_kernel=True --> padding approx. 5 --> the 5 top and 5 bottom slices of the SSIM map are ignored due to line 166 ff. With a stack of 20 slices, e.g. an image tensor of size 20x200x200, this feels limiting. Any thoughts?
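The numbers in this example can be worked out with a few lines of arithmetic. This sketch assumes a padding of exactly 5 on every axis (the post says "approx 5" for the slice axis) and uses a hypothetical helper `kept_after_crop`.

```python
def kept_after_crop(size, pad):
    """Number of positions left after removing `pad` from both ends of an axis."""
    return max(size - 2 * pad, 0)


shape = (20, 200, 200)  # slices x height x width, as in the example above
pad = 5                 # assumed identical on all three axes

kept = [kept_after_crop(n, pad) for n in shape]
fractions = [k / n for k, n in zip(kept, shape)]

for axis, n, k, f in zip(("z", "y", "x"), shape, kept, fractions):
    print(f"{axis}: {k} of {n} positions kept ({f:.0%})")
```

Under these assumptions only half of the slices contribute to the final SSIM value, while 95% of each in-plane axis is kept.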

Best, Matthias

Borda · 2025-08-11T15:28:35Z

Maintainer

Thank you for highlighting this issue with the SSIM implementation regarding small and large Gaussian kernel sigmas leading to padding and empty slicing problems, which result in NaN values.

Your analysis about the padding calculations causing empty slices and the double removal of padding (valid convolution followed by manual cropping) sounds accurate and seems to be a legitimate concern, especially for small image stacks like in medical imaging.

This type of issue with edge handling and padding size checks is critical for stable SSIM computation.

I recommend first making sure you are using the latest version of TorchMetrics, as there have been recent fixes and improvements related to SSIM padding behavior and numerical stability in the last few releases (e.g., v1.7.4 in July 2025 includes some SSIM-related fixes).

If the problem persists in the latest release, it would be great to open a detailed issue or pull request with your proposed fix and test cases reflecting these corner cases (small z-dimension stacks, small and large sigma values). This will help the maintainers evaluate and integrate a solution more quickly.

Thank you again for your detailed investigation and suggestions!
