MaskDataset axis swap causes XY misalignment for 3D masks (Transpose indices wrong) #9

@Cerann

First of all, thank you 🙏 for open-sourcing this repo. It has been very useful for experimenting with landmark detection networks, and the codebase is clean and easy to extend. I think I found one small issue, of a kind commonly seen in ITK-to-NumPy conversion.

When using MaskDataset with 3D data, the mask axis order is incorrect after loading. This leads to a consistent X/Y swap, both for landmarks extracted via mask[0].nonzero() and for masks generated by _create_mask, so the image and the landmarks/heatmaps are misaligned visually and in the metrics.

Environment / Context:

  • MaskDataset from landmarker.data
  • MONAI LoadImage(image_only=True, ensure_channel_first=True)
  • NIfTI/ITK data (nibabel/SimpleITK pipelines)
  • 3D case (spatial_dims=3)

What happens now
LoadImage(..., ensure_channel_first=True) returns shapes:

  • 2D: (C, H, W)
  • 3D: (C, H, W, D)

Current code then applies:

# 2D
Transpose(indices=[0, 2, 1])    # (C, W, H)
# 3D
Transpose(indices=[0, 3, 2, 1]) # (C, D, W, H)

Later code treats masks/images as (C, D, H, W) and assumes landmarks are in (z, y, x) order. Because the mask was transposed to (C, D, W, H), mask[0].nonzero() returns (z, x, y) instead. This misplaces the landmarks and any heatmaps generated from them: the Z slice looks right, but X and Y are swapped.
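
A minimal NumPy sketch that reproduces the swap on a toy array. It assumes np.transpose permutes axes the same way MONAI's Transpose does, and the mask shape and landmark position are made up for illustration (this is not code from the repo):

```python
import numpy as np

# Hypothetical toy mask in the layout described above: (C, H, W, D).
C, H, W, D = 1, 4, 5, 6
mask = np.zeros((C, H, W, D), dtype=np.uint8)

# One landmark voxel at y=1 (H axis), x=2 (W axis), z=3 (D axis).
y, x, z = 1, 2, 3
mask[0, y, x, z] = 1

# Current 3D permutation: (C, H, W, D) -> (C, D, W, H).
current = np.transpose(mask, (0, 3, 2, 1))
print(current.shape)  # (1, 6, 5, 4)

# Reading the landmark back from channel 0 yields (z, x, y), not (z, y, x).
coords = tuple(int(axis[0]) for axis in current[0].nonzero())
print(coords)  # (3, 2, 1)
```

With the expected (z, y, x) convention the landmark should read (3, 1, 2), so the last two coordinates come out exchanged.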

Proposed fix

 if mask_paths is not None:
     if mask_paths[0].endswith(".npy"):
         self.mask_loader = Compose(
             [
                 LoadImage(image_only=True, ensure_channel_first=True),
             ]
         )
     elif self.spatial_dims == 2:
         self.mask_loader = Compose(
             [
                 LoadImage(image_only=True, ensure_channel_first=True),
-                Transpose(indices=[0, 2, 1]),
+                # No transpose needed: LoadImage returns (C, H, W)
             ]
         )
     else:
         self.mask_loader = Compose(
             [
                 LoadImage(image_only=True, ensure_channel_first=True),
-                Transpose(indices=[0, 3, 2, 1]),
+                Transpose(indices=[0, 3, 1, 2]),  # (C, H, W, D) -> (C, D, H, W)
             ]
         )
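
A quick sanity check of the proposed indices, again as a NumPy-only sketch. The helper first_nonzero and the toy masks are hypothetical, and np.transpose stands in for MONAI's Transpose:

```python
import numpy as np

def first_nonzero(mask, indices=None):
    """Optionally permute axes, then return the first nonzero index of channel 0."""
    if indices is not None:
        mask = np.transpose(mask, indices)
    return tuple(int(axis[0]) for axis in mask[0].nonzero())

# 3D toy mask in (C, H, W, D) layout with one voxel at y=1, x=2, z=3.
mask_3d = np.zeros((1, 4, 5, 6), dtype=np.uint8)
mask_3d[0, 1, 2, 3] = 1

# 2D toy mask in (C, H, W) layout with one pixel at y=1, x=2.
mask_2d = np.zeros((1, 4, 5), dtype=np.uint8)
mask_2d[0, 1, 2] = 1

fixed_3d = first_nonzero(mask_3d, (0, 3, 1, 2))  # (C, H, W, D) -> (C, D, H, W)
fixed_2d = first_nonzero(mask_2d)                # no transpose in the 2D branch

print(fixed_3d)  # (3, 1, 2), i.e. (z, y, x)
print(fixed_2d)  # (1, 2), i.e. (y, x)
```

A check along these lines could be turned into a small regression test for the mask loader, provided the loader really returns (C, H, W, D) for the data in question.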

Labels: bug, documentation
