First of all, thank you 🙏 for open-sourcing this repo. It’s been very useful for experimenting with landmark detection networks, and the codebase is clean and easy to extend. I think I found one small issue, of a kind commonly seen in ITK-to-NumPy conversion.
When using MaskDataset with 3D data, the mask channel/axes order is incorrect after loading. This leads to a consistent X–Y swap for landmarks extracted via mask[0].nonzero() and masks generated by _create_mask, causing visual/metric misalignment between image and landmark/heatmap.
Environment / Context:
- MaskDataset from landmarker.data
- MONAI LoadImage(image_only=True, ensure_channel_first=True)
- NIfTI/ITK data (nibabel/SimpleITK pipelines)
- 3D case (spatial_dims=3)
What happens now
LoadImage(..., ensure_channel_first=True) returns shapes:
- 2D: (C, H, W)
- 3D: (C, H, W, D)
Current code then applies:
# 2D
Transpose(indices=[0, 2, 1]) # (C, W, H)
# 3D
Transpose(indices=[0, 3, 2, 1]) # (C, D, W, H)
Later code treats masks/images as (C, D, H, W) and assumes landmarks are in (z, y, x) order. Because the mask was transposed to (C, D, W, H), nonzero() returns (z, x, y) instead. This misplaces landmarks and any generated heatmaps: the Z slice looks right, but X and Y are swapped.
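The swap can be reproduced with plain NumPy, without MONAI or landmarker. The sketch below is only an illustration under assumptions: it takes the (C, H, W, D) layout stated above as given, uses np.transpose as a stand-in for the Transpose step, and the array sizes and voxel position are made up.

```python
import numpy as np

# Hypothetical mask in the (C, H, W, D) layout described above.
C, H, W, D = 1, 4, 5, 6
mask = np.zeros((C, H, W, D), dtype=np.uint8)

# Mark a single landmark voxel at y=1, x=2, z=3.
y, x, z = 1, 2, 3
mask[0, y, x, z] = 1

# np.transpose stands in for Transpose(indices=[0, 3, 2, 1]).
transposed = np.transpose(mask, (0, 3, 2, 1))  # (C, D, W, H)

# Read the landmark back via mask[0].nonzero(); NumPy returns one
# index array per axis, so take the first (only) entry of each.
coords = tuple(int(axis[0]) for axis in transposed[0].nonzero())

print(transposed.shape)  # (1, 6, 5, 4)
print(coords)            # (3, 2, 1), i.e. (z, x, y) rather than (z, y, x)
```

With only one voxel set, the returned tuple makes the axis order directly visible: the second and third entries are x and y, not y and x.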
Proposed fix
 if mask_paths is not None:
     if mask_paths[0].endswith(".npy"):
         self.mask_loader = Compose(
             [
                 LoadImage(image_only=True, ensure_channel_first=True),
             ]
         )
     elif self.spatial_dims == 2:
         self.mask_loader = Compose(
             [
                 LoadImage(image_only=True, ensure_channel_first=True),
-                Transpose(indices=[0, 2, 1]),
+                # No transpose needed: LoadImage returns (C, H, W)
             ]
         )
     else:
         self.mask_loader = Compose(
             [
                 LoadImage(image_only=True, ensure_channel_first=True),
-                Transpose(indices=[0, 3, 2, 1]),
+                Transpose(indices=[0, 3, 1, 2]),  # (C, H, W, D) -> (C, D, H, W)
             ]
         )
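A quick way to sanity-check the proposed permutations is a NumPy-only test along these lines. This is a sketch under assumptions, not code from the repo: np.transpose stands in for Transpose, the helper name landmark_after_transpose is hypothetical, and the (C, H, W) / (C, H, W, D) input layouts are taken from the description above.

```python
import numpy as np


def landmark_after_transpose(shape, voxel, indices):
    """Set one voxel in a channel-first mask, permute axes, return its new coords.

    Hypothetical helper for illustration; not part of landmarker or MONAI.
    """
    mask = np.zeros(shape, dtype=np.uint8)
    mask[(0, *voxel)] = 1  # channel 0, then the spatial index
    permuted = np.transpose(mask, indices)
    return tuple(int(axis[0]) for axis in permuted[0].nonzero())


y, x, z = 1, 2, 3

# 3D: input assumed (C, H, W, D); proposed indices give (C, D, H, W).
new_3d = landmark_after_transpose((1, 4, 5, 6), (y, x, z), (0, 3, 1, 2))
assert new_3d == (z, y, x)

# 2D: input assumed (C, H, W); identity permutation models "no transpose".
new_2d = landmark_after_transpose((1, 4, 5), (y, x), (0, 1, 2))
assert new_2d == (y, x)
```

Using distinct sizes for H, W, and D matters here: with a cubic volume, a wrong permutation would still produce a plausible shape and could go unnoticed.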