Feature Request: add min_mask_ratio to DeepFeatureExtractor (through SemanticSegmentor.filter_coordinates()) #901

@GeorgeBatch

  • TIA Toolbox version: 1.6.0
  • Python version: 3.11
  • Operating System: Linux

Description

Existing functionality: When using SlidingWindowPatchExtractor, we can specify min_mask_ratio to filter out patches with little tissue.

Proposed feature: Add a min_mask_ratio option to DeepFeatureExtractor. At present, when computing features with DeepFeatureExtractor, we rely on the SemanticSegmentor.filter_coordinates() method, which, as far as I understand from the definition of sel_func(), selects a patch as soon as a single pixel of the patch lies within the mask.

@staticmethod
def filter_coordinates(
    mask_reader: VirtualWSIReader,
    bounds: np.ndarray,
    resolution: Resolution | None = None,
    units: Units | None = None,
) -> np.ndarray:
    """Indicates which coordinate is valid basing on the mask.

    To use your own approaches, either subclass to overwrite or
    directly assign your own function to this name. In either cases,
    the function must obey the API defined here.

    Args:
        mask_reader (:class:`.VirtualReader`):
            A virtual pyramidal reader of the mask related to the
            WSI from which we want to extract the patches.
        bounds (ndarray and np.int32):
            Coordinates to be checked via the `func`. They must be
            in the same resolution as requested `resolution` and
            `units`. The shape of `coordinates` is (N, K) where N is
            the number of coordinate sets and K is either 2 for
            centroids or 4 for bounding boxes. When using the
            default `func=None`, K should be 4, as we expect the
            `coordinates` to be bounding boxes in `[start_x,
            start_y, end_x, end_y]` format.
        resolution (Resolution):
            Resolution of the requested patch.
        units (Units):
            Units of the requested patch.

    Returns:
        :class:`numpy.ndarray`:
            List of flags to indicate which coordinate is valid.

    Examples:
        >>> # API of function expected to overwrite `filter_coordinates`
        >>> def func(reader, bounds, resolution, units):
        ...   # as example, only select first bound
        ...   return np.array([1, 0])
        >>> coords = [[0, 0, 256, 256], [128, 128, 384, 384]]
        >>> segmentor = SemanticSegmentor(model='unet')
        >>> segmentor.filter_coordinates = func

    """
    if not isinstance(mask_reader, VirtualWSIReader):
        msg = "`mask_reader` should be VirtualWSIReader."
        raise TypeError(msg)

    if not isinstance(bounds, np.ndarray) or not np.issubdtype(
        bounds.dtype,
        np.integer,
    ):
        msg = "`coordinates` should be ndarray of integer type."
        raise ValueError(msg)

    mask_real_shape = mask_reader.img.shape[:2]
    mask_resolution_shape = mask_reader.slide_dimensions(
        resolution=resolution,
        units=units,
    )[::-1]
    mask_real_shape = np.array(mask_real_shape)
    mask_resolution_shape = np.array(mask_resolution_shape)
    scale_factor = mask_real_shape / mask_resolution_shape
    scale_factor = scale_factor[0]  # what if ratio x != y

    def sel_func(coord: np.ndarray) -> bool:
        """Accept coord as long as its box contains part of mask."""
        coord_in_real_mask = np.ceil(scale_factor * coord).astype(np.int32)
        start_x, start_y, end_x, end_y = coord_in_real_mask
        roi = mask_reader.img[start_y:end_y, start_x:end_x]
        return np.sum(roi > 0) > 0

    flags = [sel_func(bound) for bound in bounds]
    return np.array(flags)

Benefit: Adding min_mask_ratio can significantly speed up feature computation, because patches that mostly contain background pixels are filtered out before they reach the model.
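To make the request concrete, here is a minimal sketch of what a ratio-aware replacement for filter_coordinates() could look like. It keeps the scaling logic from the method quoted above and only changes the selection rule: a box is accepted when the fraction of positive mask pixels inside it is non-zero and at least min_mask_ratio. The names make_ratio_filter and FakeMaskReader are hypothetical and exist only for illustration; the stand-in reader replaces VirtualWSIReader so that the example runs without a slide, and the exact thresholding rule is my assumption rather than a copy of what SlidingWindowPatchExtractor does.

```python
import numpy as np


def make_ratio_filter(min_mask_ratio=0.0):
    """Build a function with the `filter_coordinates` signature (hypothetical helper)."""
    if not 0.0 <= min_mask_ratio <= 1.0:
        raise ValueError("`min_mask_ratio` must be in [0, 1].")

    def filter_coordinates(mask_reader, bounds, resolution=None, units=None):
        mask = np.asarray(mask_reader.img)
        # Same scaling as the quoted method: mask pixels per pixel at the
        # requested resolution, taken from the first (row) axis only.
        mask_real_shape = np.array(mask.shape[:2])
        mask_resolution_shape = np.array(
            mask_reader.slide_dimensions(resolution=resolution, units=units)[::-1]
        )
        scale_factor = (mask_real_shape / mask_resolution_shape)[0]

        def sel_func(coord):
            start_x, start_y, end_x, end_y = np.ceil(scale_factor * coord).astype(
                np.int32
            )
            roi = mask[start_y:end_y, start_x:end_x]
            if roi.size == 0:
                return False
            # Fraction of positive mask pixels inside the box.
            ratio = np.count_nonzero(roi > 0) / roi.size
            # Assumed rule: at least one positive pixel AND ratio above threshold.
            return bool(ratio > 0 and ratio >= min_mask_ratio)

        return np.array([sel_func(bound) for bound in bounds], dtype=bool)

    return filter_coordinates


class FakeMaskReader:
    """Hypothetical stand-in exposing only `.img` and `.slide_dimensions()`."""

    def __init__(self, img, dims):
        self.img = img
        self._dims = dims  # (width, height) at the requested resolution

    def slide_dimensions(self, resolution=None, units=None):
        return self._dims


# 8x8 mask whose three left-most columns are tissue; slide is 16x16 at the
# requested resolution, so the scale factor is 0.5.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[:, :3] = 1
reader = FakeMaskReader(mask, dims=(16, 16))
bounds = np.array([[0, 0, 8, 8], [4, 0, 12, 8], [8, 0, 16, 8]])

flags_any = make_ratio_filter(0.0)(reader, bounds)   # current behaviour
flags_half = make_ratio_filter(0.5)(reader, bounds)  # at least 50% tissue
print(flags_any.tolist(), flags_half.tolist())
```

With a threshold of 0.0 the second box (25% tissue) is still selected, as in the current implementation; with 0.5 only the first box (75% tissue) survives. Following the docstring above, such a function could presumably be assigned with `segmentor.filter_coordinates = make_ratio_filter(0.5)`, although I have not verified that DeepFeatureExtractor picks it up in the same way.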
