morphology: add h_maxima, h_minima, local_maxima, local_minima. improve reconstruct #1061
Open
grlee77 wants to merge 5 commits into rapidsai:main
This is not a port of the scikit-image Cython code, but an independent implementation using already implemented GPU kernels.
These rely on cucim.skimage.morphology.grayreconstruct which is still largely CPU-based, so acceleration is only around 2x vs. scikit-image for these two functions.
…on and make it the default
local_minima and local_maxima
The `local_minima` and `local_maxima` functions are implemented via a different GPU algorithm that gives results equivalent to the scikit-image Cython code.
Side-by-Side Comparison (CPU vs. GPU algorithm)
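The GPU-side steps named in this comparison, a bulk `maximum_filter` pass followed by connected-component labeling with `ndi.label`, can be illustrated on the CPU with `scipy.ndimage`. The sketch below is one plausible way to combine the two steps, not the PR's kernel code: the function name `local_maxima_sketch`, the plateau-rejection step, and the cast to `float64` are assumptions made for illustration, and edge-case handling (such as `allow_borders` or all-constant images) is left out.

```python
import numpy as np
from scipy import ndimage as ndi


def local_maxima_sketch(image, connectivity=1):
    """Boolean mask of local maxima; equal-valued plateaus count as one unit.

    Illustrative only: a CPU stand-in for a filter-then-label approach.
    """
    image = np.asarray(image)
    footprint = ndi.generate_binary_structure(image.ndim, connectivity)

    # Bulk step: a pixel is a candidate if no neighbor is strictly larger.
    # mode="nearest" replicates the border, so no artificial neighbors appear.
    candidates = image == ndi.maximum_filter(
        image, footprint=footprint, mode="nearest"
    )

    # Two adjacent candidates must be equal in value, so every connected
    # component of the candidate mask lies inside a single plateau.
    labels, _ = ndi.label(candidates, structure=footprint)

    # A component is rejected if it touches an equal-valued non-candidate:
    # that pixel has a larger neighbor, so the plateau is not a maximum.
    values = image.astype(np.float64)  # assumption: values fit in float64
    non_candidate_values = np.where(candidates, -np.inf, values)
    neighbor_max = ndi.maximum_filter(
        non_candidate_values, footprint=footprint, mode="nearest"
    )
    touches_higher_plateau = candidates & (neighbor_max == values)
    rejected = np.unique(labels[touches_higher_plateau])

    return candidates & ~np.isin(labels, rejected)
```

Following the recipe in the comparison, a `local_minima` counterpart would apply the same function to the inverted image.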
[Comparison table not recoverable; surviving cells: `maximum_filter` (bulk); `ndi.label` (connected components); `local_minima`: `invert(image, signed_float=True)` then `local_maxima` (same entry in both columns)]
h_minima, h_maxima and reconstruction
The `h_maxima` and `h_minima` functions are trivial conversions of the corresponding scikit-image functions. The core of both is the `cucim.skimage.morphology.reconstruct` algorithm. I found that `reconstruct` still relied on a CPU fallback for the core operation, so it was still relatively inefficient. I implemented a simple iterative approach on the GPU that is much faster for typical cases but would be slower in some pathological cases. There is a new keyword-only argument that optionally selects the older CPU-based implementation, but the GPU path is now used by default.
Benchmarks
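Before the numbers, a minimal NumPy sketch of an iterative reconstruction by dilation may help when reading the two `h_maxima` benchmark sections. It is an illustration under stated assumptions, not the PR's GPU code: the helper names (`dilate_cross`, `reconstruct_by_dilation`, `h_maxima_sketch`), the 2-D connectivity-1 footprint, the requirement that the seed never exceeds the mask, and the final `>= h` threshold are all choices made here.

```python
import numpy as np


def dilate_cross(a):
    """Grayscale dilation of a 2-D array with a cross-shaped footprint."""
    padded = np.pad(a, 1, mode="edge")  # replicate borders, adds no new values
    out = a.copy()
    np.maximum(out, padded[:-2, 1:-1], out=out)  # neighbor above
    np.maximum(out, padded[2:, 1:-1], out=out)   # neighbor below
    np.maximum(out, padded[1:-1, :-2], out=out)  # neighbor to the left
    np.maximum(out, padded[1:-1, 2:], out=out)   # neighbor to the right
    return out


def reconstruct_by_dilation(seed, mask, max_iter=10_000):
    """Repeat 'dilate, then clip to mask' until nothing changes.

    Assumes seed <= mask everywhere (hypothetical helper, 2-D only).
    """
    current = np.asarray(seed)
    mask = np.asarray(mask)
    for _ in range(max_iter):
        grown = np.minimum(dilate_cross(current), mask)
        if np.array_equal(grown, current):
            return current
        current = grown
    raise RuntimeError("reconstruction did not converge within max_iter sweeps")


def h_maxima_sketch(image, h):
    """Mark maxima that rise at least h above their surroundings."""
    image = np.asarray(image, dtype=np.float64)
    rec = reconstruct_by_dilation(image - h, image)
    # Simplified threshold; dtype-specific corrections are ignored here.
    return (image - rec) >= h
```

Each sweep moves values by one pixel, so the number of sweeps grows with the longest path a seed value has to travel. Long, winding structures are presumably the kind of pathological case where an iterative scheme loses to a sequential CPU algorithm.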
local_maxima benchmarks
h_maxima benchmarks (reconstruct_on_cpu=True)
h_maxima benchmarks (reconstruct_on_cpu=False)
Benchmark script
```py
"""Benchmark local_maxima: scikit-image (CPU) vs cucim.skimage (GPU)."""

import math
import time

import cupy as cp
import numpy as np
from cupyx.profiler import benchmark
from skimage.morphology import local_maxima as local_maxima_cpu
from skimage.morphology import h_maxima as h_maxima_cpu

from cucim.skimage.morphology import local_maxima as local_maxima_gpu
from cucim.skimage.morphology import h_maxima as h_maxima_gpu

# ---------------------------------------------------------------------------
# Test configurations
# ---------------------------------------------------------------------------
shapes_2d = [
    (512, 512),
    (1024, 1024),
    (2048, 2048),
    (4096, 4096),
]
shapes_3d = [
    (64, 64, 64),
    (128, 128, 128),
    (256, 256, 256),
]
dtypes = [np.uint8, np.float32, np.float64]
connectivities_2d = [1, 2]  # 4-conn and 8-conn
connectivities_3d = [1, 3]  # 6-conn and 26-conn


def make_image(shape, dtype, rng_seed=42):
    """Create a test image with some local structure."""
    rng = np.random.default_rng(rng_seed)
    if np.issubdtype(dtype, np.integer):
        img = rng.integers(0, 256, size=shape, dtype=dtype)
    else:
        img = rng.standard_normal(shape).astype(dtype)
    return img


def run_benchmarks():
    # Markdown table header for the local_maxima results.
    header = (
        "| shape | dtype | connectivity | allow_borders "
        "| CPU (ms) | GPU (ms) | speedup |"
    )
    sep = (
        "|-------|-------|--------------|---------------"
        "|----------|----------|---------|"
    )
    ...  # timing loop over the configurations above is not shown


def _h_values_for_dtype(dtype):
    """Return h values that exercise the real code path for the given dtype."""
    ...  # body not shown


def run_h_maxima_benchmarks():
    # Markdown table header for the h_maxima results.
    header = (
        "| shape | dtype | h "
        "| CPU (ms) | GPU (ms) | speedup |"
    )
    sep = (
        "|-------|-------|------"
        "|----------|----------|---------|"
    )
    ...  # timing loop is not shown


if __name__ == "__main__":
    run_benchmarks()
    run_h_maxima_benchmarks()
```