Skip to content

Commit a6d0418

Browse files
authored
POL5826 un-hounsfelding 2D montage (PolusAI#297)
1 parent d6c9cd5 commit a6d0418

File tree

5 files changed

+245
-79
lines changed

5 files changed

+245
-79
lines changed

README.md

Lines changed: 36 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ Nyxus is a feature-rich, optimized, Python/C++ application capable of analyzing
1515

1616
Nyxus can be used via Python or command line and is available in containerized form for reproducible execution. Nyxus computes over 450 combined intensity, texture, and morphological features at the ROI or whole image level with more in development. Key features that make Nyxus unique among other image feature extraction applications is its ability to operate at any scale, its highly validated algorithms, and its modular nature that makes the addition of new features straightforward.
1717

18-
Currently, Nyxus can read image data from OME-TIFF, OME-Zarr and DICOM 2D Grayscale images. It also has a Python API to support in-memory image data via Numpy array.
18+
Currently, Nyxus can read 2D image data from OME-TIFF, OME-Zarr, and DICOM 2D Grayscale images. Nyxus also reads compressed and uncompressed NIFTI 3D files. Nyxus Python API supports featurizing in-memory 2D image data represented by NumPy arrays.
1919

2020
The docs can be found at [Read the Docs](https://nyxus.readthedocs.io/en/latest/).
2121

@@ -40,6 +40,8 @@ The library provides class `Nyxus` for 2-dimensional TIFF, OME.TIFF, OME.ZARR, a
4040

4141
Given `intensities` and `labels` folders, Nyxus pairs up intensity-segmentation mask images and extracts features from all of them. A summary of the available feature are [listed below](#available-features).
4242

43+
#### featurizing data in file system directories
44+
4345
```python
4446
from nyxus import Nyxus
4547
nyx = Nyxus (["*ALL*"])
@@ -48,6 +50,7 @@ maskDir = "/path/to/images/labels/"
4850
features = nyx.featurize_directory (intensityDir, maskDir) # selecting all the .ome.tif slides (default)
4951
```
5052

53+
#### featurizing explicitly defined lists of files
5154
Alternatively, Nyxus can process explicitly defined pairs of intensity-mask images thus specifying custom 1:N and M:N mapping between segmentation mask and intensity image files.
5255
The following example extracts all the features (note parameter "*ALL*") from intensity images 'i1', 'i2', and 'i3' related with mask images 'm1' and 'm2' via a custom mapping:
5356

@@ -68,7 +71,7 @@ features = nyx.featurize_files(
6871
False) # pass True to featurize intensity files as whole segments
6972
```
7073

71-
The result `features` variable is a Pandas dataframe similar to what is shown below. Note that if multiple segments are stored in a segmentation mask file, each segment's features in the resultcan be identified by the mask file name and segment mask label.
74+
The result variable `features` is a Pandas dataframe similar to what is shown below. Note that if multiple segments are stored in a segmentation mask file, each segment's features in the resultcan be identified by the mask file name and segment mask label.
7275

7376
| | mask_image | intensity_image | label | MEAN | MEDIAN |...| GABOR_6 |
7477
|----:|:---------------------|:---------------------|--------:|--------:|---------:|--:|-----------:|
@@ -80,7 +83,9 @@ The result `features` variable is a Pandas dataframe similar to what is shown be
8083
| ... | ... | ... | ... | ... | ... |...| ... |
8184
| 734 | p5_y0_r51_c0.ome.tif | p5_y0_r51_c0.ome.tif | 223 | 54573.3 | 54573.3 |...| 0.980769 |
8285

83-
Nyxus can also process intensity-mask pairs that are loaded as NumPy arrays using the `featurize` method. This method takes in either a single pair of 2D intensity-mask pairs
86+
#### featurizing in-memory 2D images; featurizing a montage
87+
88+
Nyxus can also featurize in-memory intensity-mask pairs that are loaded as NumPy arrays using the `featurize` method. This method takes in either a single pair of 2D intensity-mask pairs
8489
or a pair of 3D arrays containing 2D intensity and mask images. There is also two optional parameters to supply names to the resulting dataframe, .
8590

8691
```python
@@ -106,6 +111,34 @@ seg = np.array([
106111
features = nyx.featurize(intens, seg)
107112
```
108113

114+
<u>Note:</u> if array `intens` contains negative values similarly to intensities in Hounsfeld units observed in CT-scan datasets, method `featurize()` automatically adjusts values of of array `intens` while passing them to Nyxus backend so as to make them zero-based, like in the following example:
115+
116+
```
117+
import numpy as np
118+
import nyxus
119+
I = np.array([
120+
[-1024.74, -1019.67, -1005.70, -998.60, -998.66, -1005.82, -1019.65, -1024.72],
121+
[-1019.44, -1001.22, -1023.82, -1034.34, -1035.81, -1027.00, -1001.89, -1019.62],
122+
[-1011.86, -1002.17, -724.06, -521.43, -471.04, -671.30, -1006.98, -1010.62],
123+
[-1008.78, -703.58, 21.66, 44.32, 130.35, 113.37, -608.11, -1056.33],
124+
[-415.46, -106.08, 69.80, 59.70, 97.64, 120.62, -49.77, -480.57],
125+
[-464.06, -176.81, 76.79, 93.34, 131.99, 73.16, -106.70, -348.06],
126+
[-1012.75, -740.21, -502.72, -370.36, -377.42, -497.65, -719.82, -1000.82],
127+
[-1032.57, -979.63, -867.71, -815.30, -830.90, -875.08, -983.76, -1033.78]], np.float64)
128+
M = np.array([
129+
[0, 0, 0, 0, 0, 0, 0, 0],
130+
[0, 0, 1, 1, 1, 1, 0, 0],
131+
[0, 1, 1, 1, 1, 1, 1, 0],
132+
[0, 1, 1, 1, 1, 1, 1, 0],
133+
[0, 1, 1, 1, 1, 1, 1, 0],
134+
[0, 1, 1, 1, 1, 1, 1, 0],
135+
[0, 0, 1, 1, 1, 1, 0, 0],
136+
[0, 0, 0, 0, 0, 0, 0, 0]], np.uint16)
137+
nyx = nyxus.Nyxus (["*ALL_INTENSITY*"])
138+
f = nyx.featurize (I, M, intensity_names=['I'], label_names=['M'])
139+
```
140+
141+
109142
The `features` variable is a Pandas dataframe similar to what is shown below.
110143

111144
| | mask_image | intensity_image | label | MEAN | MEDIAN |...| GABOR_6 |

src/nyx/features/contour.cpp

Lines changed: 111 additions & 54 deletions
Original file line numberDiff line numberDiff line change
@@ -374,13 +374,64 @@ void ContourFeature::buildRegularContour(LR& r)
374374
r.contour.clear();
375375

376376
// gather contour pixels undecorating their intensities back to original values
377+
Pixel2 lastNonzeroPx (0, 0, 0);
377378
for (int y = 0; y < height + 2; y++)
378379
for (int x = 0; x < width + 2; x++)
379380
{
380381
size_t idx = x + y * (width + 2);
381382
auto inte = borderImage.at(idx);
382383
if (inte)
383384
{
385+
// this pixel may happen to be isolated (a speckle), nonetheless, remember it
386+
// as we'll need to report it as a degenerate contour if no properly neighbored
387+
// pixel group is found
388+
lastNonzeroPx = { x, y, inte - 1 };
389+
390+
// register a pixel only if it has any immediate neighbor
391+
bool hasNeig = false;
392+
if (x > 0) // left neighbor
393+
{
394+
size_t idxNeig = (x-1) + y * (width+2);
395+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
396+
}
397+
if (x < width-1) // right neighbor
398+
{
399+
size_t idxNeig = (x+1) + y * (width+2);
400+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
401+
}
402+
if (y > 0) // upper neighbor
403+
{
404+
size_t idxNeig = x + (y-1) * (width+2);
405+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
406+
}
407+
if (y < height-1) // lower neighbor
408+
{
409+
size_t idxNeig = x + (y+1) * (width+2);
410+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
411+
}
412+
if (x>0 && y > 0) // upper left neighbor
413+
{
414+
size_t idxNeig = (x-1) + (y-1) * (width+2);
415+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
416+
}
417+
if (x < width-1 && y > 0) // upper right neighbor
418+
{
419+
size_t idxNeig = (x+1) + (y-1) * (width+2);
420+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
421+
}
422+
if (x>0 && y < height-1) // lower left neighbor
423+
{
424+
size_t idxNeig = (x-1) + (y+1) * (width+2);
425+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
426+
}
427+
if (x < width-1 && y < height-1) // lower right neighbor
428+
{
429+
size_t idxNeig = (x+1) + (y+1) * (width+2);
430+
hasNeig = hasNeig || borderImage.at(idxNeig) != 0;
431+
}
432+
if (!hasNeig)
433+
continue;
434+
// pixel is good, save it
384435
Pixel2 p(x, y, inte - 1);
385436
r.contour.push_back(p);
386437
}
@@ -390,65 +441,71 @@ void ContourFeature::buildRegularContour(LR& r)
390441

391442
//==== Reorder the contour cloud
392443

393-
// --containers for unordered (temp) and ordered (result) pixels
394-
std::list<Pixel2> unordered(r.contour.begin(), r.contour.end());
395-
std::vector<Pixel2> ordered;
396-
ordered.reserve(unordered.size());
397-
std::vector<Pixel2> pants;
398-
399-
// --initialize vector 'ordered' with 1st pixel of 'unordered'
400-
auto itBeg = unordered.begin();
401-
Pixel2 pxTip = *itBeg;
402-
ordered.push_back(pxTip);
403-
unordered.remove(pxTip);
404-
405-
// -- tip of the ordered contour
406-
pxTip = ordered.at(0);
407-
408-
// -- harvest items of 'unordered'
409-
while (unordered.size())
444+
// are there any good candidate pixels ?
445+
if (r.contour.size())
410446
{
411-
// --find tip's neighbors
412-
std::vector<Pixel2> cands = find_cands (unordered, pxTip);
413-
if (cands.empty())
447+
// --containers for unordered (temp) and ordered (result) pixels
448+
std::list<Pixel2> unordered (r.contour.begin(), r.contour.end());
449+
std::vector<Pixel2> ordered;
450+
ordered.reserve (unordered.size());
451+
std::vector<Pixel2> pants;
452+
453+
// --initialize vector 'ordered' with 1st pixel of 'unordered'
454+
auto itBeg = unordered.begin();
455+
Pixel2 pxTip = *itBeg;
456+
ordered.push_back(pxTip);
457+
unordered.remove(pxTip);
458+
459+
// -- tip of the ordered contour
460+
pxTip = ordered.at(0);
461+
462+
// -- harvest items of 'unordered'
463+
while (unordered.size())
414464
{
415-
// -- we have a gap and need to fix it
416-
VERBOSLVL4(dump_2d_image_with_halfcontour(borderImage, unordered, ordered, pxTip, width + 2, height + 2, "\nhalfcontour:\n", ""));
417-
418-
// -- no 'break;' ,instead, jump the tip to the closest U-pixel
419-
Pixel2 pxPants;
420-
pxPants = pants.back();
421-
pxTip = pxPants;
422-
Pixel2 closest = find_closest (unordered, pxTip);
423-
424-
// -- discharge
425-
ordered.push_back (closest);
426-
unordered.remove (closest);
427-
pxTip = ordered.at(ordered.size() - 1);
465+
// --find tip's neighbors
466+
std::vector<Pixel2> cands = find_cands(unordered, pxTip);
467+
if (cands.empty())
468+
{
469+
// -- we have a gap and need to fix it
470+
VERBOSLVL4(dump_2d_image_with_halfcontour(borderImage, unordered, ordered, pxTip, width + 2, height + 2, "\nhalfcontour:\n", ""));
471+
472+
// -- no 'break;' ,instead, jump the tip to the closest U-pixel
473+
Pixel2 pxPants;
474+
pxPants = pants.back();
475+
pxTip = pxPants;
476+
Pixel2 closest = find_closest(unordered, pxTip);
477+
478+
// -- discharge
479+
ordered.push_back(closest);
480+
unordered.remove(closest);
481+
pxTip = ordered.at(ordered.size() - 1);
482+
}
483+
else
484+
{
485+
// -- register pants
486+
if (cands.size() >= 2)
487+
pants.push_back(pxTip);
488+
489+
// -- score thems
490+
std::vector<int> candScores = score_cands(cands, pxTip);
491+
492+
// -- choose the best
493+
auto itBest = std::min_element(candScores.begin(), candScores.end());
494+
int idxBest = (int)std::distance(candScores.begin(), itBest);
495+
496+
// -- discharge the found pixel from set 'unordered' and update the tip
497+
Pixel2& px = cands.at(idxBest);
498+
ordered.push_back(px);
499+
unordered.remove(px);
500+
pxTip = ordered.at(ordered.size() - 1);
501+
}
428502
}
429-
else
430-
{
431-
// -- register pants
432-
if (cands.size() >= 2)
433-
pants.push_back(pxTip);
434503

435-
// -- score thems
436-
std::vector<int> candScores = score_cands(cands, pxTip);
437-
438-
// -- choose the best
439-
auto itBest = std::min_element (candScores.begin(), candScores.end());
440-
int idxBest = (int)std::distance (candScores.begin(), itBest);
441-
442-
// -- discharge the found pixel from set 'unordered' and update the tip
443-
Pixel2& px = cands.at(idxBest);
444-
ordered.push_back(px);
445-
unordered.remove(px);
446-
pxTip = ordered.at(ordered.size() - 1);
447-
}
504+
// done sorting. Now set the ordered contour in the ROI
505+
r.contour = ordered;
448506
}
449-
450-
// done sorting. Now set the ordered contour in the ROI
451-
r.contour = ordered;
507+
else
508+
r.contour.push_back(lastNonzeroPx); // just use the last speckle as a contour because we have no legit contour
452509

453510
VERBOSLVL4(dump_2d_image_with_vertex_chain(borderImage, r.contour, width + 2, height + 2, "\n\n-- ContourFeature / buildRegularContour / Padded contour image + sorted contour--\n", "\n\n"));
454511

src/nyx/python/nyxus/nyxus.py

Lines changed: 30 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -291,9 +291,9 @@ def featurize_directory(
291291
axis=1,
292292
)
293293

294-
# Labels should always be uint.
294+
# Labels should always be uint
295295
if "ROI_label" in df.columns:
296-
df["ROI_label"] = df.ROI_label.astype(np.uint32)
296+
df.ROI_label = df.ROI_label.astype(np.uint32)
297297

298298
return df
299299

@@ -399,13 +399,24 @@ def featurize(
399399
if (label_images.shape[0] != len(label_names)):
400400
raise ValueError("Number of segmentation names must be the same as the number of images.")
401401

402+
# (1) check if the intensities are represented in Hounsfeld type scale, and (2) adjust them into the unsigned integer scale
403+
I = intensity_images
404+
min_raw_I = np.min (intensity_images)
405+
if (min_raw_I < 0):
406+
I -= min_raw_I
407+
if (not isinstance(I.flat[0], np.uint32)):
408+
I = I.astype (np.uint32)
409+
410+
# cast mask data to unsigned integer, too
411+
M = label_images.astype (np.uint32)
412+
413+
# featurize
402414
if (output_type == 'pandas'):
403415

404-
header, string_data, numeric_data, error_message = featurize_montage_imp (intensity_images, label_images, intensity_names, label_names, output_type, "")
405-
416+
header, string_data, numeric_data, error_message = featurize_montage_imp (I, M, intensity_names, label_names, output_type, "")
406417
self.error_message = error_message
407-
if(error_message != ''):
408-
print(error_message)
418+
if error_message != '':
419+
print (error_message)
409420

410421
df = pd.concat(
411422
[
@@ -415,18 +426,16 @@ def featurize(
415426
axis=1,
416427
)
417428

418-
# Labels should always be uint.
419-
if "label" in df.columns:
420-
df["label"] = df.label.astype(np.uint32)
429+
# labels should always be uint
430+
if "ROI_label" in df.columns:
431+
df.ROI_label = df.ROI_label.astype(np.uint32)
421432

422433
return df
423434

424435
else:
425436

426-
ret = featurize_montage_imp (intensity_images, label_images, intensity_names, label_names, output_type, output_path)
427-
437+
ret = featurize_montage_imp (I, M, intensity_names, label_names, output_type, output_path)
428438
self.error_message = ret[0]
429-
430439
if(self.error_message != ''):
431440
raise RuntimeError('Error calculating features: ' + error_message[0])
432441

@@ -494,9 +503,9 @@ def featurize_files (
494503
axis=1,
495504
)
496505

497-
# Labels should always be uint.
498-
if "label" in df.columns:
499-
df["label"] = df.label.astype(np.uint32)
506+
# Labels should always be uint
507+
if "ROI_label" in df.columns:
508+
df.ROI_label = df.ROI_label.astype(np.uint32)
500509

501510
return df
502511

@@ -1057,9 +1066,9 @@ def featurize_directory(
10571066
],
10581067
axis=1,
10591068
)
1060-
# Labels should always be uint.
1061-
if "label" in df.columns:
1062-
df["label"] = df.label.astype(np.uint32)
1069+
# Labels should always be uint
1070+
if "ROI_label" in df.columns:
1071+
df.ROI_label = df.ROI_label.astype(np.uint32)
10631072
return df
10641073
else:
10651074
featurize_directory_3D_imp (intensity_dir, label_dir, file_pattern, output_type, output_path)
@@ -1124,9 +1133,9 @@ def featurize_files (
11241133
axis=1,
11251134
)
11261135

1127-
# Labels should always be uint.
1128-
if "label" in df.columns:
1129-
df["label"] = df.label.astype(np.uint32)
1136+
# Labels should always be uint
1137+
if "ROI_label" in df.columns:
1138+
df.ROI_label = df.ROI_label.astype(np.uint32)
11301139

11311140
return df
11321141

0 commit comments

Comments
 (0)