⚡ Improve `Engine` Performance and Implementation #578

shaneahmed · 2023-03-31T09:55:58Z

Improve Engines performance and implementation
Redesigns the `PatchPredictor` engine using the new `EngineABC` base class.
WSIs are now processed with the same code as patches, using a WSI-based dataloader.
Intermediate output for WSIs is saved as Zarr to resolve memory issues.
Model architectures should now return their output as a dictionary.
The output can be requested as an `AnnotationStore` for visualisation with TIAViz.
Fix mypy Type Checks for cli/common.py
Adds a `PatchPredictor` engine based on `EngineABC`.
Adds a `return_probabilities` option to Params.
Removes the `merge_predictions` option from the `PatchPredictor` engine.
Defines `post_process_cache_mode`, which allows running the algorithm on WSIs.
Adds `infer_wsi` for WSI inference.
Removes `save_wsi_output`, as it is not required after post-processing.
Removes `merge_predictions` and fixes the docstring in `EngineABCRunParams`.
`compile_model` has moved to the `EngineABC` init.
Fixes a bug in `_calculate_scale_factor`.
Fixes a bug in the `class_dict` definition.
`_get_zarr_array` is now a public function, `get_zarr_array`, in misc.
`patch_predictions_as_annotations` now loops over `patch_coords` instead of `class_probs`.
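
To illustrate the last point, here is a minimal sketch of why looping over patch coordinates can be more robust than looping over class probabilities. It assumes that probabilities are optional (for example when `return_probabilities` is off), which is an interpretation rather than something stated above. The function `patch_predictions_to_records` and the record layout are hypothetical and are not the TIAToolbox implementation.

```python
def patch_predictions_to_records(patch_coords, predictions, class_dict, class_probs=None):
    """Build one annotation-like record per patch (hypothetical helper).

    The loop is driven by ``patch_coords``, which always exists, so the
    function still works when ``class_probs`` was not requested.
    """
    records = []
    for i, (x0, y0, x1, y1) in enumerate(patch_coords):
        record = {
            "box": (x0, y0, x1, y1),
            "type": class_dict[predictions[i]],
        }
        # Probabilities are attached only when they were computed.
        if class_probs is not None:
            record["probabilities"] = list(class_probs[i])
        records.append(record)
    return records


coords = [(0, 0, 224, 224), (224, 0, 448, 224)]
labels = [1, 0]
names = {0: "background", 1: "tumour"}

without_probs = patch_predictions_to_records(coords, labels, names)
with_probs = patch_predictions_to_records(
    coords, labels, names, class_probs=[[0.2, 0.8], [0.9, 0.1]]
)
```

Driving the loop from the coordinates means the number of records always matches the number of patches, whether or not probabilities are present.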

codecov · 2023-03-31T10:31:07Z

Codecov Report

❌ Patch coverage is 89.67001% with 72 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.72%. Comparing base (adc18c9) to head (b542c9a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| tiatoolbox/models/dataset/dataset_abc.py | 73.97% | 38 Missing ⚠️ |
| tiatoolbox/models/engine/io_config.py | 56.75% | 32 Missing ⚠️ |
| tiatoolbox/cli/nucleus_instance_segment.py | 66.66% | 1 Missing ⚠️ |
| tiatoolbox/utils/misc.py | 97.77% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #578      +/-   ##
===========================================
- Coverage    99.27%   94.72%   -4.56%     
===========================================
  Files           71       73       +2     
  Lines         9162     9235      +73     
  Branches      1195     1208      +13     
===========================================
- Hits          9096     8748     -348     
- Misses          40      452     +412     
- Partials        26       35       +9

* ⚡ Make WSIPatchDataset Pickleable to Support Windows Multithreading (#947)

  This PR makes the `WSIPatchDataset` class picklable by delaying the creation of the reader object until the first call to `__getitem__`. This enables the use of multiple loader workers on Windows without errors and provides significant performance improvements.

  - Delays reader object instantiation to the first `__getitem__` call instead of during initialization
  - Extracts reader creation logic into a separate `_get_reader` method
  - Stores image path and mode as instance variables for lazy initialization

  Speedup for the WSI prediction cell of the patch_prediction example notebook: 2 min 48 s with 0 loader workers -> 1 min 13 s with 4 workers.

  Note: this PR has no effect on Linux, where multiple loader workers already work because starting workers there does not require the dataset to be picklable.

* 🔀 Merge branch develop into dev-engine-abc

* 🐛 Fix reader_info read

---------

Co-authored-by: Mark Eastwood <[email protected]>
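
The lazy-initialization pattern described in #947 can be sketched as follows. This is a simplified illustration, not the actual `WSIPatchDataset` code: `FakeReader` and `LazyPatchDataset` are hypothetical stand-ins, and a `threading.Lock` is used only to mimic a reader that holds an unpicklable resource.

```python
import pickle
import threading


class FakeReader:
    """Stand-in for a WSI reader; it holds a lock, so it cannot be pickled."""

    def __init__(self, img_path, mode):
        self.img_path = img_path
        self.mode = mode
        self._lock = threading.Lock()  # unpicklable, like an open file handle

    def read_patch(self, idx):
        return f"{self.img_path}:{self.mode}:{idx}"


class LazyPatchDataset:
    """Dataset that keeps only picklable state until the first item is read."""

    def __init__(self, img_path, mode="wsi", n_patches=4):
        # Store what is needed to build the reader later, not the reader itself.
        self.img_path = img_path
        self.mode = mode
        self.n_patches = n_patches
        self.reader = None

    def _get_reader(self):
        # Create the reader on first use and reuse it afterwards.
        if self.reader is None:
            self.reader = FakeReader(self.img_path, self.mode)
        return self.reader

    def __len__(self):
        return self.n_patches

    def __getitem__(self, idx):
        return self._get_reader().read_patch(idx)


dataset = LazyPatchDataset("slide.svs")
clone = pickle.loads(pickle.dumps(dataset))  # succeeds: no reader exists yet
first_patch = clone[0]  # the reader is created here, on first access
```

Once `__getitem__` has run, the instance holds a reader and can no longer be pickled, so the dataset has to be handed to the workers before any item is read.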

# Conflicts:
#   tests/models/test_feature_extractor.py
#   tests/models/test_multi_task_segmentor.py
#   tests/models/test_nucleus_instance_segmentor.py
#   tests/models/test_patch_predictor.py
#   tests/models/test_semantic_segmentation.py
#   tiatoolbox/models/architecture/__init__.py

## Summary of Changes

### Major Additions

- **Dask Integration:**
  - Added `dask` as a dependency and integrated Dask arrays and lazy computation throughout the engine and patch predictor code.
  - Added Dask-based merging, chunking, and memory-aware processing for large images and WSIs.
- **Zarr Output Support:**
  - Added support for saving model predictions and intermediate results directly to Zarr format.
  - New CLI options and internal logic for Zarr output, including memory thresholding and chunked writes.
- **SemanticSegmentor Engine:**
  - Added a new `SemanticSegmentor` engine with Dask/Zarr support and new test coverage (`test_semantic_segmentor.py`).
  - Added CLI entrypoint for `semantic_segmentor` and removed the old `semantic_segment` CLI.
- **Enhanced CLI and Config:**
  - Added CLI options for memory threshold, unified worker options, and improved mask handling.
  - Updated YAML configs and sample data for new models and test images.
- **Utilities and Validation:**
  - Added utility functions for minimal dtype casting, patch/stride validation, and improved error handling (e.g., `DimensionMismatchError`).
  - Improved annotation store conversion for Dask arrays and Zarr-backed outputs.
- **Changes to `kwarg`**
  - Add `memory-threshold`
  - Unified `num-loader-workers` and `num-postproc-workers` into `num-workers`
  - Removed `cache_mode` as cache mode is automatically handled.

---

### Major Removals/Refactors

- **Removed Old CLI and Redundant Code:**
  - Deleted the old `semantic_segment.py` CLI and replaced it with `semantic_segmentor.py`.
  - Removed legacy cache mode and patch prediction Zarr store tests.
- **Refactored Model and Dataset APIs:**
  - Unified and simplified model inference APIs to always return arrays (not dicts) for batch outputs.
  - Refactored dataset classes to enforce patch shape validation and remove legacy “mode” logic.
- **Test Cleanup:**
  - Removed or updated tests that relied on old APIs or cache mode.
  - Refactored test assertions for new output types and Dask array handling.
- **API Consistency:**
  - Standardized function and argument names across engines, CLI, and utility modules.
  - Updated docstrings and type hints for clarity and consistency.

---

### Notable File Changes

- **New:**
  - `tiatoolbox/cli/semantic_segmentor.py`
  - `tests/engines/test_semantic_segmentor.py`
- **Removed:**
  - `tiatoolbox/cli/semantic_segment.py`
  - Old cache mode and patch Zarr store tests
- **Heavily Modified:**
  - `engine_abc.py`, `patch_predictor.py`, `semantic_segmentor.py`
  - CLI modules and test suites
  - Dataset and utility modules for Dask/Zarr compatibility

---

### Impact

- Enables scalable, parallel, and memory-efficient inference and output saving for large images.
- Simplifies downstream analysis by supporting Zarr as a native output format.
- Lays the groundwork for further Dask-based optimizations in TIAToolbox.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
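
As a rough illustration of what "minimal dtype casting" could mean, the sketch below picks the smallest integer dtype name that can hold a given value range, which reduces the size of stored predictions. The helper `minimal_int_dtype` is hypothetical and uses only the standard library; it is not the utility added in this PR, whose name and behaviour are not shown here.

```python
def minimal_int_dtype(min_value, max_value):
    """Return the name of the smallest integer dtype covering the range.

    Hypothetical helper for illustration only. Unsigned types are preferred
    when the range has no negative values.
    """
    if min_value > max_value:
        msg = "min_value must not exceed max_value"
        raise ValueError(msg)

    for bits in (8, 16, 32, 64):
        if min_value >= 0:
            # Unsigned range is [0, 2**bits - 1].
            if max_value <= 2**bits - 1:
                return f"uint{bits}"
        elif -(2 ** (bits - 1)) <= min_value and max_value <= 2 ** (bits - 1) - 1:
            # Signed range is [-2**(bits-1), 2**(bits-1) - 1].
            return f"int{bits}"

    msg = "range does not fit in a 64-bit integer"
    raise OverflowError(msg)


# Class indices for a 9-class model fit in a single byte per pixel.
label_dtype = minimal_int_dtype(0, 8)
```

For a class map over a whole slide image, one byte per pixel instead of eight is an eightfold reduction in stored size, which matters for the chunked Zarr output.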

shaneahmed added 3 commits March 24, 2023 11:18

🔧 Use pyproject.toml for bdist_wheel configuration

ef55e95

- Use `pyproject.toml` for `bdist_wheel` configuration

Merge remote-tracking branch 'origin/develop' into dev-define-engines…

49a0624

…-abc

⚡ Improve Engines performance and implementation

8ba6def

- Improve `Engines` performance and implementation

shaneahmed self-assigned this Mar 31, 2023

shaneahmed added the enhancement New feature or request label Mar 31, 2023

Merge branch 'develop' into dev-define-engines-abc

5cbcfcf

shaneahmed added this to the Release v2.0.0 milestone Apr 10, 2023

shaneahmed mentioned this pull request Apr 19, 2023

🩹 Support for np.ndarray and WSIReader in PatchPredictor #576

Closed

♻️ Refactor engines_abc.py

fac1000

- Refactor engines_abc.py

shaneahmed changed the title ~~⚡ Improve Engines Performance and Implementation~~ ⚡ Improve Engine Performance and Implementation Apr 28, 2023

shaneahmed added 9 commits May 5, 2023 22:17

Merge branch 'develop' into dev-define-engines-abc

a72d9ba

Merge branch 'develop' into dev-define-engines-abc

57ea44a

Merge branch 'develop' into dev-define-engines-abc

6618161

Merge branch 'develop' into dev-define-engines-abc

6996764

Merge branch 'develop' into dev-define-engines-abc

3584f6c

Merge branch 'develop' into dev-define-engines-abc

eada692

Merge branch 'develop' into dev-define-engines-abc

77f1992

Merge branch 'develop' into dev-define-engines-abc

a477d32

Merge branch 'develop' into dev-define-engines-abc

f3e33b9

shaneahmed linked an issue Jul 14, 2023 that may be closed by this pull request

Shifted patches when merging patch predictions! #634

Open

shaneahmed mentioned this pull request Jul 14, 2023

Shifted patches when merging patch predictions! #634

Open

shaneahmed and others added 8 commits July 21, 2023 17:17

Merge branch 'develop' into dev-define-engines-abc

7d35285

Merge branch 'develop' into dev-define-engines-abc

7bad284

[pre-commit.ci] auto fixes from pre-commit.com hooks

36fd629

for more information, see https://pre-commit.ci

Merge branch 'develop' into dev-define-engines-abc

443141c

Merge branch 'develop' into dev-define-engines-abc

b9d8c38

[pre-commit.ci] auto fixes from pre-commit.com hooks

e608f7b

for more information, see https://pre-commit.ci

Merge branch 'develop' into dev-define-engines-abc

1d7f5c0

[pre-commit.ci] auto fixes from pre-commit.com hooks

b956bf5

for more information, see https://pre-commit.ci

pre-commit-ci bot and others added 30 commits April 10, 2025 10:33

[pre-commit.ci] auto fixes from pre-commit.com hooks

34b3204

for more information, see https://pre-commit.ci

Merge branch 'develop' into dev-define-engines-abc

37775db

Merge branch 'develop' into dev-define-engines-abc

d29737e

Merge branch 'develop' into dev-define-engines-abc

569d9ec

Merge branch 'develop' into dev-define-engines-abc

a4bf97b

# Conflicts:
#   tiatoolbox/utils/misc.py

Merge branch 'develop' into dev-define-engines-abc

ff0bb20

Merge branch 'develop' into dev-define-engines-abc

7737c1b

Merge branch 'develop' into dev-define-engines-abc

51fbfa8

Merge branch 'develop' into dev-define-engines-abc

7998c03

# Conflicts:
#   tests/models/test_feature_extractor.py
#   tiatoolbox/models/models_abc.py

Merge branch 'develop' into dev-define-engines-abc

6f6cb33

🔀 Merge branch 'develop' into dev-define-engines-abc

1edc2b3

# Conflicts:
#   tiatoolbox/cli/common.py
#   tiatoolbox/cli/nucleus_instance_segment.py
#   tiatoolbox/cli/patch_predictor.py
#   tiatoolbox/models/engine/semantic_segmentor.py

🐛 Fix FBT001

06a9cb0

🐛 Fix mypy checks

f7abbe8

Merge branch 'develop' into dev-define-engines-abc

616fb84

Merge branch 'develop' into dev-define-engines-abc

d2381c0

# Conflicts:
#   tiatoolbox/models/dataset/classification.py

Merge branch 'develop' into dev-define-engines-abc

283bd22

Merge branch 'develop' into dev-define-engines-abc

56867fc

Merge branch 'develop' into dev-define-engines-abc

f36abe4

Merge branch 'develop' into dev-define-engines-abc

72d3474

# Conflicts:
#   tests/models/test_patch_predictor.py

🐛 Fix Use a raw string or re.escape() to make the intention explicit

ca49b18

🔀 Merge develop into dev-engine-abc

e3520ba

[pre-commit.ci] auto fixes from pre-commit.com hooks

efdbf4f

for more information, see https://pre-commit.ci

🐛 Fix ruff error

2f1ca4a

[pre-commit.ci] auto fixes from pre-commit.com hooks

cf36794

for more information, see https://pre-commit.ci

🔥 Remove redundant import

050986f

✅ Update tests to use track_tmp_path for clean up

f5a4c35

Merge branch 'develop' into dev-define-engines-abc

31b7995
