Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors the reader module by introducing a registry-based system to manage slide image readers, replacing the previous utility functions in utils.py. The refactoring adds a centralized ReaderRegistry class with automatic reader discovery and makes ReaderBase an abstract base class with formal abstract methods.
Key changes:
- Introduced
ReaderRegistryclass with decorator-based reader registration - Replaced utility functions (
try_reader,get_reader) with registry-basedREADERS.try_open() - Enhanced
ReaderBasewith abstract method enforcement andis_available()class method - Added new
FastSlideReaderimplementation
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| wsidata/reader/utils.py | Removed entire file containing legacy reader utility functions |
| wsidata/reader/_reader_registry.py | New registry system with ReaderRegistry class and @register decorator |
| wsidata/reader/base.py | Enhanced with ABC inheritance, abstract methods, and is_available() method |
| wsidata/reader/openslide.py | Added @register decorator and pkg_namespaces attribute |
| wsidata/reader/tiffslide.py | Added @register decorator and pkg_namespaces attribute |
| wsidata/reader/cucim.py | Added @register decorator, name, and pkg_namespaces attributes |
| wsidata/reader/bioformats.py | Added @register decorator and pkg_namespaces attribute |
| wsidata/reader/fastslide.py | New reader implementation for FastSlide library |
| wsidata/reader/init.py | Updated imports to use READERS registry instead of utility functions |
| wsidata/io/_wsi.py | Updated to use READERS.try_open() instead of try_reader() |
| wsidata/init.py | Exported READERS instead of get_reader |
| docs/source/api/readers.rst | Added documentation for the new READERS registry usage |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def try_open(self, img_path: str, reader: str = None) -> ReaderBase: | ||
| if reader is not None: | ||
| reader = self[reader] | ||
| reader.is_available(raise_error=True) | ||
| return reader(img_path) | ||
| for reader in self.priority: | ||
| reader = self[reader] | ||
| if reader.is_available(): | ||
| try: | ||
| return reader(img_path) | ||
| except Exception: # noqa | ||
| continue | ||
| raise ValueError(f"Cannot open image '{img_path}' using any of the readers.") |
There was a problem hiding this comment.
The new READERS registry and its try_open method lack direct test coverage. While the functionality is indirectly tested through open_wsi, consider adding tests that directly validate:
- Reader registration and retrieval
try_openbehavior with valid/invalid reader names- Fallback behavior when readers are unavailable
- The
is_availablemethod
| if isinstance(cls.pkg_namespaces, str): | ||
| cls.pkg_namespaces = [cls.pkg_namespaces] | ||
| for pkg in cls.pkg_namespaces: |
There was a problem hiding this comment.
The is_available method mutates the class attribute pkg_namespaces by converting it from a string to a list. This modification persists across all calls and could lead to unexpected behavior. Consider using a local variable instead:
@classmethod
def is_available(cls, raise_error=False):
try:
pkg_namespaces = cls.pkg_namespaces if isinstance(cls.pkg_namespaces, list) else [cls.pkg_namespaces]
for pkg in pkg_namespaces:
import_module(pkg)
except ModuleNotFoundError:
# ... rest of the code| if isinstance(cls.pkg_namespaces, str): | |
| cls.pkg_namespaces = [cls.pkg_namespaces] | |
| for pkg in cls.pkg_namespaces: | |
| pkg_namespaces = cls.pkg_namespaces if isinstance(cls.pkg_namespaces, list) else [cls.pkg_namespaces] | |
| for pkg in pkg_namespaces: |
|
|
||
|
|
There was a problem hiding this comment.
The FastSlideReader class is missing the @register decorator. Without registration, this reader won't be available in the READERS registry and cannot be used through the new registry system. Add the decorator:
from ._reader_registry import register
@register(name="fastslide")
class FastSlideReader(ReaderBase):| from ._reader_registry import register | |
| @register(name="fastslide") |
| for reader in self.priority: | ||
| reader = self[reader] | ||
| if reader.is_available(): | ||
| try: | ||
| return reader(img_path) |
There was a problem hiding this comment.
The try_open method will raise a KeyError if any reader name in the priority list is not registered. The loop at line 73-74 calls self[reader] which raises KeyError for missing readers. This prevents the method from trying subsequent readers in the priority list. Consider using self._readers.get(reader) with a None check or wrapping the lookup in a try-except block.
| for reader in self.priority: | |
| reader = self[reader] | |
| if reader.is_available(): | |
| try: | |
| return reader(img_path) | |
| for reader_name in self.priority: | |
| reader_cls = self._readers.get(reader_name) | |
| if reader_cls is None: | |
| continue | |
| if reader_cls.is_available(): | |
| try: | |
| return reader_cls(img_path) |
wsidata/reader/_reader_registry.py
Outdated
| def __repr__(self) -> str: | ||
| lines = [] | ||
| for name, available in self._repr_reader_order(): | ||
| lines.append(f"{name} ({'✓ Available' if available else '✗ Not Install'})") |
There was a problem hiding this comment.
Spelling error in the message: "Not Install" should be "Not Installed".
wsidata/reader/_reader_registry.py
Outdated
| for name, available in self._repr_reader_order(): | ||
| rows.append( | ||
| f"<tr><td style='text-align:left'>{escape(str(name))}</td>" | ||
| f"<td style='text-align:left'>{'✓ Available' if available else '✗ Not Install'}</td></tr>" |
There was a problem hiding this comment.
Spelling error in the message: "Not Install" should be "Not Installed".
|
|
||
| 1. create the reader | ||
| 2. assign the reader to self._reader | ||
| 2. use set_reader to set the reader |
There was a problem hiding this comment.
The documentation suggests using set_reader() to set the reader, but all existing implementations (OpenSlideReader, TiffSlideReader, etc.) directly assign to self._reader. Either update the documentation to reflect actual usage patterns or update all implementations to use the set_reader() method. For consistency, the documentation should say "assign the reader to self._reader" as it was before the refactor.
| 2. use set_reader to set the reader | |
| 2. assign the reader to self._reader |
|
|
||
| 1. Close the reader | ||
| 2. Set self._reader to None | ||
| 2. use set_reader to set the reader to None |
There was a problem hiding this comment.
The documentation suggests using set_reader() to set the reader to None, but all existing implementations (OpenSlideReader, TiffSlideReader, etc.) directly assign to self._reader. For consistency with the actual implementation pattern, the documentation should say "Set self._reader to None" as it was before the refactor.
| 2. use set_reader to set the reader to None | |
| 2. Set self._reader to None |
| def _repr_html_(self) -> str: | ||
| # HTML representation suitable for Jupyter rich display | ||
| rows = [ | ||
| "<table>\n<thead>\n<tr><th style='text-align:left'>Name</th>" |
There was a problem hiding this comment.
Implicit string concatenation. Maybe missing a comma?
| "<table>\n<thead>\n<tr><th style='text-align:left'>Name</th>" | |
| "<table>\n<thead>\n<tr><th style='text-align:left'>Name</th>", |
Refactored the codebase for the reader module to improve readability, maintainability, and performance. No functional changes have been introduced.
Closes #43