Skip to content

Refactor reader codebase#46

Merged
Mr-Milk merged 3 commits intomainfrom
fastslide
Nov 27, 2025
Merged

Refactor reader codebase#46
Mr-Milk merged 3 commits intomainfrom
fastslide

Conversation

@Mr-Milk
Copy link
Member

@Mr-Milk Mr-Milk commented Nov 27, 2025

Refactored the codebase for the reader module to improve readability, maintainability, and performance. No functional changes have been introduced.

Closes #43

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the reader module by introducing a registry-based system to manage slide image readers, replacing the previous utility functions in utils.py. The refactoring adds a centralized ReaderRegistry class with automatic reader discovery and makes ReaderBase an abstract base class with formal abstract methods.

Key changes:

  • Introduced ReaderRegistry class with decorator-based reader registration
  • Replaced utility functions (try_reader, get_reader) with registry-based READERS.try_open()
  • Enhanced ReaderBase with abstract method enforcement and is_available() class method
  • Added new FastSlideReader implementation

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
wsidata/reader/utils.py Removed entire file containing legacy reader utility functions
wsidata/reader/_reader_registry.py New registry system with ReaderRegistry class and @register decorator
wsidata/reader/base.py Enhanced with ABC inheritance, abstract methods, and is_available() method
wsidata/reader/openslide.py Added @register decorator and pkg_namespaces attribute
wsidata/reader/tiffslide.py Added @register decorator and pkg_namespaces attribute
wsidata/reader/cucim.py Added @register decorator, name, and pkg_namespaces attributes
wsidata/reader/bioformats.py Added @register decorator and pkg_namespaces attribute
wsidata/reader/fastslide.py New reader implementation for FastSlide library
wsidata/reader/init.py Updated imports to use READERS registry instead of utility functions
wsidata/io/_wsi.py Updated to use READERS.try_open() instead of try_reader()
wsidata/init.py Exported READERS instead of get_reader
docs/source/api/readers.rst Added documentation for the new READERS registry usage

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +68 to +80
def try_open(self, img_path: str, reader: str = None) -> ReaderBase:
if reader is not None:
reader = self[reader]
reader.is_available(raise_error=True)
return reader(img_path)
for reader in self.priority:
reader = self[reader]
if reader.is_available():
try:
return reader(img_path)
except Exception: # noqa
continue
raise ValueError(f"Cannot open image '{img_path}' using any of the readers.")
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new READERS registry and its try_open method lack direct test coverage. While the functionality is indirectly tested through open_wsi, consider adding tests that directly validate:

  • Reader registration and retrieval
  • try_open behavior with valid/invalid reader names
  • Fallback behavior when readers are unavailable
  • The is_available method

Copilot uses AI. Check for mistakes.
Comment on lines +301 to +303
if isinstance(cls.pkg_namespaces, str):
cls.pkg_namespaces = [cls.pkg_namespaces]
for pkg in cls.pkg_namespaces:
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The is_available method mutates the class attribute pkg_namespaces by converting it from a string to a list. This modification persists across all calls and could lead to unexpected behavior. Consider using a local variable instead:

@classmethod
def is_available(cls, raise_error=False):
    try:
        pkg_namespaces = cls.pkg_namespaces if isinstance(cls.pkg_namespaces, list) else [cls.pkg_namespaces]
        for pkg in pkg_namespaces:
            import_module(pkg)
    except ModuleNotFoundError:
        # ... rest of the code
Suggested change
if isinstance(cls.pkg_namespaces, str):
cls.pkg_namespaces = [cls.pkg_namespaces]
for pkg in cls.pkg_namespaces:
pkg_namespaces = cls.pkg_namespaces if isinstance(cls.pkg_namespaces, list) else [cls.pkg_namespaces]
for pkg in pkg_namespaces:

Copilot uses AI. Check for mistakes.
Comment on lines +5 to +6


Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FastSlideReader class is missing the @register decorator. Without registration, this reader won't be available in the READERS registry and cannot be used through the new registry system. Add the decorator:

from ._reader_registry import register

@register(name="fastslide")
class FastSlideReader(ReaderBase):
Suggested change
from ._reader_registry import register
@register(name="fastslide")

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +77
for reader in self.priority:
reader = self[reader]
if reader.is_available():
try:
return reader(img_path)
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try_open method will raise a KeyError if any reader name in the priority list is not registered. The loop at line 73-74 calls self[reader] which raises KeyError for missing readers. This prevents the method from trying subsequent readers in the priority list. Consider using self._readers.get(reader) with a None check or wrapping the lookup in a try-except block.

Suggested change
for reader in self.priority:
reader = self[reader]
if reader.is_available():
try:
return reader(img_path)
for reader_name in self.priority:
reader_cls = self._readers.get(reader_name)
if reader_cls is None:
continue
if reader_cls.is_available():
try:
return reader_cls(img_path)

Copilot uses AI. Check for mistakes.
def __repr__(self) -> str:
lines = []
for name, available in self._repr_reader_order():
lines.append(f"{name} ({'✓ Available' if available else '✗ Not Install'})")
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error in the message: "Not Install" should be "Not Installed".

Copilot uses AI. Check for mistakes.
for name, available in self._repr_reader_order():
rows.append(
f"<tr><td style='text-align:left'>{escape(str(name))}</td>"
f"<td style='text-align:left'>{'✓ Available' if available else '✗ Not Install'}</td></tr>"
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error in the message: "Not Install" should be "Not Installed".

Copilot uses AI. Check for mistakes.

1. create the reader
2. assign the reader to self._reader
2. use set_reader to set the reader
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation suggests using set_reader() to set the reader, but all existing implementations (OpenSlideReader, TiffSlideReader, etc.) directly assign to self._reader. Either update the documentation to reflect actual usage patterns or update all implementations to use the set_reader() method. For consistency, the documentation should say "assign the reader to self._reader" as it was before the refactor.

Suggested change
2. use set_reader to set the reader
2. assign the reader to self._reader

Copilot uses AI. Check for mistakes.

1. Close the reader
2. Set self._reader to None
2. use set_reader to set the reader to None
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation suggests using set_reader() to set the reader to None, but all existing implementations (OpenSlideReader, TiffSlideReader, etc.) directly assign to self._reader. For consistency with the actual implementation pattern, the documentation should say "Set self._reader to None" as it was before the refactor.

Suggested change
2. use set_reader to set the reader to None
2. Set self._reader to None

Copilot uses AI. Check for mistakes.
def _repr_html_(self) -> str:
# HTML representation suitable for Jupyter rich display
rows = [
"<table>\n<thead>\n<tr><th style='text-align:left'>Name</th>"
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implicit string concatenation. Maybe missing a comma?

Suggested change
"<table>\n<thead>\n<tr><th style='text-align:left'>Name</th>"
"<table>\n<thead>\n<tr><th style='text-align:left'>Name</th>",

Copilot uses AI. Check for mistakes.
@Mr-Milk Mr-Milk merged commit 96722c9 into main Nov 27, 2025
9 of 11 checks passed
@Mr-Milk Mr-Milk deleted the fastslide branch November 27, 2025 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Backend] Support Fastslide

1 participant