Commit e91fc9a

Merge pull request #107 from sebi06/work_in_progress
Add read_scenes function, URL support, and improved lazy loading
2 parents c599a4b + 4a2441f commit e91fc9a

11 files changed

+1476
-151
lines changed

.github/copilot-instructions.md

Lines changed: 273 additions & 0 deletions
@@ -0,0 +1,273 @@
# Copilot Instructions for czitools

This document provides guidelines for GitHub Copilot when working with the czitools repository.

## Project Overview

**czitools** is a Python package for reading pixel data and metadata from CZI (Carl Zeiss Image) files. It simplifies working with CZI microscopy images by providing tools for metadata extraction and pixel data reading.

### Key Dependencies
- `pylibCZIrw` - Core library for reading/writing CZI files
- `aicspylibczi` - Additional CZI functionality
- `numpy` - Array operations
- `dask` - Lazy/delayed array operations
- `xarray` - Labeled multi-dimensional arrays
- `pandas` - Data manipulation (planetables)
- `python-box` - Dictionary access via attributes
- `pydantic` - Data validation
- `loguru` / `colorlog` - Logging

### Supported Python Versions
- Python 3.10, 3.11, 3.12, 3.13

### Supported Operating Systems
- Windows
- Linux
- macOS (with manual pylibCZIrw wheel installation)

## Project Structure

```
src/czitools/
├── metadata_tools/          # Classes for extracting CZI metadata
│   ├── czi_metadata.py      # Main CziMetadata class
│   ├── dimension.py         # CziDimensions
│   ├── scaling.py           # CziScaling
│   ├── channel.py           # CziChannelInfo
│   ├── boundingbox.py       # CziBoundingBox
│   ├── objective.py         # CziObjectives
│   ├── detector.py          # CziDetector
│   ├── microscope.py        # CziMicroscope
│   ├── sample.py            # CziSampleInfo
│   └── add_metadata.py      # CziAddMetaData
├── read_tools/              # Functions for reading pixel data
│   └── read_tools.py        # read_6darray, read_mdarray, etc.
├── utils/                   # Utility modules
│   ├── logging_tools.py     # Logging configuration
│   ├── box.py               # Box utilities for metadata
│   ├── misc.py              # Miscellaneous helpers
│   ├── pixels.py            # Pixel type utilities
│   └── planetable.py        # Planetable generation
├── visu_tools/              # Visualization utilities
└── _tests/                  # Test suite
```

## Coding Conventions

### Python Style
- Use Python 3.10+ syntax and type hints
- Follow PEP 8 style guidelines
- Use the `@dataclass` decorator for metadata classes
- Use `field(init=False, default=None)` for computed fields in dataclasses
- Prefer `Optional[Type]` for nullable types
- Use `Union[str, os.PathLike[str]]` for file paths

### Type Annotations
```python
import os
from typing import List, Dict, Tuple, Optional, Any, Union
from dataclasses import dataclass, field

@dataclass
class ExampleMetadata:
    filepath: Union[str, os.PathLike[str]]
    # Computed fields are excluded from __init__ and filled in later
    value: Optional[float] = field(init=False, default=None)
    items: Optional[List[str]] = field(init=False, default_factory=list)
```

### Imports Organization
1. Standard library imports
2. Third-party imports (numpy, pandas, etc.)
3. Local imports from czitools

```python
# Standard library
from typing import Dict, Tuple, Optional, Union
import os
from pathlib import Path
from dataclasses import dataclass, field

# Third-party
import numpy as np
from box import Box
from pylibCZIrw import czi as pyczi

# Local
from czitools.utils import logging_tools
from czitools.metadata_tools.helper import ValueRange
```

### Logging
- Use the custom logging setup from `czitools.utils.logging_tools`
- Initialize the logger at module level: `logger = logging_tools.set_logging()`
- Use `logger.info()`, `logger.warning()`, `logger.error()` for messages
- Use the `verbose` parameter in classes to control logging output

```python
from czitools.utils import logging_tools
logger = logging_tools.set_logging()

# Inside a method of a class that has a `verbose` attribute
if self.verbose:
    logger.info("Processing completed successfully")
```

### File Path Handling
- Accept both `str` and `os.PathLike[str]` (Path objects)
- Convert a Path to a string when needed: `str(filepath)`
- Use `pathlib.Path` for path manipulations
- Support URL paths by checking the input with `validators.url()`

```python
from pathlib import Path

# Inside a method of a class that stores the path in `self.filepath`
if isinstance(self.filepath, Path):
    self.filepath = str(self.filepath)
```
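
The path rules above could be combined in one small helper. The following is a minimal sketch and not czitools code: `normalize_source` and `looks_like_url` are hypothetical names, and `urllib.parse` is used here as a rough stand-in for the `validators.url()` check so that the example runs with the standard library alone.

```python
import os
from pathlib import Path
from typing import Union
from urllib.parse import urlparse


def looks_like_url(source: str) -> bool:
    """Rough stand-in for validators.url(): require an http(s) scheme and a host."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def normalize_source(source: Union[str, os.PathLike]) -> str:
    """Return the source as a plain string, leaving URLs untouched."""
    text = os.fspath(source) if isinstance(source, os.PathLike) else source
    if looks_like_url(text):
        return text
    # Local paths: let pathlib normalize the separators, then return a string
    return str(Path(text))


print(normalize_source(Path("data") / "example.czi"))
print(normalize_source("https://example.com/example.czi"))
```

The URL check runs before the `Path` conversion on purpose: passing a URL through `pathlib.Path` would collapse the `//` after the scheme.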
125+
126+
### Error Handling
127+
- Use defensive programming with fallback values
128+
- Guard against None values and division by zero
129+
- Use `try/except` blocks for external library calls
130+
- Return None or sensible defaults instead of raising exceptions when appropriate
131+
132+
```python
133+
# Safe value extraction with fallback
134+
try:
135+
value = float(data.Value) * 1000000
136+
if value == 0.0:
137+
value = 1.0 # fallback
138+
except (AttributeError, TypeError):
139+
value = None
140+
```
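
The same pattern can be wrapped in a reusable function. This is a minimal sketch under stated assumptions: `to_microns` is a hypothetical helper that does not exist in czitools, and the choice of `1.0` as the zero fallback simply mirrors the snippet above.

```python
from typing import Any, Optional


def to_microns(raw: Any, fallback: float = 1.0) -> Optional[float]:
    """Convert a value in meters to microns without raising.

    Returns None when the input cannot be interpreted as a number and
    `fallback` when the converted value is exactly zero.
    """
    try:
        value = float(raw) * 1_000_000
    except (TypeError, ValueError):
        return None
    return fallback if value == 0.0 else value


print(to_microns("2.5e-07"))  # a string, as it might appear in XML metadata
print(to_microns(None))       # unusable input yields None
print(to_microns(0))          # zero is replaced by the fallback
```

Callers then only need to handle `None`, which keeps the calling code free of `try/except` blocks.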
141+
142+
### Docstrings
143+
- Use Google-style docstrings
144+
- Include Args, Returns, and Raises sections
145+
- Document class attributes in class docstring
146+
147+
```python
148+
def read_6darray(
149+
filepath: Union[str, os.PathLike[str]],
150+
use_dask: Optional[bool] = False,
151+
zoom: Optional[float] = 1.0,
152+
) -> Tuple[Optional[np.ndarray], CziMetadata]:
153+
"""Read a CZI image file as 6D array.
154+
155+
Args:
156+
filepath: Path to the CZI image file.
157+
use_dask: Option to use dask for delayed reading.
158+
zoom: Downscale factor [0.01 - 1.0].
159+
160+
Returns:
161+
Tuple of (array6d, metadata) where array6d may be None on error.
162+
"""
163+
```
164+
165+
## Testing Guidelines
166+
167+
### Test Location
168+
- Tests are in `src/czitools/_tests/`
169+
- Test files follow pattern: `test_*.py`
170+
- Use pytest as the test framework
171+
172+
### Test Structure
173+
```python
174+
from czitools.metadata_tools import czi_metadata as czimd
175+
from pathlib import Path
176+
import pytest
177+
from typing import List, Any
178+
179+
basedir = Path(__file__).resolve().parents[3]
180+
181+
@pytest.mark.parametrize(
182+
"czifile, expected_value",
183+
[
184+
("CellDivision_T3_Z5_CH2_X240_Y170.czi", [None, 3, 5, 2, 170, 240])
185+
]
186+
)
187+
def test_example(czifile: str, expected_value: List[Any]) -> None:
188+
filepath = basedir / "data" / czifile
189+
# Test implementation
190+
assert result == expected_value
191+
```
192+
193+
### Test Data
194+
- Test CZI files are in `data/` directory
195+
- Use parametrized tests for multiple test cases
196+
- Reference test files relative to `basedir`
197+
198+
### Running Tests
199+
```bash
200+
pytest src/czitools/_tests/
201+
pytest -m "not network" # Skip network tests
202+
```

## Common Patterns

### Reading Metadata
```python
from czitools.metadata_tools.czi_metadata import CziMetadata
from czitools.metadata_tools.scaling import CziScaling
from czitools.metadata_tools.dimension import CziDimensions

# Get all metadata at once
mdata = CziMetadata(filepath)

# Or get specific metadata
scaling = CziScaling(filepath)
dimensions = CziDimensions(filepath)
```

### Reading Pixel Data
```python
from czitools.read_tools import read_tools

# Read as 6D array (STCZYX order)
array6d, mdata = read_tools.read_6darray(
    filepath,
    use_dask=True,    # For large files
    use_xarray=True,  # For labeled dimensions
    zoom=0.5          # Downscale
)
```

### Using Box for Metadata
```python
from czitools.utils.box import get_czimd_box

# Get metadata as Box object for attribute-style access
czi_box = get_czimd_box(filepath)
scaling = czi_box.ImageDocument.Metadata.Scaling.Items.Distance
```

## Array Dimension Order

CZI arrays use the dimension order: **STCZYX(A)**
- S = Scene
- T = Time
- C = Channel
- Z = Z-slice
- Y = Y dimension
- X = X dimension
- A = Alpha/RGB component (optional)
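
Because the order is fixed, each dimension letter corresponds to one axis index. The sketch below illustrates that mapping in plain Python; `DIM_ORDER` and `axis_of` are hypothetical names, not part of czitools, and the shape is an illustrative value that assumes a single scene.

```python
DIM_ORDER = "STCZYX"  # "A" is appended for RGB images


def axis_of(dim: str, is_rgb: bool = False) -> int:
    """Return the axis index of a dimension letter in STCZYX(A) order."""
    order = DIM_ORDER + "A" if is_rgb else DIM_ORDER
    return order.index(dim)


# Illustrative shape: 1 scene, 3 timepoints, 2 channels, 5 Z-slices, 170 x 240 pixels
shape = (1, 3, 2, 5, 170, 240)

print(axis_of("Z"))                 # position of the Z axis
print(shape[axis_of("Z")])          # number of Z-slices in the example shape
print(dict(zip(DIM_ORDER, shape)))  # size per dimension letter
```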
252+
253+
## Additional Notes
254+
255+
### Metadata Classes Pattern
256+
All metadata classes follow a similar pattern:
257+
1. Accept `czisource` as filepath, Path, or Box object
258+
2. Use `@dataclass` with `field(init=False)` for computed attributes
259+
3. Implement `__post_init__` for initialization logic
260+
4. Support `verbose` parameter for logging control
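
The four points above can be sketched as follows. This is an illustration only: `ExampleInfo` and its attributes are hypothetical and not part of czitools, Box input is left out to keep the example free of third-party imports, and `print` stands in for the project logger.

```python
import os
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class ExampleInfo:
    """Hypothetical metadata class; not part of czitools."""

    czisource: Union[str, os.PathLike]
    verbose: bool = False
    # Computed attributes: excluded from __init__, filled in __post_init__
    filename: Optional[str] = field(init=False, default=None)
    is_czi: Optional[bool] = field(init=False, default=None)

    def __post_init__(self) -> None:
        # Normalize Path-like input to a string first
        source = os.fspath(self.czisource)
        self.filename = os.path.basename(source)
        self.is_czi = source.lower().endswith(".czi")
        if self.verbose:
            print(f"Initialized metadata for {self.filename}")


info = ExampleInfo("data/example.czi")
print(info.filename, info.is_czi)
```

Keeping the computed attributes out of `__init__` means callers only pass the source and the `verbose` flag, while everything derived from the file is set in one place.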
261+
262+
### Scaling Units
263+
- Internal scaling values are in **microns**
264+
- Conversion from CZI values: `value * 1000000` (meters to microns)
265+
266+
### RGB Support
267+
- Check `isRGB` dictionary for RGB status per channel
268+
- RGB images have an additional 'A' dimension
269+
270+
### Scene Handling
271+
- CZI files may have multiple scenes
272+
- Check `has_scenes` and `SizeS` for scene information
273+
- Use `bbox.total_bounding_box` for combined bounds

demo/scripts/read_lazy_demo.py

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
"""Demo: using read_6darray_lazy() from czitools.read_tools.

This script demonstrates the read_6darray_lazy() function which reads a CZI file
as a 6D dask array with delayed plane reading. The actual pixel data is only
loaded when it is computed (e.g., via .compute()); indexing alone stays lazy.

Features:
- Lazy loading via dask arrays (no data read until needed)
- Optional xarray DataArray output with labeled dimensions
- Optional Z-stack chunking for efficient processing
- Substack selection via planes parameter

Dimension order is always: STCZYX (or STCZYXA for RGB images)
"""

from czitools.read_tools import read_tools

# Test file - same as read_mdstack.py
filepath = r"F:\Testdata_Zeiss\CZI_Testfiles\WP96_4Pos_B4-10_DAPI.czi"


if __name__ == "__main__":
    print("=" * 60)
    print("Demo: read_6darray_lazy()")
    print("=" * 60)

    # Basic lazy loading - returns dask array
    print("\n1. Basic lazy loading (dask array):")
    array6d, mdata = read_tools.read_6darray_lazy(
        filepath,
        use_xarray=False,  # Return plain dask array
    )

    print(f" Array type: {type(array6d)}")
    print(f" Shape: {array6d.shape}")
    print(f" Dtype: {array6d.dtype}")
    print(f" Chunks: {array6d.chunks}")
    print(" No data loaded yet - array is lazy!")

    # With xarray for labeled dimensions
    print("\n2. Lazy loading with xarray:")
    array6d_xr, mdata = read_tools.read_6darray_lazy(
        filepath,
        use_xarray=True,  # Return xr.DataArray with labeled dims
    )

    print(f" Array type: {type(array6d_xr)}")
    print(f" Dimensions: {array6d_xr.dims}")
    print(f" Shape: {array6d_xr.shape}")
    print(f" Coordinates: {list(array6d_xr.coords.keys())}")

    # With Z-stack chunking (useful for processing)
    print("\n3. Lazy loading with Z-stack chunking:")
    array6d_chunked, mdata = read_tools.read_6darray_lazy(
        filepath,
        chunk_zyx=True,  # Chunk so each Z-stack is one chunk
        use_xarray=True,
    )

    print(f" Chunks: {array6d_chunked.chunks}")

    # Reading a substack (only specific planes)
    print("\n4. Reading a substack (first 2 scenes only):")
    array6d_sub, mdata = read_tools.read_6darray_lazy(
        filepath,
        planes={"S": (0, 1)},  # Only scenes 0 and 1
        use_xarray=True,
    )

    print(f" Shape: {array6d_sub.shape}")
    print(f" Dimensions: {array6d_sub.dims}")

    # Actually load some data
    print("\n5. Loading a subset of data:")
    # Select first scene, first timepoint, first channel
    subset = array6d_xr.isel(S=0, T=0, C=0)
    print(f" Subset shape (before compute): {subset.shape}")

    # This triggers the actual read
    subset_loaded = subset.compute()
    print(f" Subset shape (after compute): {subset_loaded.shape}")
    print(f" Data loaded! Min={subset_loaded.values.min()}, Max={subset_loaded.values.max()}")

    # Show metadata info
    print("\n6. Metadata from CZI:")
    print(f" Filepath: {mdata.filepath}")
    print(f" Pixel type: {mdata.npdtype_list}")
    print(
        f" Dimensions: S={mdata.image.SizeS}, T={mdata.image.SizeT}, "
        f"C={mdata.image.SizeC}, Z={mdata.image.SizeZ}"
    )
    print(f" Scaling (µm): X={mdata.scale.X}, Y={mdata.scale.Y}, Z={mdata.scale.Z}")

    print("\n" + "=" * 60)
    print("Demo complete!")
    print("=" * 60)
