Commit 8dc8167

authored
Merge pull request #83 from bcdev/forman-82-slice_source_is_cm
Made writing custom slice sources easier
2 parents 70e739c + c9228ae commit 8dc8167

17 files changed

+297
-59
lines changed

CHANGES.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,15 @@
1+
## Version 0.7.0 (in development)
2+
3+
* Made writing custom slice sources easier: (#82)
4+
5+
- Slice items can now be a `contextlib.AbstractContextManager`
6+
so custom slice functions can now be used with
7+
[@contextlib.contextmanager](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager).
8+
9+
- Introduced `SliceSource.close()` so
10+
[contextlib.closing()](https://docs.python.org/3/library/contextlib.html#contextlib.closing)
11+
is applicable. Deprecated `SliceSource.dispose()`.
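
A minimal sketch of why a `close()` method makes `contextlib.closing()` applicable. `DemoSliceSource` is a hypothetical stand-in written for this illustration, not zappend's `SliceSource`, and the dictionary it returns stands in for an `xarray.Dataset`.

```python
from contextlib import closing


class DemoSliceSource:
    """Hypothetical stand-in for a slice source; not zappend's class."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def get_dataset(self):
        # A real slice source would return an xarray.Dataset here.
        return {"name": self.name}

    def close(self):
        # Cleanup hook that contextlib.closing() calls when the block exits.
        self.closed = True


source = DemoSliceSource("slice-1")
with closing(source) as s:
    dataset = s.get_dataset()

print(dataset, source.closed)
```

`closing()` only requires that the wrapped object has a `close()` method, which is why the rename from `dispose()` matters here.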
12+
113
## Version 0.6.0 (from 2024-03-12)
214

315
### Enhancements

README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -61,8 +61,8 @@ The `zappend` tool provides the following features:
6161
[`zappend`](cli.md) command or from Python. When used from Python using the
6262
[`zappend()`](api.md) function, slice datasets can be passed as local file
6363
paths, URIs, as datasets of type
64-
[xarray.Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html), or as custom
65-
[zappend.api.SliceSource](https://bcdev.github.io/zappend/api/#class-slicesource) objects.
64+
[xarray.Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html), or as custom
65+
[slice sources](https://bcdev.github.io/zappend/guide/#slice-sources).
6666

6767

6868
More about zappend can be found in its

docs/config.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -193,7 +193,7 @@ Options for the filesystem given by the URI of `target_dir`.
193193
## `slice_source`
194194

195195
Type _string_.
196-
The fully qualified name of a class or function that provides a slice source for each slice item. If a class is given, it must be derived from `zappend.api.SliceSource`. If a function is given, it must return an instance of `zappend.api.SliceSource`. Refer to the user guide for more information.
196+
The fully qualified name of a class or function that receives a slice item as argument(s) and provides the slice dataset. If a class is given, it must be derived from `zappend.api.SliceSource`. If a function decorated with `@contextlib.contextmanager` is given, it must yield an `xarray.Dataset`. If a plain function is given, it must return any valid slice item type. Refer to the user guide for more information.
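
As an illustration of what a "fully qualified name" refers to, the sketch below resolves such a string to a Python object with the standard `importlib` module. The helper `resolve_qualified_name` is hypothetical; how zappend itself resolves the `slice_source` string is not shown in this diff.

```python
import importlib


def resolve_qualified_name(name):
    """Import and return the object named by 'package.module.attr'.

    Hypothetical helper for illustration; not part of zappend.
    """
    module_name, _, attr_name = name.rpartition(".")
    if not module_name:
        raise ValueError(f"not a fully qualified name: {name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Resolve a standard-library class by its fully qualified name.
closing_cls = resolve_qualified_name("contextlib.closing")
print(closing_cls)
```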
197197

198198
## `slice_engine`
199199

docs/guide.md

Lines changed: 42 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -694,10 +694,14 @@ at the cost of additional i/o. It therefore defaults to `false`.
694694

695695
If you need some custom cleanup after a slice has been processed and appended to the
696696
target dataset, you can use instances of `zappend.api.SliceSource` as slice items.
697-
A `SliceSource` class requires you to implement two methods:
697+
The `SliceSource` methods with special meaning are:
698698

699-
* `get_dataset()` to return the slice dataset of type `xarray.Dataset`, and
700-
* `dispose()` to perform any resource cleanup tasks.
699+
* `get_dataset()`: a zero-argument method that returns the slice dataset of type
700+
`xarray.Dataset`. You must implement this abstract method.
701+
* `close()`: performs any resource cleanup tasks
702+
(in zappend < v0.7, the `close` method was called `dispose`).
703+
* `__init__()`: optional constructor that receives any arguments passed to the
704+
slice source.
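
The rename from `dispose()` to `close()` can be pictured with the following sketch of a base class that still honours a legacy `dispose()` override while emitting a `DeprecationWarning`. `DemoSliceSource` and `LegacySource` are hypothetical names, and zappend's actual implementation may work differently.

```python
import warnings
from abc import ABC, abstractmethod


class DemoSliceSource(ABC):
    """Hypothetical base class; zappend's own implementation may differ."""

    @abstractmethod
    def get_dataset(self):
        """Return the slice dataset."""

    def close(self):
        # If a subclass still overrides the legacy hook, warn and call it.
        if type(self).dispose is not DemoSliceSource.dispose:
            warnings.warn(
                "dispose() is deprecated, override close() instead",
                DeprecationWarning,
                stacklevel=2,
            )
            self.dispose()

    def dispose(self):
        """Legacy cleanup hook; does nothing by default."""


class LegacySource(DemoSliceSource):
    def __init__(self):
        self.disposed = False

    def get_dataset(self):
        return {}

    def dispose(self):
        self.disposed = True


legacy = LegacySource()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy.close()

print(legacy.disposed, [w.category.__name__ for w in caught])
```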
701705

702706
Here is the template code for your own slice source implementation:
703707

@@ -718,7 +722,7 @@ class MySliceSource(SliceSource):
718722
# You can put any processing here.
719723
return self.ds
720724

721-
def dispose(self):
725+
def close(self):
722726
# Write code here that performs cleanup.
723727
if self.ds is not None:
724728
self.ds.close()
@@ -795,7 +799,7 @@ class MySliceSource(SliceSource):
795799
self.ds = xr.open_dataset(self.slice_path)
796800
return get_mean_slice(self.ds)
797801

798-
def dispose(self):
802+
def close(self):
799803
if self.ds is not None:
800804
self.ds.close()
801805
self.ds = None
@@ -805,6 +809,39 @@ zappend(["slice-1.nc", "slice-2.nc", "slice-3.nc"],
805809
slice_source=MySliceSource)
806810
```
807811

Since zappend 0.7, a slice source can also be written as a Python
[context manager](https://docs.python.org/3/library/contextlib.html),
which allows you to implement the `get_dataset()` and `close()`
methods in a single function instead of a class. Here is the above example
written as a context manager.

```python
from contextlib import contextmanager
import numpy as np
import xarray as xr
from zappend.api import zappend

# Same as above here

@contextmanager
def get_slice_dataset(slice_path):
    # allocate resources here
    ds = xr.open_dataset(slice_path)
    mean_ds = get_mean_slice(ds)
    try:
        # yield (!) the slice dataset
        # so it can be appended
        yield mean_ds
    finally:
        # after slice dataset has been appended
        # release resources here
        ds.close()

zappend(["slice-1.nc", "slice-2.nc", "slice-3.nc"],
        target_dir="target.zarr",
        slice_source=get_slice_dataset)
```

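
To make the role of `try`/`finally` in the context-manager example above concrete, here is a small standard-library sketch. `demo_slice` and the `events` list are hypothetical and stand in for opening and closing a real dataset; the point is only that the cleanup step runs even when the body of the `with` block raises.

```python
from contextlib import contextmanager

events = []


@contextmanager
def demo_slice(name):
    # Hypothetical resource; stands in for xr.open_dataset(...).
    events.append(f"open {name}")
    try:
        yield name.upper()
    finally:
        # Runs whether the with-body finished normally or raised.
        events.append(f"close {name}")


try:
    with demo_slice("slice-1") as ds:
        # Simulate a failure while the slice is being processed.
        raise RuntimeError("append failed")
except RuntimeError:
    pass

print(events)
```

Only code inside the `try` block is protected this way. In the guide snippet, `get_mean_slice(ds)` runs before `try`, so an exception raised there would skip `ds.close()`.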
808845
## Profiling
809846

810847
Runtime profiling is very important for understanding program runtime behavior

docs/index.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -39,8 +39,8 @@ The `zappend` tool provides the following features:
3939
[`zappend`](cli.md) command or from Python. When used from Python using the
4040
[`zappend()`](api.md) function, slice datasets can be passed as local file
4141
paths, URIs, as datasets of type
42-
[xarray.Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html), or as custom
43-
[zappend.api.SliceSource](https://bcdev.github.io/zappend/api/#class-slicesource) objects.
42+
[xarray.Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html), or as custom
43+
[slice sources](guide#slice-sources).
4444

4545
## How It Works
4646

tests/slice/test_cm.py

Lines changed: 115 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@
22
# Permissions are hereby granted under the terms of the MIT License:
33
# https://opensource.org/licenses/MIT.
44

5+
import contextlib
56
import shutil
67
import unittest
78
import warnings
@@ -13,6 +14,7 @@
1314
from zappend.fsutil.fileobj import FileObj
1415
from zappend.slice.cm import SliceSourceContextManager
1516
from zappend.slice.cm import open_slice_dataset
17+
from zappend.slice.source import SliceSource
1618
from zappend.slice.sources.memory import MemorySliceSource
1719
from zappend.slice.sources.persistent import PersistentSliceSource
1820
from zappend.slice.sources.temporary import TemporarySliceSource
@@ -23,17 +25,17 @@
2325
# noinspection PyUnusedLocal
2426

2527

26-
# noinspection PyShadowingBuiltins
28+
# noinspection PyShadowingBuiltins,PyRedeclaration
2729
class OpenSliceDatasetTest(unittest.TestCase):
2830
def setUp(self):
2931
clear_memory_fs()
3032

31-
# noinspection PyMethodMayBeStatic
3233
def test_slice_item_is_slice_source(self):
3334
dataset = make_test_dataset()
3435
ctx = Context(dict(target_dir="memory://target.zarr"))
3536
slice_item = MemorySliceSource(dataset, 0)
3637
slice_cm = open_slice_dataset(ctx, slice_item)
38+
self.assertIsInstance(slice_cm, SliceSourceContextManager)
3739
self.assertIs(slice_item, slice_cm.slice_source)
3840

3941
def test_slice_item_is_dataset(self):
@@ -127,7 +129,6 @@ def test_slice_item_is_uri_with_polling_ok(self):
127129
with slice_cm as slice_ds:
128130
self.assertIsInstance(slice_ds, xr.Dataset)
129131

130-
# noinspection PyMethodMayBeStatic
131132
def test_slice_item_is_uri_with_polling_fail(self):
132133
slice_dir = FileObj("memory://slice.zarr")
133134
ctx = Context(
@@ -140,3 +141,114 @@ def test_slice_item_is_uri_with_polling_fail(self):
140141
with pytest.raises(FileNotFoundError, match=slice_dir.uri):
141142
with slice_cm:
142143
pass
144+
145+
def test_slice_item_is_context_manager(self):
146+
@contextlib.contextmanager
147+
def get_dataset(name):
148+
uri = f"memory://{name}.zarr"
149+
ds = make_test_dataset(uri=uri)
150+
try:
151+
yield ds
152+
finally:
153+
ds.close()
154+
FileObj(uri).delete(recursive=True)
155+
156+
ctx = Context(
157+
dict(
158+
target_dir="memory://target.zarr",
159+
slice_source=get_dataset,
160+
)
161+
)
162+
slice_cm = open_slice_dataset(ctx, "bibo")
163+
self.assertIsInstance(slice_cm, contextlib.AbstractContextManager)
164+
with slice_cm as slice_ds:
165+
self.assertIsInstance(slice_ds, xr.Dataset)
166+
167+
def test_slice_item_is_custom_slice_source(self):
168+
class MySliceSource(SliceSource):
169+
def __init__(self, name):
170+
self.uri = f"memory://{name}.zarr"
171+
self.ds = None
172+
173+
def get_dataset(self):
174+
self.ds = make_test_dataset(uri=self.uri)
175+
return self.ds
176+
177+
def close(self):
178+
if self.ds is not None:
179+
self.ds.close()
180+
FileObj(uri=self.uri).delete(recursive=True)
181+
182+
ctx = Context(
183+
dict(
184+
target_dir="memory://target.zarr",
185+
slice_source=MySliceSource,
186+
)
187+
)
188+
slice_cm = open_slice_dataset(ctx, "bibo")
189+
self.assertIsInstance(slice_cm, SliceSourceContextManager)
190+
self.assertIsInstance(slice_cm.slice_source, SliceSource)
191+
with slice_cm as slice_ds:
192+
self.assertIsInstance(slice_ds, xr.Dataset)
193+
194+
def test_slice_item_is_deprecated_slice_source(self):
195+
class MySliceSource(SliceSource):
196+
def __init__(self, name):
197+
self.uri = f"memory://{name}.zarr"
198+
self.ds = None
199+
200+
def get_dataset(self):
201+
self.ds = make_test_dataset(uri=self.uri)
202+
return self.ds
203+
204+
def dispose(self):
205+
if self.ds is not None:
206+
self.ds.close()
207+
FileObj(uri=self.uri).delete(recursive=True)
208+
209+
ctx = Context(
210+
dict(
211+
target_dir="memory://target.zarr",
212+
slice_source=MySliceSource,
213+
)
214+
)
215+
slice_cm = open_slice_dataset(ctx, "bibo")
216+
self.assertIsInstance(slice_cm, SliceSourceContextManager)
217+
self.assertIsInstance(slice_cm.slice_source, SliceSource)
218+
with pytest.warns(expected_warning=DeprecationWarning):
219+
with slice_cm as slice_ds:
220+
self.assertIsInstance(slice_ds, xr.Dataset)
221+
222+
223+
class IsContextManagerTest(unittest.TestCase):
224+
"""Assert that context managers are identified by isinstance()"""
225+
226+
def test_context_manager_class(self):
227+
@contextlib.contextmanager
228+
def my_slice_source(data):
229+
ds = xr.Dataset(data)
230+
try:
231+
yield ds
232+
finally:
233+
ds.close()
234+
235+
item = my_slice_source([1, 2, 3])
236+
self.assertTrue(isinstance(item, contextlib.AbstractContextManager))
237+
self.assertFalse(isinstance(my_slice_source, contextlib.AbstractContextManager))
238+
239+
def test_context_manager_protocol(self):
240+
class MySliceSource:
241+
def __enter__(self):
242+
return xr.Dataset()
243+
244+
def __exit__(self, *exc):
245+
pass
246+
247+
item = MySliceSource()
248+
self.assertTrue(isinstance(item, contextlib.AbstractContextManager))
249+
self.assertFalse(isinstance(MySliceSource, contextlib.AbstractContextManager))
250+
251+
def test_dataset(self):
252+
item = xr.Dataset()
253+
self.assertTrue(isinstance(item, contextlib.AbstractContextManager))
254+
self.assertFalse(isinstance(xr.Dataset, contextlib.AbstractContextManager))

tests/slice/test_source.py

Lines changed: 28 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -2,24 +2,19 @@
22
# Permissions are hereby granted under the terms of the MIT License:
33
# https://opensource.org/licenses/MIT.
44

5-
import shutil
65
import unittest
7-
import warnings
86

97
import pytest
108
import xarray as xr
119

1210
from zappend.context import Context
1311
from zappend.fsutil.fileobj import FileObj
14-
from zappend.slice.cm import SliceSourceContextManager
15-
from zappend.slice.cm import open_slice_dataset
16-
from zappend.slice.source import to_slice_source, SliceSource
12+
from zappend.slice.source import SliceSource
13+
from zappend.slice.source import to_slice_source
1714
from zappend.slice.sources.memory import MemorySliceSource
1815
from zappend.slice.sources.persistent import PersistentSliceSource
1916
from zappend.slice.sources.temporary import TemporarySliceSource
2017
from tests.helpers import clear_memory_fs
21-
from tests.helpers import make_test_dataset
22-
from tests.config.test_config import CustomSliceSource
2318

2419

2520
# noinspection PyUnusedLocal
@@ -86,22 +81,43 @@ def my_slice_source(arg1, arg2=None, ctx=None):
8681
return xr.Dataset(attrs=dict(arg1=arg1, arg2=arg2, ctx=ctx))
8782

8883
ctx = make_ctx(slice_source=my_slice_source)
89-
arg = xr.Dataset()
9084
slice_source = to_slice_source(ctx, ([13], {"arg2": True}), 0)
9185
self.assertIsInstance(slice_source, MemorySliceSource)
9286
ds = slice_source.get_dataset()
9387
self.assertEqual(13, ds.attrs.get("arg1"))
9488
self.assertEqual(True, ds.attrs.get("arg2"))
9589
self.assertIs(ctx, ds.attrs.get("ctx"))
9690

91+
def test_slice_item_is_slice_source_context_manager(self):
92+
import contextlib
93+
94+
@contextlib.contextmanager
95+
def my_slice_source(ctx, arg1, arg2=None):
96+
_ds = xr.Dataset(attrs=dict(arg1=arg1, arg2=arg2, ctx=ctx))
97+
try:
98+
yield _ds
99+
finally:
100+
_ds.close()
101+
102+
ctx = make_ctx(slice_source=my_slice_source)
103+
slice_source = to_slice_source(ctx, ([14], {"arg2": "OK"}), 0)
104+
self.assertIsInstance(slice_source, contextlib.AbstractContextManager)
105+
with slice_source as ds:
106+
self.assertIsInstance(ds, xr.Dataset)
107+
self.assertEqual(14, ds.attrs.get("arg1"))
108+
self.assertEqual("OK", ds.attrs.get("arg2"))
109+
self.assertIs(ctx, ds.attrs.get("ctx"))
110+
97111
# noinspection PyMethodMayBeStatic
98112
def test_raises_if_slice_item_is_int(self):
99113
ctx = make_ctx(persist_mem_slices=True)
100114
with pytest.raises(
101115
TypeError,
102116
match=(
103117
"slice_item must have type str, xarray.Dataset,"
104-
" zappend.api.FileObj, zappend.api.SliceSource, but was type int"
118+
" contextlib.AbstractContextManager,"
119+
" zappend.api.FileObj, zappend.api.SliceSource,"
120+
" but was type int"
105121
),
106122
):
107123
to_slice_source(ctx, 42, 0)
@@ -116,7 +132,9 @@ def hallo():
116132
TypeError,
117133
match=(
118134
"slice_item must have type str, xarray.Dataset,"
119-
" zappend.api.FileObj, zappend.api.SliceSource, but was type function"
135+
" contextlib.AbstractContextManager,"
136+
" zappend.api.FileObj, zappend.api.SliceSource,"
137+
" but was type function"
120138
),
121139
):
122140
to_slice_source(ctx, hallo, 0)
