
Commit 49c7c8f

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
1 parent c06013c commit 49c7c8f


src/posts/xarray-kvikio/index.md

Lines changed: 11 additions & 4 deletions
@@ -26,17 +26,18 @@ For Xarray, the key bit is that kvikio exposes a zarr store [kvikio.zarr.GDSStor
## Integrating with Xarray

Getting all this to work nicely requires three in-progress pull requests:

1. [Teach Zarr to handle alternative array classes](https://github.com/zarr-developers/zarr-python/pull/934)
2. [Rewrite a small bit of Xarray to not cast all data to a numpy array after reading from disk](https://github.com/pydata/xarray/pull/6874)
3. [Make a backend that connects Xarray to Kvikio](https://github.com/xarray-contrib/cupy-xarray/pull/10)

Writing the backend for Xarray was relatively easy. Most of the code was copied over from the existing Zarr backend. Most of the effort was in ensuring that dimension coordinates could be read directly into host memory without raising an error. This is required because Xarray creates `pandas.Index` objects for such variables. In the future, we could consider using `cudf.Index` instead to allow a fully GPU-backed Xarray object.
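
For orientation, here is a minimal sketch of the shape such a backend takes. This is not the cupy-xarray code; the class name and the toy dataset it returns are invented for illustration, and the real backend reuses most of Xarray's Zarr machinery instead:

```python
# Hypothetical, minimal illustration of an Xarray backend entrypoint.
# The class name and the toy dataset it returns are invented; the real
# kvikio backend instead opens a Zarr store via kvikio and returns
# CuPy-backed data variables, keeping dimension coordinates on the host.
import numpy as np
import xarray as xr
from xarray.backends import BackendEntrypoint


class ToyKvikioBackendEntrypoint(BackendEntrypoint):
    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        # Real backend: open filename_or_obj with kvikio.zarr.GDSStore so data
        # variables land in GPU memory, but read dimension coordinates into
        # host memory so Xarray can build pandas.Index objects from them.
        data = np.zeros((4, 3))
        ds = xr.Dataset(
            {"air": (("time", "lat"), data)},
            coords={"time": np.arange(4), "lat": np.arange(3)},
        )
        if drop_variables:
            ds = ds.drop_vars(drop_variables)
        return ds

    def guess_can_open(self, filename_or_obj):
        return str(filename_or_obj).endswith(".zarr")
```

A backend like this is advertised to Xarray through the `xarray.backends` entry-point group in the package metadata, which is what makes `engine="kvikio"` resolvable by `open_dataset`.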

## Usage

Assuming you have all the pieces together (see [Appendix I]() and [Appendix II]() for step-by-step instructions), using all this cool technology only requires adding `engine="kvikio"` to your `open_dataset` line (!)

```python
import xarray as xr

ds = xr.open_dataset("file.zarr", engine="kvikio", consolidated=False)
```
@@ -45,12 +46,14 @@ ds = xr.open_dataset("file.zarr", engine="kvikio", consolidated=False)
With this, `ds.load()` will load directly to GPU memory and `ds` will now contain CuPy arrays.

At present there are a few limitations:

1. stores cannot be read with consolidated metadata, and
2. compression is unsupported by the backend.

## Quick demo

First create an example uncompressed dataset to read from

```
import xarray as xr
@@ -64,16 +67,19 @@ airt.to_zarr(store, mode="w", consolidated=True)
```
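
For reference, a minimal sketch of such a setup is below; the `air_temperature` tutorial dataset and the store path are assumptions made for illustration, and only the final `to_zarr` call is shown in the hunk above.

```python
import xarray as xr

# Assumptions for illustration: the xarray "air_temperature" tutorial dataset
# and a local directory for the Zarr store.
store = "./air-temperature.zarr"
airt = xr.tutorial.open_dataset("air_temperature")

# The kvikio backend does not support compression, so write uncompressed chunks.
for var in airt.variables:
    airt[var].encoding["compressor"] = None

airt.to_zarr(store, mode="w", consolidated=True)
```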

Now read

```
# consolidated must be False
ds = xr.open_dataset(store, engine="kvikio", consolidated=False)
ds
```

Now load a small subset

```python
type(ds["air"].isel(time=0, lat=10).load().data)
```

```
cupy._core.core.ndarray
```
@@ -124,6 +130,7 @@ jupyter lab --no-browser
## Appendix II: making sure GDS is working

Scott Henderson (@scottyhq) pointed out that running `python kvikio/python/benchmarks/single-node-io.py` prints nice diagnostic information that lets you check whether GDS is set up. Note that on our system "compatibility mode" is enabled, so we don't see the benefits now, but this was enough to wire everything up.

```
----------------------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
