Commit b8d357f

edits
1 parent c06013c commit b8d357f

1 file changed

+48
-11
lines changed

src/posts/xarray-kvikio/index.md

Lines changed: 48 additions & 11 deletions
@@ -15,42 +15,52 @@ We [demonstrate](https://github.com/xarray-contrib/cupy-xarray/pull/10) register
1515

1616
### What is GPU Direct Storage
1717

18+
Quoting [this NVIDIA blog post](https://developer.nvidia.com/blog/gpudirect-storage/):
19+
20+
> I/O, the process of loading data from storage to GPUs for processing, has historically been controlled by the CPU. As computation shifts from slower CPUs to faster GPUs, I/O becomes more of a bottleneck to overall application performance.
21+
> Just as GPUDirect RDMA (Remote Direct Memory Address) improved bandwidth and latency when moving data directly between a network interface card (NIC) and GPU memory, a new technology called GPUDirect Storage enables a direct data path between local or remote storage, like NVMe or NVMe over Fabric (NVMe-oF), and GPU memory.
22+
> Both GPUDirect RDMA and GPUDirect Storage avoid extra copies through a bounce buffer in the CPU’s memory and enable a direct memory access (DMA) engine near the NIC or storage to move data on a direct path into or out of GPU memory, all without burdening the CPU or GPU
23+
> For GPUDirect Storage, storage location doesn’t matter; it could be inside an enclosure, within the rack, or connected over the network.
24+
1825
Insert https://developer.nvidia.com/blog/wp-content/uploads/2019/08/GPUDirect-Fig-1-New.png somehow
1926

2027
### What is Kvikio
2128

2229
> kvikIO is a Python library providing bindings to cuFile, which enables GPUDirectStorage (GDS).
2330
24-
For Xarray, the key bit is that kvikio exposes a zarr store [kvikio.zarr.GDSStore](https://docs.rapids.ai/api/kvikio/stable/api.html#zarr) that does all the hard work for us. Since Xarray knows how to read Zarr stores, we can adapt that in a new storage backend. And thanks to recent work funded by the Chan Zuckerberg Initiative, adding a [new backend](https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html) is quite easy!
31+
For Xarray, the key bit is that kvikio exposes a zarr store [`kvikio.zarr.GDSStore`](https://docs.rapids.ai/api/kvikio/stable/api.html#zarr) that does all the hard work for us. Since Xarray knows how to read Zarr stores, we can adapt that to create a new storage backend that uses `kvikio`. And thanks to recent work funded by the Chan Zuckerberg Initiative, adding a [new backend](https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html) is quite easy!
2532

2633
## Integrating with Xarray
2734

2835
Getting all this to work nicely requires three in-progress pull requests that:
36+
2937
1. [Teach Zarr to handle alternative array classes](https://github.com/zarr-developers/zarr-python/pull/934)
3038
2. [Rewrite a small bit of Xarray to not cast all data to a numpy array after read from disk](https://github.com/pydata/xarray/pull/6874)
3139
3. [Make a backend that connects Xarray to Kvikio](https://github.com/xarray-contrib/cupy-xarray/pull/10)
3240

3341
Writing the backend for Xarray was relatively easy. Most of the code was copied over from the existing Zarr backend. Most of the effort went into ensuring that dimension coordinates could be read directly into host memory without raising an error. This is required because Xarray creates `pandas.Index` objects for such variables. In the future, we could consider using `cudf.Index` instead to allow a fully GPU-backed Xarray object.
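To make the backend mechanism concrete, here is a minimal sketch of an Xarray backend entrypoint. `ToyBackend` and its in-memory data are hypothetical and unrelated to the actual Kvikio backend in the pull request above; only `BackendEntrypoint` and its `open_dataset` method come from Xarray's backend API. A real backend would read from the store instead of fabricating an array.

```python
import numpy as np
import xarray as xr
from xarray.backends import BackendEntrypoint


class ToyBackend(BackendEntrypoint):
    """Hypothetical backend that ignores the path and returns in-memory data."""

    # Tells Xarray which keyword arguments this backend's open_dataset accepts.
    open_dataset_parameters = ("filename_or_obj", "drop_variables")

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        # A real backend would open `filename_or_obj` here; this sketch
        # builds a small array instead and does not honour drop_variables.
        data = np.arange(6, dtype="float32").reshape(2, 3)
        return xr.Dataset(
            {"air": (("lat", "lon"), data)},
            coords={"lat": [10.0, 20.0], "lon": [0.0, 1.0, 2.0]},
        )


# Passing the class as `engine` avoids registering an entrypoint for this demo.
ds = xr.open_dataset("ignored-path", engine=ToyBackend)
print(ds["air"].shape)
```

The dimension coordinates `lat` and `lon` here are plain host-memory values, which mirrors the constraint described above: Xarray builds `pandas.Index` objects from them.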
3442

35-
## Usage
43+
## Usage
3644

3745
Assuming you have all the pieces together (see [Appendix I]() and [Appendix II]() for step-by-step instructions), then using all this cool technology only requires adding `engine="kvikio"` to your `open_dataset` line (!)
3846

39-
``` python
47+
```python
4048
import xarray as xr
4149

4250
ds = xr.open_dataset("file.zarr", engine="kvikio", consolidated=False)
4351
```
4452

45-
With this `ds.load()` will load directly to GPU memory and `ds` will now contain CuPy arrays.
53+
Notice that importing `cupy_xarray` was not needed. `cupy_xarray` uses entrypoints to register the Kvikio backend with Xarray.
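As a rough way to see this registration mechanism in action, the sketch below lists the backend names advertised under the `xarray.backends` entrypoint group using only the standard library. The helper name `xarray_backend_names` is made up for illustration, and the result depends entirely on which packages are installed in your environment.

```python
from importlib.metadata import entry_points


def xarray_backend_names():
    """Return the sorted names of entrypoints in the 'xarray.backends' group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable collection of entrypoints.
        selected = eps.select(group="xarray.backends")
    else:
        # Older Pythons return a plain dict keyed by group name.
        selected = eps.get("xarray.backends", [])
    return sorted({ep.name for ep in selected})


print(xarray_backend_names())
```

If the Kvikio backend is installed and registered as described, one would expect `kvikio` to appear in this list, which is why `engine="kvikio"` can be resolved without an explicit import.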
54+
55+
With this, `ds.load()` will load data directly to GPU memory and `ds` will then contain CuPy arrays. At present there are a few limitations:
4656

47-
At present there are a few limitations:
48-
1. stores cannot be read with consolidated metadata, and
57+
1. stores cannot be read with consolidated metadata, and
4958
2. compression is unsupported by the backend.
5059

5160
## Quick demo
5261

5362
First create an example uncompressed dataset to read from
63+
5464
```
5565
import xarray as xr
5666
@@ -64,23 +74,49 @@ airt.to_zarr(store, mode="w", consolidated=True)
6474
```
6575

6676
Now read
77+
6778
```
6879
# consolidated must be False
6980
ds = xr.open_dataset(store, engine="kvikio", consolidated=False)
70-
ds
81+
ds.air
7182
```
7283

84+
```
85+
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)>
86+
[3869000 values with dtype=float32]
87+
Coordinates:
88+
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
89+
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
90+
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
91+
Attributes:
92+
GRIB_id: 11
93+
GRIB_name: TMP
94+
actual_range: [185.16000366210938, 322.1000061035156]
95+
dataset: NMC Reanalysis
96+
level_desc: Surface
97+
long_name: 4xDaily Air temperature at sigma level 995
98+
parent_stat: Other
99+
precision: 2
100+
statistic: Individual Obs
101+
units: degK
102+
var_desc: Air temperature
103+
```
104+
105+
Note that we get Xarray's lazy backend arrays by default, and that dimension coordinate variables `lat`, `lon`, `time` were read. At this point this looks identical to what we get with a standard `xr.open_dataset(store, engine="zarr")` command.
106+
73107
Now load a small subset
74-
``` python
108+
109+
```python
75110
type(ds["air"].isel(time=0, lat=10).load().data)
76111
```
112+
77113
```
78114
cupy._core.core.ndarray
79115
```
80116

81117
Success!
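The type check above can be wrapped in a small helper if you want to assert it in a script or test. This is only a sketch: `is_cupy_array` is a hypothetical name, and it inspects the module path of the array's type (as in the `cupy._core.core.ndarray` output above) rather than importing CuPy, so it also runs on machines without a GPU.

```python
def is_cupy_array(data):
    """Return True if `data` is an instance of a type defined inside the cupy package."""
    module = type(data).__module__ or ""
    # "cupy._core.core" -> "cupy"; NumPy arrays give "numpy", lists give "builtins".
    return module.split(".")[0] == "cupy"


print(is_cupy_array([1, 2, 3]))
```

With the dataset from this demo, `is_cupy_array(ds["air"].isel(time=0, lat=10).load().data)` would be expected to return `True` when the Kvikio backend is in use.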
82118

83-
Xarray integrates [decently well](https://cupy-xarray.readthedocs.io/quickstart.html) with CuPy arrays so you should be able to test out analysis pipelines pretty seamlessly.
119+
Xarray integrates [decently well](https://cupy-xarray.readthedocs.io/quickstart.html) with CuPy arrays so you should be able to test out analysis pipelines pretty easily.
84120

85121
## Cool demo
86122

@@ -94,7 +130,7 @@ We demonstrate integrating the Kvikio library using Xarray's new backend entrypo
94130

95131
## Appendix I : Step-by-step install instructions
96132

97-
Wei Ji Leong (@weiji14) helpfully [provided steps](https://discourse.pangeo.io/t/favorite-way-to-go-from-netcdf-xarray-to-torch-tf-jax-et-al/2663/2) to get started on your machine:
133+
[Wei Ji Leong](https://github.com/weiji14) helpfully [provided steps](https://discourse.pangeo.io/t/favorite-way-to-go-from-netcdf-xarray-to-torch-tf-jax-et-al/2663/2) to get started on your machine:
98134

99135
```
100136
# May need to install nvidia-gds first
@@ -123,7 +159,8 @@ jupyter lab --no-browser
123159

124160
## Appendix II : making sure GDS is working
125161

126-
Scott Henderson (@scottyhq) pointed out that running `python kvikio/python/benchmarks/single-node-io.py` prints nice diagnostic information that lets you check whether GDS is set up. Note that on our system, we have "compatibility mode" enabled. So we don't see the benefits now but this was enough to wire everything up.
162+
[Scott Henderson](https://github.com/scottyhq) pointed out that running `python kvikio/python/benchmarks/single-node-io.py` prints nice diagnostic information that lets you check whether GDS is set up. Note that "compatibility mode" is enabled on our system, so we don't see the performance benefits yet, but this was enough to wire everything up.
163+
127164
```
128165
----------------------------------
129166
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
