|
| 1 | +--- |
| 2 | +title: "Enabling GPU-native analytics with Xarray and Kvikio" |
| 3 | +date: "2022-06-09" |
| 4 | +authors: |
| 5 | + - name: Deepak Cherian |
| 6 | + github: dcherian |
| 7 | +summary: "An experiment with direct-to-GPU reads from a Zarr store using Xarray." |
| 8 | +--- |
| 9 | + |
| 10 | +## TLDR |
| 11 | + |
| 12 | +We [demonstrate](https://github.com/xarray-contrib/cupy-xarray/pull/10) registering an Xarray backend that reads data from a Zarr store directly into GPU memory as [CuPy arrays](https://cupy.dev) using the new [Kvikio library](https://docs.rapids.ai/api/kvikio/stable/) and [GPUDirect Storage](https://developer.nvidia.com/blog/gpudirect-storage/) technology. |
| 13 | + |
| 14 | +## Background |
| 15 | + |
| 16 | +### What is GPUDirect Storage |
| 17 | + |
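| 18 | +[GPUDirect Storage](https://developer.nvidia.com/blog/gpudirect-storage/) (GDS) provides a direct data path between storage and GPU memory, avoiding the extra copy through a bounce buffer in CPU memory. ![Figure from NVIDIA's GPUDirect Storage blog post](https://developer.nvidia.com/blog/wp-content/uploads/2019/08/GPUDirect-Fig-1-New.png) |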
| 19 | + |
| 20 | +### What is Kvikio |
| 21 | + |
| 22 | +> kvikIO is a Python library providing bindings to cuFile, which enables GPUDirectStorage (GDS). |
| 23 | + |
| 24 | +For Xarray, the key bit is that kvikio exposes a zarr store [kvikio.zarr.GDSStore](https://docs.rapids.ai/api/kvikio/stable/api.html#zarr) that does all the hard work for us. Since Xarray knows how to read Zarr stores, we can adapt that in a new storage backend. And thanks to recent work funded by the Chan Zuckerberg Initiative, adding a [new backend](https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html) is quite easy! |
| 25 | + |
| 26 | +## Integrating with Xarray |
| 27 | + |
| 28 | +Getting all this to work nicely requires three in-progress pull requests that: |
| 29 | +1. [Teach Zarr to handle alternative array classes](https://github.com/zarr-developers/zarr-python/pull/934) |
| 30 | +2. [Rewrite a small bit of Xarray to not cast all data to a numpy array after reading it from disk](https://github.com/pydata/xarray/pull/6874) |
| 31 | +3. [Make a backend that connects Xarray to Kvikio](https://github.com/xarray-contrib/cupy-xarray/pull/10) |
| 32 | + |
| 33 | +Writing the backend for Xarray was relatively easy. Most of the code was copied over from the existing Zarr backend. Most of the effort went into ensuring that dimension coordinates could be read directly into host memory without raising an error. This is required because Xarray creates `pandas.Index` objects for such variables. In the future, we could consider using `cudf.Index` instead to allow a fully GPU-backed Xarray object. |
| 34 | + |
| 35 | +## Usage |
| 36 | + |
| 37 | +Assuming you have all the pieces together (see Appendix I and Appendix II for step-by-step instructions), using all this cool technology only requires adding `engine="kvikio"` to your `open_dataset` line (!): |
| 38 | + |
| 39 | +``` python |
| 40 | +import xarray as xr |
| 41 | + |
| 42 | +ds = xr.open_dataset("file.zarr", engine="kvikio", consolidated=False) |
| 43 | +``` |
| 44 | + |
| 45 | +With this, `ds.load()` loads data directly into GPU memory, and `ds` will then contain CuPy arrays. |
| 46 | + |
| 47 | +At present there are two limitations: |
| 48 | +1. stores cannot be read with consolidated metadata, and |
| 49 | +2. compression is unsupported by the backend. |
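The compression limitation can be checked before opening a store. The following is a minimal sketch, not part of the backend: it assumes a Zarr v2 directory store on local disk, where each array directory contains a `.zarray` JSON file with a `compressor` key. The helper name `find_compressed_arrays` is hypothetical, and filters are not inspected.

```python
import json
from pathlib import Path


def find_compressed_arrays(store):
    """Return names of arrays in a Zarr v2 directory store that use a compressor.

    Assumption: `store` is a local directory laid out per the Zarr v2 spec,
    so every array directory holds a `.zarray` JSON metadata file.
    """
    store = Path(store)
    compressed = []
    for meta_file in sorted(store.rglob(".zarray")):
        meta = json.loads(meta_file.read_text())
        # An uncompressed array stores `"compressor": null` in its metadata.
        if meta.get("compressor") is not None:
            # Store-relative array name, e.g. "air" or "group/air".
            compressed.append(meta_file.parent.relative_to(store).as_posix())
    return compressed
```

An empty list means every array in the store is uncompressed, so the second limitation does not apply to it.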
| 50 | + |
| 51 | +## Quick demo |
| 52 | + |
| 53 | +First, create an example uncompressed dataset to read from: |
| 54 | ``` python |
| 55 | +import xarray as xr |
| 56 | + |
| 57 | +store = "./air-temperature.zarr" |
| 58 | + |
| 59 | +airt = xr.tutorial.open_dataset("air_temperature", engine="netcdf4") |
| 60 | + |
| 61 | +for var in airt.variables: |
| 62 | +    airt[var].encoding["compressor"] = None |
| 63 | +airt.to_zarr(store, mode="w", consolidated=True) |
| 64 | ``` |
| 65 | + |
| 66 | +Now read it back: |
| 67 | ``` python |
| 68 | +# consolidated must be False |
| 69 | +ds = xr.open_dataset(store, engine="kvikio", consolidated=False) |
| 70 | +ds |
| 71 | +``` |
| 72 | + |
| 73 | +Now load a small subset: |
| 74 | +``` python |
| 75 | +type(ds["air"].isel(time=0, lat=10).load().data) |
| 76 | +``` |
| 77 | +``` |
| 78 | +cupy._core.core.ndarray |
| 79 | +``` |
| 80 | + |
| 81 | +Success! |
| 82 | + |
| 83 | +Xarray integrates [decently well](https://cupy-xarray.readthedocs.io/quickstart.html) with CuPy arrays, so you should be able to test out analysis pipelines pretty seamlessly. |
| 84 | + |
| 85 | +## Cool demo |
| 86 | + |
| 87 | +We don't have a cool demo yet but are looking to develop one very soon! |
| 88 | + |
| 89 | +Reach out if you have ideas. We would love to hear from you. |
| 90 | + |
| 91 | +## Summary |
| 92 | + |
| 93 | +We demonstrate integrating the Kvikio library using Xarray's new backend entrypoints. With everything set up, simply adding `engine="kvikio"` enables direct-to-GPU reads from disk or over the network. |
| 94 | + |
| 95 | +## Appendix I: Step-by-step install instructions |
| 96 | + |
| 97 | +Wei Ji Leong (@weiji14) helpfully [provided steps](https://discourse.pangeo.io/t/favorite-way-to-go-from-netcdf-xarray-to-torch-tf-jax-et-al/2663/2) to get started on your machine: |
| 98 | + |
| 99 | +``` |
| 100 | +# May need to install nvidia-gds first |
| 101 | +# https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#ubuntu-installation-common |
| 102 | +sudo apt install nvidia-gds |
| 103 | +
|
| 104 | +git clone https://github.com/dcherian/cupy-xarray.git |
| 105 | +cd cupy-xarray |
| 106 | +
|
| 107 | +mamba create --name cupy-xarray python=3.9 cupy=11.0 rapidsai-nightly::kvikio=22.10 jupyterlab=3.4.5 pooch=1.6.0 netcdf4=1.6.0 watermark=2.3.1 |
| 108 | +mamba activate cupy-xarray |
| 109 | +python -m ipykernel install --user --name cupy-xarray |
| 110 | +
|
| 111 | +# https://github.com/pydata/xarray/pull/6874 |
| 112 | +pip install git+https://github.com/dcherian/xarray.git@kvikio |
| 113 | +# https://github.com/zarr-developers/zarr-python/pull/934 |
| 114 | +pip install git+https://github.com/madsbk/zarr-python.git@cupy_support |
| 115 | +# https://github.com/xarray-contrib/cupy-xarray/pull/10 |
| 116 | +git switch kvikio-entrypoint |
| 117 | +pip install --editable=. |
| 118 | +
|
| 119 | +# Start jupyter lab |
| 120 | +jupyter lab --no-browser |
| 121 | +# Then open the docs/kvikio.ipynb notebook |
| 122 | +``` |
| 123 | + |
| 124 | +## Appendix II: Making sure GDS is working |
| 125 | + |
| 126 | +Scott Henderson (@scottyhq) pointed out that running `python kvikio/python/benchmarks/single-node-io.py` prints nice diagnostic information that lets you check whether GDS is set up. Note that our system has "compatibility mode" enabled, so we don't see the performance benefits of GDS yet, but this was enough to wire everything up: |
| 127 | +``` |
| 128 | +---------------------------------- |
| 129 | +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| 130 | + WARNING - KvikIO compat mode |
| 131 | + libcufile.so not used |
| 132 | +!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
| 133 | +GPU | Quadro GP100 (dev #0) |
| 134 | +GPU Memory Total | 16.00 GiB |
| 135 | +BAR1 Memory Total | 256.00 MiB |
| 136 | +GDS driver | N/A (Compatibility Mode) |
| 137 | +GDS config.json | /etc/cufile.json |
| 138 | +``` |