docs/usage.md
10 additions & 2 deletions
@@ -228,8 +228,16 @@ You can see that the dataset contains a mixture of virtual variables backed by `
 Loading variables can be useful in a few scenarios:
 1. You need to look at the actual values of a multi-dimensional variable in order to decide what to do next,
 2. You want in-memory indexes to use with ``xr.combine_by_coords``,
-3. Storing a variable on-disk as a set of references would be inefficient, e.g. because each chunk is very small (saving the values like this is similar to kerchunk's concept of "inlining" data),
-4. The variable has complicated encoding, and the simplest way to decode it correctly is to let xarray's standard decoding machinery load it into memory and apply the decoding.
+3. Storing a variable on-disk as a set of references would be inefficient, e.g. because it's a very small array (saving the values like this is similar to kerchunk's concept of "inlining" data),
+4. The variable has encoding, and the simplest way to decode it correctly is to let xarray's standard decoding machinery load it into memory and apply the decoding,
+5. Some of your variables have inconsistent-length chunks, and you want to be able to concatenate them together. For example, you might have multiple virtual datasets with coordinates of inconsistent length (e.g., leap years within multi-year daily data).
+
+### Loading low-dimensional coordinates
+
+In general, it is recommended to load all of your low-dimensional coordinates.
+This slows down the initial opening of the individual virtual datasets, but once the coordinates are loaded into memory they can be inlined in the reference file, which makes reads of the virtualized store fast.
+However, doing this for N-dimensional coordinates might use a lot of storage, because their values are duplicated.
+Also, anything duplicated could become out of sync with the referenced original files, especially if not using a transactional storage engine like `Icechunk`.
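
The trade-off described in the added section can be illustrated with a small sketch. It is not part of the PR: the coordinate names, shapes, the `float64` item size, and the helpers `pick_loadable_coords` and `inlined_bytes` are all hypothetical. It only shows why loading 1-D coordinates is cheap while an N-dimensional coordinate duplicates far more data.

```python
from math import prod

# Hypothetical coordinate shapes for a single file (illustrative values only).
COORD_SHAPES = {
    "time": (365,),
    "lat": (180,),
    "lon": (360,),
    "cell_area": (180, 360),  # a 2-D, i.e. N-dimensional, coordinate
}


def pick_loadable_coords(shapes, max_ndim=1):
    """Return the names of coordinates with at most `max_ndim` dimensions."""
    return [name for name, shape in shapes.items() if len(shape) <= max_ndim]


def inlined_bytes(shape, itemsize=8):
    """Bytes duplicated if values of this shape are stored inline.

    Assumes 8-byte (float64) elements and no compression.
    """
    return prod(shape) * itemsize


loadable = pick_loadable_coords(COORD_SHAPES)
sizes = {name: inlined_bytes(shape) for name, shape in COORD_SHAPES.items()}

print("load into memory:", loadable)
for name, nbytes in sizes.items():
    print(f"{name}: {nbytes} bytes if inlined")
```

A list like `loadable` is the kind of value one might pass to the option that selects loadable variables when opening a virtual dataset (assumed here to be the `loadable_variables` argument of `open_virtual_dataset`; check the library's documentation before relying on that).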