Add example of virtualizing GOES using caching and request splitting #855
+389
−4
This PR shows how to cache entire files before virtualizing.
Full explanation: for data formats that use B-link trees (like non-cloud-optimized HDF5), you'll probably want to cache the entire file up front. If you also want to load many variables, you'll want to cache the file's contents at the store level rather than at the parser/file-reader level, so that the cache is accessible to ManifestStore as well as to the parser. This PR adds an example of how to do that using new features in obspec-utils. It relies on the sequence of PRs discussed in #844, which make VirtualiZarr work with any ReadableStore via duck typing rather than only with the stores implemented in obstore.
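To illustrate the store-level idea, here is a rough, self-contained sketch. It does not use the obspec-utils or obstore APIs: `ReadableStore`, `InMemoryStore`, `WholeFileCachingStore`, and the `get`/`get_range` signatures are all hypothetical names made up for this illustration. The point is only that a wrapper which fetches the whole object once can serve every later range read, from any consumer, without touching the underlying store again.

```python
from typing import Protocol


class ReadableStore(Protocol):
    """Hypothetical minimal read interface (NOT the real obspec protocol)."""

    def get(self, path: str) -> bytes: ...

    def get_range(self, path: str, start: int, end: int) -> bytes: ...


class InMemoryStore:
    """Stand-in for a remote store; counts how many requests it serves."""

    def __init__(self, objects: dict[str, bytes]) -> None:
        self._objects = dict(objects)
        self.requests = 0

    def get(self, path: str) -> bytes:
        self.requests += 1
        return self._objects[path]

    def get_range(self, path: str, start: int, end: int) -> bytes:
        self.requests += 1
        return self._objects[path][start:end]


class WholeFileCachingStore:
    """Wraps any duck-typed store and caches each file in full on first access."""

    def __init__(self, inner: ReadableStore) -> None:
        self._inner = inner
        self._cache: dict[str, bytes] = {}

    def _load(self, path: str) -> bytes:
        # Fetch the entire object once; later reads are served from memory.
        if path not in self._cache:
            self._cache[path] = self._inner.get(path)
        return self._cache[path]

    def get(self, path: str) -> bytes:
        return self._load(path)

    def get_range(self, path: str, start: int, end: int) -> bytes:
        return self._load(path)[start:end]


# Usage: many small scattered reads (as when walking HDF5 metadata) cost
# a single request against the underlying store.
remote = InMemoryStore({"goes.nc": bytes(range(256))})
cached = WholeFileCachingStore(remote)
header = cached.get_range("goes.nc", 0, 8)
chunk = cached.get_range("goes.nc", 100, 110)
```

Because the wrapper exposes the same read methods as the store it wraps, anything that accepts the duck-typed interface (a parser, or a component playing the role of ManifestStore) can share the one cached copy.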
Caching this way reduces the time to virtualize a single GOES-16 file on my laptop from ~47s to ~8s ⚡
docs/releases.rstapi.rst