I don't really know anything about the format of MITgcm output files other than that they are some bespoke binary format, but I can't help wondering whether it would actually be easier to create a cloud-optimized version of MITgcm data by writing a reader for VirtualiZarr (i.e. a kerchunk-style reader) rather than actually converting the binary data to Zarr.
The advantages would be that
- if you want to make the data available to xarray users, even in the cloud, you don't have to alter or duplicate the original data (for cloud access you could just upload the original output files to a bucket unaltered),
- the reader would work for any MITgcm output (so effectively replacing most of xMITgcm),
- creating the overarching virtual Zarr store becomes the same problem that everyone else has (the one the rest of the VirtualiZarr package is meant to solve).
It would involve essentially rewriting this function
Line 87 in 63ba751
def read_mds(fname, iternum=None, use_mmap=None, endian='>', shape=None,
to look like either one of the kerchunk readers or ideally more like this
zarr-developers/VirtualiZarr#113
Because it seems MITgcm output already separates metadata from data to some degree, this could potentially work really nicely...
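To make the idea a bit more concrete, here is a rough sketch of the core thing such a reader might compute: a kerchunk-style mapping from chunk keys to `[path, offset, length]` byte ranges. It assumes (I have not verified this against the MITgcm docs) that a data file is a headerless, C-ordered flat binary array of fixed-size records, with the shape, precision and record count coming from the accompanying metadata. The function name `mds_chunk_refs`, its parameters and the example filename are all hypothetical, and this is not the VirtualiZarr reader API.

```python
from math import prod


def mds_chunk_refs(data_path, record_shape, nrecords, itemsize=4):
    """Map each record of a flat binary file to a [path, offset, length] reference.

    Assumes the file is nothing but `nrecords` back-to-back arrays of shape
    `record_shape`, with `itemsize` bytes per value and no header (an assumption).
    """
    record_nbytes = prod(record_shape) * itemsize
    # One chunk per record: the chunk index is (rec, 0, 0, ...).
    zeros = ".".join("0" for _ in record_shape)
    refs = {}
    for rec in range(nrecords):
        key = f"{rec}.{zeros}"
        refs[key] = [data_path, rec * record_nbytes, record_nbytes]
    return refs


# Hypothetical file holding two float32 records of shape (50, 90, 180).
refs = mds_chunk_refs("T.0000000100.data", record_shape=(50, 90, 180), nrecords=2)
```

Nothing here touches the data bytes themselves; only the metadata is needed to build the references, which is why the existing metadata/data split looks promising.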
See also zarr-developers/VirtualiZarr#218
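One downside of that approach, though, would be the inability to alter the chunking: the virtual references just point at byte ranges in the original files, so the chunks are fixed by how the data is laid out on disk.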
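As a small illustration of that constraint, the sketch below checks whether a proposed chunk shape would correspond to a single contiguous byte range in an uncompressed, C-ordered array, which is the only kind of chunk a byte-range reference can describe. The helper name `is_contiguous_chunking` is hypothetical, and it only checks contiguity (it does not check that the chunk sizes divide the array shape evenly).

```python
def is_contiguous_chunking(shape, chunks):
    """Return True if every chunk is one contiguous byte range in a C-ordered array."""
    # Walk backwards from the fastest-varying (last) axis, skipping axes
    # that the chunk spans completely.
    i = len(shape) - 1
    while i >= 0 and chunks[i] == shape[i]:
        i -= 1
    # Axis i (if any) may be split into runs; every axis before it must have
    # chunk size 1, otherwise the chunk would skip over bytes in the file.
    return all(c == 1 for c in chunks[:max(i, 0)])


shape = (50, 90, 180)  # hypothetical (level, y, x) array
print(is_contiguous_chunking(shape, (1, 90, 180)))   # one level per chunk
print(is_contiguous_chunking(shape, (50, 45, 180)))  # split in y across all levels
```

So splitting along the leading axes (records, levels) would be possible without touching the files, but horizontal tiling would not.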