- Title: Zarr Extension Specification
- Identifier: https://stac-extensions.github.io/zarr/v1.1.0/schema.json
- Field Name Prefix: zarr
- Scope: Asset
- Extension Maturity Classification: Proposal
- Owner: @jsignell
This document explains the Zarr Extension to the SpatioTemporal Asset Catalog (STAC) specification.
This extension helps users open STAC Assets pointing to Zarr stores. It includes fields from Zarr metadata that are relevant when opening a Zarr store. The goal of this extension is not to reproduce all Zarr metadata within STAC.
This extension takes inspiration from the deprecated Xarray Assets Extension by @TomAugspurger and can be used as a replacement when paired with the Storage Extension and xpystac (see Python Example below).
- Examples:
  - Item example: Shows the basic usage of the extension in a STAC Item
  - Collection example: Shows the basic usage of the extension in a STAC Collection
- JSON Schema
- Changelog
The fields in the table below can be used in these parts of STAC documents:
- Catalogs
- Collections
- Item Properties (incl. Summaries in Collections)
- Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)
- Links
Field Name | Type | Description |
---|---|---|
zarr:consolidated | boolean | Whether the Zarr store includes consolidated metadata |
zarr:node_type | string | Type of Zarr hierarchy node element. Must be group or array |
zarr:zarr_format | integer | Zarr format of the store (currently 2 or 3) |
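As a rough illustration of the table above, the following Python sketch checks the three fields on an asset represented as a plain dictionary. The `validate_zarr_fields` helper and the example `href` are hypothetical and not part of the extension. The sketch also assumes that every field is optional; the JSON Schema published with the extension remains the authoritative check.

```python
# Hypothetical helper: checks the zarr:* fields described in the table above.
# Assumption: every field is optional; only present values are checked.

ALLOWED_NODE_TYPES = {"group", "array"}
ALLOWED_ZARR_FORMATS = {2, 3}


def validate_zarr_fields(asset: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []

    consolidated = asset.get("zarr:consolidated")
    if consolidated is not None and not isinstance(consolidated, bool):
        problems.append("zarr:consolidated must be a boolean")

    node_type = asset.get("zarr:node_type")
    if node_type is not None and (
        not isinstance(node_type, str) or node_type not in ALLOWED_NODE_TYPES
    ):
        problems.append("zarr:node_type must be 'group' or 'array'")

    zarr_format = asset.get("zarr:zarr_format")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if zarr_format is not None and (
        isinstance(zarr_format, bool)
        or not isinstance(zarr_format, int)
        or zarr_format not in ALLOWED_ZARR_FORMATS
    ):
        problems.append("zarr:zarr_format must be 2 or 3")

    return problems


example_asset = {
    "href": "https://example.com/data.zarr",  # placeholder URL, not a real store
    "zarr:consolidated": True,
    "zarr:node_type": "group",
    "zarr:zarr_format": 2,
}
print(validate_zarr_fields(example_asset))
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in an asset at once.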
Consolidated metadata stores all the metadata for a Zarr hierarchy in the metadata of the root Group. The boolean consolidated field is useful when opening a Zarr store in xarray: xarray.open_dataset(... consolidated=True).
A value of true indicates that:
- For Zarr 2: there is a top-level .zmetadata document.
- For Zarr 3: within the top-level Zarr metadata there is a consolidated_metadata field.
Note: Consolidated metadata is not officially part of the Zarr specification (PR to add it), but it is useful to know whether or not it is present when opening a Zarr store.
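The two conditions above can be sketched for a store that lives in a local directory. This is a minimal sketch, not how any particular library detects consolidation: the `has_consolidated_metadata` helper is hypothetical, it assumes the Zarr 3 root metadata document is named `zarr.json`, and it treats a `null` value for `consolidated_metadata` as absent.

```python
import json
from pathlib import Path


def has_consolidated_metadata(store_root: Path, zarr_format: int) -> bool:
    """Guess the value of zarr:consolidated for a local directory store."""
    if zarr_format == 2:
        # Zarr 2: a top-level .zmetadata document signals consolidation.
        return (store_root / ".zmetadata").is_file()
    if zarr_format == 3:
        # Zarr 3: look for a consolidated_metadata field in the root metadata.
        root_document = store_root / "zarr.json"
        if not root_document.is_file():
            return False
        metadata = json.loads(root_document.read_text())
        # Assumption: an explicit null is treated the same as a missing field.
        return metadata.get("consolidated_metadata") is not None
    raise ValueError(f"zarr_format must be 2 or 3, got {zarr_format!r}")
```

A catalog builder could call this once per store and write the result into the asset, so that readers do not need to probe the store themselves.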
As defined in the Zarr v3 Specification:
A string defining the type of hierarchy node element
node_type indicates which data model to use when opening the Zarr store. For group, a tree structure is recommended (such as xarray.DataTree). For array, an array structure is recommended (such as xarray.DataArray). Note that xarray does not currently support reading Zarr Arrays directly into xarray.DataArray objects, but other libraries such as GDAL, zarrs, and zarr-python do.
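One way a client might act on node_type is sketched below. The `recommended_opener` and `open_node` helpers are hypothetical. The sketch assumes a recent xarray release that provides `xarray.open_datatree`, and it falls back to zarr-python's `zarr.open_array` for arrays because xarray cannot read them directly. Remote stores may need extra storage options that are not shown here.

```python
def recommended_opener(node_type: str) -> str:
    """Map a zarr:node_type value to the name of a suitable opener function."""
    openers = {
        "group": "xarray.open_datatree",  # tree structure (xarray.DataTree)
        "array": "zarr.open_array",       # single array, read with zarr-python
    }
    try:
        return openers[node_type]
    except KeyError:
        raise ValueError(
            f"zarr:node_type must be 'group' or 'array', got {node_type!r}"
        ) from None


def open_node(href: str, node_type: str):
    """Open a Zarr store with a data model that matches its node type."""
    opener = recommended_opener(node_type)
    if opener == "xarray.open_datatree":
        import xarray as xr  # imported lazily so the mapping works without it

        return xr.open_datatree(href, engine="zarr")
    import zarr  # imported lazily for the same reason

    return zarr.open_array(href, mode="r")
```

Keeping the mapping separate from the actual opening step lets the decision be checked without xarray or zarr-python installed.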
As defined in the Zarr v3 Specification:
An integer defining the version of the storage specification to which the array store adheres.
zarr_format has implications with respect to the versions of libraries required to open the Zarr store.
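As an illustration of that implication, the sketch below checks a zarr-python version string against a zarr_format value. The `zarr_python_supports` helper is hypothetical, and the version rule is an assumption stated here rather than something this extension defines: zarr-python 2.x releases are taken to read format 2 only, and 3.x releases are taken to read both formats. Other libraries have their own requirements.

```python
def zarr_python_supports(zarr_python_version: str, zarr_format: int) -> bool:
    """Return True if the given zarr-python version is assumed to read the format."""
    if zarr_format not in (2, 3):
        raise ValueError(f"zarr_format must be 2 or 3, got {zarr_format!r}")
    major = int(zarr_python_version.split(".")[0])
    if zarr_format == 3:
        # Assumption: Zarr format 3 needs zarr-python 3 or later.
        return major >= 3
    # Assumption: Zarr format 2 is readable by zarr-python 2 and later.
    return major >= 2


# Example with explicit version strings; the installed version could be read
# with importlib.metadata.version("zarr") instead.
print(zarr_python_supports("2.18.3", 3))
print(zarr_python_supports("3.0.8", 3))
```

A client could run a check like this before opening a store and report a clear error instead of failing deep inside the reader.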
This extension will be used by xpystac to enable the following:
```python
# Requires xpystac to be installed so that xarray can open a STAC asset directly.
import planetary_computer
import pystac_client
import xarray as xr

catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

collection = catalog.get_collection("daymet-daily-hi")
asset = collection.assets["zarr-abfs"]

xr.open_dataset(asset, patch_url=planetary_computer.sign)
```
All contributions are subject to the STAC Specification Code of Conduct. For contributions, please follow the STAC specification contributing guide. Instructions for running tests are copied here for convenience.
The same checks that run on PRs are part of the repository and can be run locally to verify that changes are valid.
To run tests locally, you'll need npm, which is a standard part of any Node.js installation.
First you'll need to install everything with npm once. Just navigate to the root of this repository and on your command line run:
npm install
Then to check markdown formatting and test the examples against the JSON schema, you can run:
npm test
This will print the same messages that you see online, and you can then go and fix your markdown or examples.
If the tests reveal formatting problems with the examples, you can fix them with:
npm run format-examples