diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc
index 8af7598..4b0d0fd 100644
--- a/standard/template/sections/clause_7_unified_data_model.adoc
+++ b/standard/template/sections/clause_7_unified_data_model.adoc
@@ -50,7 +50,7 @@ The unified data model supports CF-compliant metadata, including attributes such
 
 To support additional capabilities, the model defines optional extension points referencing external OGC and community standards:
 
-- **OGC Tile Matrix Set** – Facilitates the definition of multiscale grid hierarchies for raster overviews.
+- **OGC Tile Matrix Set** – Facilitates the definition of multiscale grid hierarchies for raster overviews using arbitrary coordinate reference systems, including custom tile matrix sets for scientific projections beyond web mapping schemes.
 - **GDAL Geotransform** – Enables geospatial referencing through affine transformations and optional interpolation specifications.
 - **STAC Metadata (Collection and Item)** – Provides linkage to SpatioTemporal Asset Catalogs for resource discovery and indexing.
 
@@ -351,4 +351,3 @@ The unified data model facilitates interoperability with tools and libraries acr
 - *Cloud-native infrastructure*: support for parallel access, chunked storage, and hierarchical grouping compatible with object storage.
 
 Tooling support is expected to grow via standard-conformant implementations, easing adoption across domains and infrastructures.
-
diff --git a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc
index b20092e..c6bfaea 100644
--- a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc
+++ b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc
@@ -7,7 +7,26 @@ Multiscale datasets are composed of a set of Zarr groups representing multiple z
 
 ==== Hierarchical Layout
 
-Each zoom level SHALL be represented as a Zarr group, identified by the Tile Matrix identifier (e.g., `"0"`, `"1"`, `"2"`). These groups SHALL be organised hierarchically under a common multiscale root group. Each zoom-level group SHALL contain the complete set of variables (Zarr arrays) corresponding to that resolution.
+Each zoom level SHALL be represented as a Zarr group, identified by the Tile Matrix identifier specified in the associated TileMatrixSet (e.g., `"0"`, `"1"`, `"2"`). These groups SHALL be organised hierarchically under a common multiscale root group containing the multiscales metadata attribute. Each zoom-level group SHALL be a Dataset (as defined in Section 7.4.1) and SHALL contain the complete set of variables (Zarr arrays) corresponding to that resolution. All zoom level groups SHALL have the same member keys to ensure structural consistency across resolutions.
+
+The multiscale root group SHALL contain a multiscales attribute that defines the TileMatrixSet reference. Child groups representing zoom levels SHALL use group names that exactly match the TileMatrix identifier values from the referenced TileMatrixSet. The presence and naming of zoom level groups are determined by the tileMatrices array in the TileMatrixSet definition.
+
+Example hierarchical structure:
+----
+/measurements/r10m/        # Multiscale root group with multiscales metadata
+├── 0/                     # Native resolution (zoom level 0)
+│   ├── band1              # Data variable at zoom level 0
+│   ├── band2              # Data variable at zoom level 0
+│   └── spatial_ref        # Coordinate reference variable
+├── 1/                     # First overview level
+│   ├── band1              # Data variable at zoom level 1
+│   ├── band2              # Data variable at zoom level 1
+│   └── spatial_ref        # Coordinate reference variable
+└── 2/                     # Second overview level
+    ├── band1              # Data variable at zoom level 2
+    ├── band2              # Data variable at zoom level 2
+    └── spatial_ref        # Coordinate reference variable
+----
 
 [cols="1,2,2"]
 |===
@@ -20,7 +39,7 @@ Each zoom level SHALL be represented as a Zarr group, identified by the Tile Mat
 |Global metadata | `multiscales` defined in parent `.zattrs` | `multiscales` defined in parent group `zarr.json` under `attributes`
 |===
 
-Each multiscale group MUST define chunking (tiling) along the spatial dimensions (`X`, `Y`, or `lon`, `lat`). Recommended chunk sizes are 256×256 or 512×512.
+Each multiscale group SHALL define chunking (tiling) along the spatial dimensions (`X`, `Y`, or `lon`, `lat`). Recommended chunk sizes are 256×256 or 512×512.
 
 ==== Metadata Encoding
 
@@ -30,18 +49,73 @@ Multiscale metadata SHALL be defined using a `multiscales` attribute located in
 - `resampling_method` – One of the standard string values (e.g., `"nearest"`, `"average"`)
 - `tile_matrix_set_limits` – (optional) Zoom-level limits following the STAC Tiled Asset style
 
-===== Zarr v2 Encoding Example (`.zattrs`)
+The multiscales metadata enables complete discovery of the multiscale collection structure:
+- The TileMatrixSet definition (whether referenced by identifier or included inline) specifies
+  the exact set of zoom levels through its tileMatrices array
+- Each TileMatrix.id value corresponds to a required child group in the multiscale hierarchy
+- Variable discovery within each zoom level group follows standard Zarr metadata conventions
+
+===== Examples of TileMatrixSet Reference Types
+
+The tile_matrix_set member can be specified in three ways:
+
 [source,json]
 ----
+# 1. String identifier reference
 {
   "multiscales": {
     "tile_matrix_set": "WebMercatorQuad",
     "resampling_method": "nearest"
   }
 }
+
+# 2. URI reference
+{
+  "multiscales": {
+    "tile_matrix_set": "https://maps.example.org/tileMatrixSets/WebMercatorQuad.json",
+    "resampling_method": "nearest"
+  }
+}
+
+# 3. Inline object
+{
+  "multiscales": {
+    "tile_matrix_set": {
+      "id": "Custom_Grid",
+      "title": "Custom Grid for Scientific Data",
+      "crs": "EPSG:4326",
+      "tileMatrices": [
+        {
+          "id": "0",
+          "scaleDenominator": 24848100.6235,
+          "cellSize": 0.0625,
+          "pointOfOrigin": [-180.0, 90.0],
+          "tileWidth": 256,
+          "tileHeight": 256,
+          "matrixWidth": 2,
+          "matrixHeight": 1
+        }
+      ]
+    },
+    "resampling_method": "nearest"
+  }
+}
 ----
 
-===== Zarr v3 Encoding Example (`zarr.json`)
+===== Zarr v2 and v3 Encoding
+
+For Zarr v2, the multiscales metadata is stored in `.zattrs`:
+[source,json]
+----
+{
+  "multiscales": {
+    "tile_matrix_set": "WebMercatorQuad",
+    "resampling_method": "nearest"
+  }
+}
+----
+
+For Zarr v3, it is stored in `zarr.json`:
 [source,json]
 ----
 {
@@ -56,6 +130,58 @@ Multiscale metadata SHALL be defined using a `multiscales` attribute located in
 }
 ----
 
+===== Group Contents Discovery
+
+For storage backends that do not support directory listing, the multiscale group structure can be discovered through the TileMatrixSet definition, regardless of how it is referenced:
+
+- When tile_matrix_set is a string identifier, the zoom level groups correspond to the TileMatrix identifiers defined in the referenced well-known TileMatrixSet
+- When tile_matrix_set is an inline object, the zoom level groups correspond to the id values in the tileMatrices array
+- When tile_matrix_set is a URI, implementations SHALL be able to retrieve and parse the referenced TileMatrixSet definition to determine the zoom level group structure. Storage backends that do not support directory listing SHALL still support URI references by implementing TileMatrixSet retrieval and caching as needed.
+- Variable names within each zoom level group are discoverable through standard Zarr group metadata mechanisms
+
+Example with WebMercatorQuad covering zoom levels 7 to 15:
+
+[source,json]
+----
+{
+  "multiscales": {
+    "tile_matrix_set": "WebMercatorQuad",
+    "resampling_method": "average"
+  }
+}
+----
+
+For a dataset that populates only zoom levels 7 to 15, this declaration corresponds to the following Zarr group structure:
+
+----
+/satellite_imagery/
+├── 7/                # Zoom level 7 (TileMatrix id "7")
+│   ├── red_band
+│   ├── green_band
+│   └── blue_band
+├── 8/                # Zoom level 8 (TileMatrix id "8")
+│   ├── red_band
+│   ├── green_band
+│   └── blue_band
+├── 9/                # Zoom level 9 (TileMatrix id "9")
+│   ├── red_band
+│   ├── green_band
+│   └── blue_band
+...
+├── 14/               # Zoom level 14 (TileMatrix id "14")
+│   ├── red_band
+│   ├── green_band
+│   └── blue_band
+└── 15/               # Zoom level 15 (TileMatrix id "15")
+    ├── red_band
+    ├── green_band
+    └── blue_band
+----
+
+The tile_matrix_set_limits attribute SHALL be used to explicitly declare which zoom levels contain data. For storage backends that do not support directory listing, this is the primary mechanism for discovering available zoom levels without attempting to access each possible group. If tile_matrix_set_limits is not provided, implementations SHALL assume that all zoom levels defined in the TileMatrixSet are potentially present, but individual zoom level groups MAY be absent if they contain no data.
+
+The multiscales metadata completely specifies the multiscale-relevant contents through its TileMatrixSet reference, eliminating ambiguity about which groups participate in the multiscale collection.
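
The discovery rules above can be sketched in Python. This is a minimal illustration, not part of the proposed specification text: the function name `expected_zoom_groups`, the `WELL_KNOWN` registry, and its assumption that WebMercatorQuad defines identifiers `"0"` to `"24"` are all hypothetical, and URI references are deliberately left unresolved because fetching them is implementation-specific.

```python
def expected_zoom_groups(multiscales, well_known=None):
    """Return the zoom-level group names implied by a `multiscales` attribute.

    `well_known` is a hypothetical registry mapping well-known TileMatrixSet
    identifiers to their ordered TileMatrix ids.
    """
    tms = multiscales["tile_matrix_set"]
    if isinstance(tms, dict):
        # Inline object: ids come straight from the tileMatrices array.
        ids = [str(matrix["id"]) for matrix in tms["tileMatrices"]]
    elif well_known is not None and tms in well_known:
        # String identifier of a well-known TileMatrixSet.
        ids = list(well_known[tms])
    else:
        # URI references (or unknown identifiers) need retrieval logic
        # that this sketch does not attempt.
        raise LookupError(f"cannot resolve tile_matrix_set {tms!r} in this sketch")

    # When limits are declared, only the listed TileMatrix ids hold data.
    limits = multiscales.get("tile_matrix_set_limits")
    if limits is not None:
        ids = [i for i in ids if i in limits]
    return ids


# Assumed registry for illustration only.
WELL_KNOWN = {"WebMercatorQuad": [str(z) for z in range(25)]}

multiscales = {
    "tile_matrix_set": "WebMercatorQuad",
    "resampling_method": "average",
    # Row/column limits are omitted here; only the keys matter for discovery.
    "tile_matrix_set_limits": {str(z): {} for z in range(7, 16)},
}
print(expected_zoom_groups(multiscales, WELL_KNOWN))
```

With the limits shown, the function returns the nine group names `"7"` through `"15"` without listing the store, which is the behaviour the text describes for backends that lack directory listing.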
+
 ==== Tile Matrix Set Representation
 
 The `tile_matrix_set` member MAY take one of the following forms:
@@ -64,15 +190,109 @@ The `tile_matrix_set` member MAY take one of the following forms:
 - A URI pointing to a JSON document describing the tile matrix set
 - An inline JSON object (CamelCase, OGC TMS 2.0 compatible)
 
-Zoom level identifiers in the tile matrix set MUST match the names of the child groups. The spatial reference system declared in `supportedCRS` MUST match the one declared in the corresponding `grid_mapping` of the data variables.
+Zoom level identifiers in the tile matrix set SHALL match the names of the child groups. The spatial reference system declared in `crs` SHALL match the one declared in the corresponding `grid_mapping` of the data variables.
+
+The group names in the multiscale hierarchy SHALL correspond exactly to the TileMatrix identifier values in the referenced TileMatrixSet. This provides a deterministic mapping between the TileMatrixSet definition and the Zarr group structure.
+
+For well-known TileMatrixSets referenced by string identifier, implementations SHALL create groups matching the TileMatrix identifiers defined in the standard TileMatrixSet definition.
+
+Additional groups or arrays MAY be present in the multiscale root group alongside the zoom level groups, but they SHALL NOT use names that conflict with TileMatrix identifiers from the referenced TileMatrixSet.
+
+===== Custom Tile Matrix Sets for Scientific Coordinate Systems
+
+The GeoZarr specification explicitly supports custom TileMatrixSet definitions for arbitrary coordinate reference systems, encouraging preservation of native CRS in Earth observation data. The tile_matrix_set member SHOULD accommodate scientific projections including UTM zones, polar stereographic, sinusoidal, and other non-web coordinate systems.
+
+For custom coordinate systems, the tile_matrix_set SHALL be defined as an inline JSON object following the OGC TileMatrixSet v2.0 specification:
+
+[source,json]
+----
+{
+  "multiscales": {
+    "tile_matrix_set": {
+      "id": "UTM_Zone_33N_Custom",
+      "title": "UTM Zone 33N for Sentinel-2 native resolution",
+      "crs": "EPSG:32633",
+      "orderedAxes": ["E", "N"],
+      "tileMatrices": [
+        {
+          "id": "0",
+          "scaleDenominator": 35714.285714285714,
+          "cellSize": 10.0,
+          "pointOfOrigin": [299960.0, 9000000.0],
+          "tileWidth": 1024,
+          "tileHeight": 1024,
+          "matrixWidth": 1094,
+          "matrixHeight": 1094
+        },
+        {
+          "id": "1",
+          "scaleDenominator": 71428.57142857143,
+          "cellSize": 20.0,
+          "pointOfOrigin": [299960.0, 9000000.0],
+          "tileWidth": 512,
+          "tileHeight": 512,
+          "matrixWidth": 547,
+          "matrixHeight": 547
+        }
+      ]
+    },
+    "resampling_method": "average"
+  }
+}
+----
+
+This approach enables accurate scale denominator calculations and chunking strategies optimized for native coordinate systems.
+
+===== Decimation Requirements and Custom Scaling
+
+While the OGC TileMatrixSet specification commonly assumes quadtree decimation (scaling by factor of 2 between zoom levels), custom TileMatrixSets MAY use alternative decimation factors for specialized applications. The GeoZarr specification supports arbitrary decimation schemes as defined in the TileMatrixSet.
+
+Custom decimation factors include:
+- Factor of 2 (quadtree): Standard web mapping approach where each zoom level has 4x more tiles
+- Factor of 3 (nonary tree): Each zoom level has 9x more tiles, useful for certain scientific gridding schemes
+- Other integer factors: Application-specific requirements may dictate alternative decimation
+
+When using non-standard decimation factors, the TileMatrixSet definition SHALL explicitly specify the matrixWidth and matrixHeight values for each TileMatrix to ensure correct spatial alignment and resolution relationships. Implementations SHALL NOT assume factor-of-2 scaling between zoom levels unless explicitly defined in the TileMatrixSet.
+
+Example with factor-of-3 decimation:
+[source,json]
+----
+{
+  "id": "Custom_Nonary_Grid",
+  "crs": "EPSG:4326",
+  "tileMatrices": [
+    {
+      "id": "0",
+      "matrixWidth": 1,
+      "matrixHeight": 1,
+      "tileWidth": 256,
+      "tileHeight": 256
+    },
+    {
+      "id": "1",
+      "matrixWidth": 3,
+      "matrixHeight": 3,
+      "tileWidth": 256,
+      "tileHeight": 256
+    },
+    {
+      "id": "2",
+      "matrixWidth": 9,
+      "matrixHeight": 9,
+      "tileWidth": 256,
+      "tileHeight": 256
+    }
+  ]
+}
+----
 
 ==== Chunk Layout Alignment
 
 At each zoom level, chunking SHALL match the tile layout defined by the TileMatrix:
 
-- Chunks MUST be aligned with the tile grid (1:1 mapping between chunks and tiles)
-- Chunk sizes MUST match the `tileWidth` and `tileHeight` declared in the TileMatrix
-- Spatial dimensions MUST be clearly identified using `dimension_names` (v3) or `_ARRAY_DIMENSIONS` (v2)
+- Chunks SHALL be aligned with the tile grid (1:1 mapping between chunks and tiles)
+- Chunk sizes SHALL match the `tileWidth` and `tileHeight` declared in the TileMatrix
+- Spatial dimensions SHALL be clearly identified using `dimension_names` (v3) or `_ARRAY_DIMENSIONS` (v2)
 
 ==== Tile Matrix Set Limits
 
@@ -91,11 +311,12 @@ Example:
 }
 ----
 
+When tile_matrix_set_limits are specified, the TileMatrix identifier keys SHALL match exactly the zoom level group names in the Zarr hierarchy. This ensures consistent referencing between the TileMatrixSet definition, tile limits, and the physical group structure.
+
 ==== Resampling Method
 
-The `resampling_method` MUST indicate the method used for downsampling across zoom levels. The value MUST be one of:
+The `resampling_method` SHALL indicate the method used for downsampling across zoom levels. The value SHALL be one of:
 
 `nearest`, `average`, `bilinear`, `cubic`, `cubic_spline`, `lanczos`, `mode`, `max`, `min`, `med`, `sum`, `q1`, `q3`, `rms`, `gauss`
 
-The same method MUST apply across all levels.
-
+The same method SHALL apply across all levels.
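
As an illustration of the resampling requirement, a reader or writer could check the attribute against the permitted values before using it. The sketch below is an assumption about how such a check might look; the names `RESAMPLING_METHODS` and `validate_resampling_method` are hypothetical and do not come from the specification or any library.

```python
# Permitted values, copied from the list in the Resampling Method section.
RESAMPLING_METHODS = frozenset({
    "nearest", "average", "bilinear", "cubic", "cubic_spline", "lanczos",
    "mode", "max", "min", "med", "sum", "q1", "q3", "rms", "gauss",
})


def validate_resampling_method(multiscales):
    """Return the declared resampling method, or raise if it is not permitted."""
    method = multiscales.get("resampling_method")
    if method not in RESAMPLING_METHODS:
        raise ValueError(f"unsupported resampling_method: {method!r}")
    return method


print(validate_resampling_method({
    "tile_matrix_set": "WebMercatorQuad",
    "resampling_method": "average",
}))
```

Because `resampling_method` is a single member of the `multiscales` attribute, one successful check covers every zoom level in the collection.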