Open
Changes from 3 commits
39 commits
60fddbc
add: create specifications for Projection Attribute Extension and its…
emmanuelmathot Sep 8, 2025
f09f67a
fix: update attribute naming from `proj` to `geo` in Projection Attri…
emmanuelmathot Sep 9, 2025
83a5f93
fix: remove `shape` field from Projection Attribute Extension schema …
emmanuelmathot Sep 9, 2025
77bae32
fix: rename `spatial_dims` to `spatial_dimensions` in README and sche…
emmanuelmathot Sep 9, 2025
caee06b
fix: enhance spatial dimension identification section in README with …
emmanuelmathot Sep 9, 2025
9daf0d3
fix: clarify inheritance model for `geo:proj` attribute in README
emmanuelmathot Sep 9, 2025
7e77fa1
fix: clarify spatial dimension identification section in README with …
emmanuelmathot Sep 9, 2025
01537ec
fix: add version field to geo:proj schema and update README for compa…
emmanuelmathot Sep 11, 2025
17a0700
fix: update geo:proj schema to include version field and restructure …
emmanuelmathot Sep 14, 2025
289f774
Update attributes/geo:proj/README.md
emmanuelmathot Sep 14, 2025
cd14f53
Update attributes/README.md
emmanuelmathot Sep 14, 2025
0932243
Update attributes/README.md
emmanuelmathot Sep 14, 2025
cd58d28
Update attributes/geo:proj/README.md
emmanuelmathot Sep 14, 2025
c317a4a
Update attributes/geo:proj/README.md
emmanuelmathot Sep 14, 2025
8a60696
Remove redundant examples for irregular grid and update WKT2 represen…
emmanuelmathot Sep 14, 2025
96b5db1
Refine spatial dimension identification section to focus on regular g…
emmanuelmathot Sep 14, 2025
e9e754e
Refactor inheritance model section for clarity and remove redundant r…
emmanuelmathot Sep 14, 2025
1893b7b
Update examples in geo:proj README.md to enhance clarity and provide …
emmanuelmathot Sep 14, 2025
a244d90
Clarify spatial dimension interpretation in geo:proj README.md to emp…
emmanuelmathot Sep 14, 2025
1cffe00
Clarify terminology and improve consistency in geo:proj README.md reg…
emmanuelmathot Sep 14, 2025
d903aa7
Remove the Registered Extensions section from attributes README.md to…
emmanuelmathot Sep 14, 2025
f4364e4
Update attributes/geo:proj/README.md
emmanuelmathot Sep 14, 2025
cc9f913
Enhance README.md and schema.json for geo:proj extension by adding de…
emmanuelmathot Sep 14, 2025
21b0d1c
Refine validation rules in geo:proj README.md to clarify shape infere…
emmanuelmathot Sep 14, 2025
1bde1bc
Update attributes/README.md
emmanuelmathot Sep 17, 2025
190ad60
Update versioning in geo:proj README.md and schema.json to 0.1.0
emmanuelmathot Sep 17, 2025
a4b2bef
Update attributes/geo:proj/README.md
emmanuelmathot Sep 17, 2025
544687d
Update attributes/geo:proj/README.md
emmanuelmathot Sep 17, 2025
4582d2a
Enhance README.md with projection authority examples and clarify CRS …
emmanuelmathot Sep 17, 2025
1c0134f
Remove redundant emphasis on spatial dimensions interpretation in geo…
emmanuelmathot Sep 17, 2025
ac841d8
Add algorithm for resolving spatial dimensions in group-level geo:proj
emmanuelmathot Sep 17, 2025
803f269
Clarify flexibility in spatial dimension ordering in geo:proj README.md
emmanuelmathot Sep 17, 2025
27a1123
Refactor Geo Projection Attribute Extension
emmanuelmathot Sep 21, 2025
eeb64ae
Update attributes/geo/proj/README.md
emmanuelmathot Sep 22, 2025
22b277d
Merge branch 'main' into proj-crs
emmanuelmathot Sep 28, 2025
c84a05e
Update README.md for Geo Projection Attribute Extension and remove sc…
emmanuelmathot Sep 28, 2025
5132326
Remove link to Attributes section in README.md
emmanuelmathot Sep 28, 2025
b13da42
Update attributes/geo/proj/README.md
emmanuelmathot Sep 29, 2025
f9ffc2f
Update attributes/geo/proj/README.md
emmanuelmathot Sep 29, 2025
1 change: 1 addition & 0 deletions README.md
@@ -8,6 +8,7 @@ It is the normative source for registering names of Zarr v3 extensions.

To register an extension, open a new PR with a new extension directory under the relevant extension point:

* [Attributes](./attributes/README.md)
* [Codecs](./codecs/README.md)
* [Data Types](./data-types/README.md)
* [Chunk Key Encoding](./chunk-key-encodings/README.md)
30 changes: 30 additions & 0 deletions attributes/README.md
@@ -0,0 +1,30 @@
# Attributes Extensions

This directory contains specifications for Zarr v3 attribute extensions.

## What are Attribute Extensions?

Attribute extensions define standardized schemas and semantics for metadata stored in the `attributes` field of Zarr arrays and groups. These extensions enable interoperability by establishing common conventions for domain-specific metadata.

## Registered Extensions

| Extension | Version | Description |
|-----------|---------|-------------|
| [geo:proj](./geo:proj/) | 1.0.0 | Coordinate reference system metadata for geospatial data |

## Creating an Attribute Extension

When creating an attribute extension, consider:

1. **Namespace**: Use a unique prefix to avoid conflicts (e.g., `geo:` for geospatial attributes such as `geo:proj`)
2. **Schema**: Provide a JSON schema for validation
3. **Inheritance**: Define behavior when attributes are set at group vs array level
4. **Compatibility**: Consider interoperability with existing tools and standards

## Extension Requirements

Each attribute extension MUST:
- Define the attribute key(s) and structure
- Provide a JSON schema for validation
- Include examples of usage
- Document any inheritance or precedence rules
149 changes: 149 additions & 0 deletions attributes/geo:proj/README.md


If more than 1 of the 3 is provided, they have to be semantically identical (i.e. describe the same CRS).

@@ -0,0 +1,149 @@
# Projection Attribute Extension for Zarr

- **Extension Name**: Projection Attribute Extension
- **Version**: 1.0.0
- **Extension Type**: Attribute
- **Status**: Proposed
- **Owners**: @emmanuelmathot

## Description

This extension defines a standardized way to encode coordinate reference system (CRS) information for geospatial Zarr arrays and groups using the `geo:proj` attribute.

## Motivation

- Provides simple, standardized CRS encoding without complex nested structures
- Addresses issues identified in GeoZarr discussions regarding CF convention complexity
- Compatible with existing geospatial tools (GDAL, rasterio, pyproj)
Member

Suggested change
-- Compatible with existing geospatial tools (GDAL, rasterio, pyproj)
+- Future cross-compatibility with existing geospatial tools (GDAL, rasterio, pyproj)

I think this better reflects the current status and goals

Contributor Author

We know the data model is compatible. We can add a section about the tooling implementation status, but I'd avoid putting assumptions in the motivation section.

- Based on the proven STAC Projection Extension model

## Specification

The `geo:proj` attribute can be added to Zarr arrays or groups to define projection information.

### Required Fields

At least one of the following MUST be provided:

- `code`: Authority and code identifier (e.g., "EPSG:4326")
- `wkt2`: WKT2 string representation of the CRS
- `projjson`: PROJJSON object representation of the CRS
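
The "at least one" rule can be sketched as a small Python check. This is only an illustration: the helper name `has_crs_definition` is hypothetical, and treating a `null` value as absent is an assumption that the text above does not state.

```python
# Field names taken from the list above.
CRS_FIELDS = ("code", "wkt2", "projjson")


def has_crs_definition(geo_proj: dict) -> bool:
    """Return True if at least one CRS field is present and not null.

    Assumption: a field explicitly set to null (None) counts as missing.
    """
    return any(geo_proj.get(field) is not None for field in CRS_FIELDS)
```

For instance, `has_crs_definition({"code": "EPSG:4326"})` is true, while a `geo:proj` object that only carries `bbox` fails the check.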

### Optional Fields

- `bbox`: Bounding box in the CRS coordinates
- `transform`: Affine transformation coefficients (6 or 9 elements)
- `spatial_dims`: Names of spatial dimensions in the array

Note: The shape of spatial dimensions is obtained directly from the Zarr array metadata once the spatial dimensions are identified.

### Spatial Dimension Identification

The extension identifies spatial dimensions through:

1. **Explicit Declaration** (recommended): Use `spatial_dims` to specify dimension names
2. **Convention-Based** (fallback): Automatically detect standard spatial dimension names

#### Explicit Declaration

```json
{
  "geo:proj": {
    "spatial_dims": ["latitude", "longitude"]
  }
}
```

#### Convention-Based Detection

If `spatial_dims` is not provided, implementations should scan `dimension_names` for these patterns (in order):

- ["y", "x"] or ["Y", "X"]
- ["lat", "lon"] or ["latitude", "longitude"]
- ["northing", "easting"]
- ["row", "col"] or ["line", "sample"]


I think this could be broadened a bit to allow any dimension_names that includes these patterns. That would catch cases where there is also time or something.

Contributor Author

The geo:proj extension specifically scopes to only the spatial dimensions that the CRS applies to (typically 2D: y/x, lat/lon, etc.). Non-spatial dimensions like time, band, or depth are outside the scope of this extension. The pattern matching is designed to identify exactly the spatial dimension pair that corresponds to the CRS, not to handle additional dimensions in the array.

Contributor Author

Sorry, I only saw your comments afterwards. This is not a static pattern that dimension_names must match exactly. Each pattern is a set of names that must be found together in dimension_names to count as a match.


The first matching pair determines the spatial dimensions.
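
A minimal Python sketch of the two identification paths, assuming that a pattern matches when both of its names occur anywhere in `dimension_names` (so extra dimensions such as `time` are ignored). The function name `detect_spatial_dims` and the error messages are hypothetical.

```python
# Candidate (y_name, x_name) pairs, in the priority order listed above.
SPATIAL_DIM_PATTERNS = [
    ("y", "x"), ("Y", "X"),
    ("lat", "lon"), ("latitude", "longitude"),
    ("northing", "easting"),
    ("row", "col"), ("line", "sample"),
]


def detect_spatial_dims(dimension_names, spatial_dims=None):
    """Return the (y_name, x_name) pair for an array.

    An explicit `spatial_dims` wins; otherwise the first pattern whose two
    names are both present in `dimension_names` is used.
    """
    if spatial_dims is not None:
        missing = [name for name in spatial_dims if name not in dimension_names]
        if missing:
            raise ValueError(f"spatial_dims not found in dimension_names: {missing}")
        return tuple(spatial_dims)

    for y_name, x_name in SPATIAL_DIM_PATTERNS:
        if y_name in dimension_names and x_name in dimension_names:
            return (y_name, x_name)

    raise ValueError("could not identify spatial dimensions")
```

With `dimension_names = ["time", "y", "x"]` and no `spatial_dims`, the sketch returns `("y", "x")`.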

### Validation Rules

- Once spatial dimensions are identified (either explicitly through `spatial_dims` or through convention-based detection), their sizes are obtained from the Zarr array's shape metadata
- The spatial dimension order is always [y/lat/northing, x/lon/easting]
- If spatial dimensions cannot be identified through either method, implementations MUST raise an error
- When multiple CRS representations are provided, precedence is: `projjson` > `wkt2` > `code`
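
The precedence rule can be illustrated with a short Python sketch. The function name `select_crs` is hypothetical, and skipping `null` values is an assumption.

```python
# Highest-precedence representation first, per the rule above.
CRS_PRECEDENCE = ("projjson", "wkt2", "code")


def select_crs(geo_proj: dict):
    """Return (field_name, value) for the CRS representation to use."""
    for field in CRS_PRECEDENCE:
        value = geo_proj.get(field)
        if value is not None:  # assumption: null means "not provided"
            return field, value
    raise ValueError("geo:proj defines none of: projjson, wkt2, code")
```

If both `code` and `wkt2` are present, the sketch picks `wkt2`; `code` is used only when it is the sole representation.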

### Shape Reconciliation

The shape of spatial dimensions is determined by:
1. Identifying the spatial dimensions using either `spatial_dims` or convention-based detection
2. Looking up these dimension names in the Zarr array's `dimension_names`
3. Using the corresponding sizes from the `shape` field of the array metadata


When talking about "the array" I think it's necessary to separate out the "dimension array" from the "data variable array". My understanding is here the proposal is to compare the shape of all data variable arrays with the shape of any listed dimension arrays

Contributor Author

Yes indeed, but the "non-matching" dimension arrays would be skipped if they do not match the pattern of the spatial_dimensions definition.


This approach avoids redundancy and ensures consistency by using the array's own metadata rather than duplicating shape information.
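
The lookup steps can be sketched in Python as follows, assuming the array metadata has been loaded into a dict with `shape` and `dimension_names` keys. The helper name `spatial_shape` and the sample metadata are hypothetical.

```python
def spatial_shape(array_metadata: dict, spatial_dims):
    """Return the sizes of the spatial dimensions, in the order given."""
    names = array_metadata["dimension_names"]
    shape = array_metadata["shape"]
    # list.index raises ValueError if a spatial dimension name is missing.
    return tuple(shape[names.index(name)] for name in spatial_dims)


# Hypothetical metadata for illustration only.
metadata = {
    "shape": [365, 1800, 3600],
    "dimension_names": ["time", "latitude", "longitude"],
}
height, width = spatial_shape(metadata, ("latitude", "longitude"))
```

Here `height` and `width` come straight from the array's own `shape`, so nothing is duplicated inside `geo:proj`.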

## Examples

### Example 1: Simple EPSG Code

```json
{
  "zarr_format": 3,
  "shape": [2048, 2048],
  "dimension_names": ["y", "x"],
  "attributes": {
    "geo:proj": {
      "code": "EPSG:3857"
    }
  }
}
```

### Example 2: With Multiple Dimensions and Transform

```json
{
  "zarr_format": 3,
  "shape": [365, 100, 2048, 2048, 4],
  "dimension_names": ["time", "height", "latitude", "longitude", "band"],
  "attributes": {
    "geo:proj": {
      "code": "EPSG:4326",
      "spatial_dims": ["latitude", "longitude"],
      "transform": [0.1, 0.0, -180.0, 0.0, -0.1, 90.0],
      "bbox": [-180.0, -90.0, 180.0, 90.0]
    }
  }
}
```
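
To show how a six-element `transform` such as the one in this example might be applied, here is a hedged Python sketch. It assumes the coefficients are in the row-major affine order `[a, b, c, d, e, f]` used by STAC's `proj:transform`, and that a nine-element transform is the same matrix followed by a `[0, 0, 1]` row. The function name `pixel_to_crs` is hypothetical.

```python
def pixel_to_crs(transform, col, row):
    """Map a (col, row) pixel position to (x, y) CRS coordinates.

    Assumed layout: x = a*col + b*row + c, y = d*col + e*row + f.
    """
    a, b, c, d, e, f = transform[:6]  # ignore the optional [0, 0, 1] row
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


transform = [0.1, 0.0, -180.0, 0.0, -0.1, 90.0]
origin = pixel_to_crs(transform, 0, 0)  # (-180.0, 90.0)
```

Under these assumptions the position (0, 0) maps to the upper-left corner at longitude -180 and latitude 90, and each step of one column moves 0.1 degrees east.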

### Example 3: WKT2 Representation

```json
{
  "zarr_format": 3,
  "shape": [1000, 1000],
  "dimension_names": ["northing", "easting"],
  "attributes": {
    "geo:proj": {
      "wkt2": "PROJCRS[\"WGS 84 / UTM zone 33N\",BASEGEOGCRS[\"WGS 84\",DATUM[\"World Geodetic System 1984\",ELLIPSOID[\"WGS 84\",6378137,298.257223563,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]]],CONVERSION[\"UTM zone 33N\",METHOD[\"Transverse Mercator\",ID[\"EPSG\",9807]],PARAMETER[\"Latitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],PARAMETER[\"Longitude of natural origin\",15,ANGLEUNIT[\"degree\",0.0174532925199433]],PARAMETER[\"Scale factor at natural origin\",0.9996,SCALEUNIT[\"unity\",1]],PARAMETER[\"False easting\",500000,LENGTHUNIT[\"metre\",1]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1]]],CS[Cartesian,2],AXIS[\"easting\",east,ORDER[1],LENGTHUNIT[\"metre\",1]],AXIS[\"northing\",north,ORDER[2],LENGTHUNIT[\"metre\",1]]]",
      "transform": [30.0, 0.0, 500000.0, 0.0, -30.0, 5000000.0]
    }
  }
}
```


Most of these examples are for arrays even though it is recommended that this be defined at the group level. Are we sure we want to allow inheritance from the group level?

I very much like @benbovy's suggestion (#21 (comment)) that the "geo:proj" blobs be defined at that group level with an id and then the arrays reference a specific id

Contributor Author

I am still a bit confused with the "dimension array" term you often use. Are they coordinates? In that case, this is out of scope. We want to keep the spec on top of the base Zarr concepts (arrays and shapes).

Contributor Author

@jsignell I think I finally understood your point here and I updated the readme to better describe how the spec should interpret at array-level.


## Inheritance

When `geo:proj` is defined at the group level, it applies to all arrays within that group unless overridden at the array level.
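
One possible reading of this rule, sketched in Python: an array-level `geo:proj` replaces the group-level object as a whole. Whole-object replacement (rather than a field-by-field merge) is an assumption, since the text does not say which applies, and the function name `effective_geo_proj` is hypothetical.

```python
def effective_geo_proj(group_attributes: dict, array_attributes: dict):
    """Return the geo:proj object that applies to an array, or None."""
    if "geo:proj" in array_attributes:
        # Assumption: the array-level object overrides the group's entirely.
        return array_attributes["geo:proj"]
    return group_attributes.get("geo:proj")


group_attrs = {"geo:proj": {"code": "EPSG:4326"}}
inherited = effective_geo_proj(group_attrs, {})
overridden = effective_geo_proj(group_attrs, {"geo:proj": {"code": "EPSG:3857"}})
```

In this sketch `inherited` carries the group's `EPSG:4326`, while `overridden` carries the array's own `EPSG:3857`.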

## Compatibility Notes

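- The `code` field follows the "authority:code" format used by the PROJ library
- The `wkt2` field should conform to the OGC WKT2 (ISO 19162) standard
- The `transform` field follows the same ordering as `proj:transform` in STAC's projection extension, i.e. the row-major affine ordering also used by rasterio. GDAL's GeoTransform holds the same six coefficients in a different order (origin first in each row), so GDAL values must be reordered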

## References

- [STAC Projection Extension v2.0.0](https://github.com/stac-extensions/projection)
- [PROJJSON Specification](https://proj.org/specifications/projjson.html)
- [OGC WKT2 Standard](https://www.ogc.org/standards/wkt-crs)
98 changes: 98 additions & 0 deletions attributes/geo:proj/schema.json
@@ -0,0 +1,98 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://zarr-specs.readthedocs.io/en/latest/extensions/attributes/projection/v1.0.0/schema.json",
"title": "Zarr Projection Attribute Extension",
"description": "Projection attribute extension for Zarr arrays and groups",
"type": "object",
"definitions": {
"projectionMetadata": {
"type": "object",
"properties": {
"code": {
"type": ["string", "null"],
"description": "Authority:code identifier (e.g., EPSG:4326)",
"pattern": "^[A-Z]+:[0-9]+$"
},
"wkt2": {
"type": ["string", "null"],
"description": "WKT2 (ISO 19162) CRS representation"
},
"projjson": {
"oneOf": [
{
"$ref": "https://proj.org/schemas/v0.7/projjson.schema.json"
},
{
"type": "null"
}
],
"description": "PROJJSON CRS representation"
},
"bbox": {
"type": "array",
"oneOf": [
{
"minItems": 4,
"maxItems": 4
},
{
"minItems": 6,
"maxItems": 6
}
],
"items": {
"type": "number"
},
"description": "Bounding box in CRS coordinates"
},
"transform": {
"type": "array",
"oneOf": [
{
"minItems": 6,
"maxItems": 6
},
{
"minItems": 9,
"maxItems": 9
}
],
"items": {
"type": "number"
},
"description": "Affine transformation coefficients"
},
"spatial_dims": {
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": {
"type": "string"
},
"description": "Names of spatial dimensions [y_name, x_name]"
}
},
"anyOf": [
{
"required": ["code"]
},
{
"required": ["wkt2"]
},
{
"required": ["projjson"]
}
]
}
},
"properties": {
"attributes": {
"type": "object",
"properties": {
"geo:proj": {
"$ref": "#/definitions/projectionMetadata"
}
}
}
}
}