Consistent versioning approaches across Zarr Conventions comprising GeoZarr #102

@maxrjones

Summary

#101 established that the first release of GeoZarr will comprise a set of initial Zarr conventions (e.g., multiscales, spatial:, proj:).

To ensure a coherent ecosystem and reduce implementation burden, we should align on a consistent versioning strategy across all conventions that comprise GeoZarr.

Background

Recent discussions across the zarr-conventions organization have explored versioning approaches for specifications.

Key Considerations

Integer vs. Semantic Versioning

For specifications (as opposed to software libraries), semantic versioning (major.minor.patch) may create unnecessary burden on implementers.

Consistency Across Conventions

If different conventions within GeoZarr use different versioning schemes (e.g., multiscales using v1.0.0 while others use v1 and still others use no versioning at all), this could:

  • Confuse implementers about version compatibility semantics
  • Create inconsistent patterns in zarr metadata
  • Complicate tooling that needs to parse and validate version strings
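To make the tooling point concrete, here is a rough sketch of the normalization a validator would need if conventions mixed the three styles mentioned above. The helper name `parse_version` and the accepted spellings are assumptions for illustration only; none of this comes from an existing convention or library.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns for the styles mentioned above: "v1.0.0" / "1.0.0" and "v1" / "1".
_SEMVER = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")
_INTEGER = re.compile(r"v?(\d+)")


def parse_version(raw: Optional[object]) -> Optional[Tuple[int, int, int]]:
    """Normalize a version value to (major, minor, patch).

    Returns None when the convention declares no version at all.
    """
    if raw is None:
        return None
    if isinstance(raw, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise ValueError(f"unrecognized version value: {raw!r}")
    if isinstance(raw, int):
        return (raw, 0, 0)
    text = str(raw).strip()
    match = _SEMVER.fullmatch(text)
    if match:
        major, minor, patch = (int(part) for part in match.groups())
        return (major, minor, patch)
    match = _INTEGER.fullmatch(text)
    if match:
        return (int(match.group(1)), 0, 0)
    raise ValueError(f"unrecognized version value: {raw!r}")
```

With a single integer scheme, most of these branches would be unnecessary, and readers would not have to guess whether `v1` and `v1.0.0` mean the same thing.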

Implementer Burden

Large geospatial data catalogs are expensive to update. A versioning strategy should:

  • Signal meaningful changes that require action
  • Allow forward compatibility where possible (e.g., new optional fields)
  • Avoid requiring catalog-wide updates for non-breaking changes
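As a sketch of what forward compatibility could look like on the reader side: a reader checks only the fields it requires and ignores fields it does not recognize, so a later optional addition does not force existing data to be rewritten. The field names (`version`, `levels`, `description`) and the function name are hypothetical and not taken from any of the conventions.

```python
from typing import Any, Dict

# Hypothetical field names, for illustration only.
REQUIRED_FIELDS = {"version", "levels"}
KNOWN_OPTIONAL_FIELDS = {"description"}


def read_convention_metadata(attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Return the fields this reader understands, tolerating unknown extras."""
    missing = REQUIRED_FIELDS - attrs.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    known = REQUIRED_FIELDS | KNOWN_OPTIONAL_FIELDS
    # Unknown keys are skipped rather than rejected, so metadata written with a
    # newer, non-breaking revision of the convention still loads.
    return {key: value for key, value in attrs.items() if key in known}
```

For example, `read_convention_metadata({"version": 1, "levels": [0, 1], "added_later": True})` would load without error and simply drop `added_later`.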

Proposal

GeoZarr should establish a recommendation for consistent versioning across its constituent conventions:

  1. Adopt integer versioning (e.g., v1, v2) for all Zarr Conventions comprising GeoZarr, aligning with zarr-conventions-spec and zarr_format
  2. Start at version 1 to signal stability and production-readiness
  3. Increment versions only for breaking changes that require implementation updates
  4. Use GitHub releases for tracking non-breaking updates (documentation, clarifications, optional field additions) without incrementing the convention version
  5. Allow schemas to be updated for releases that do not include breaking changes. This would mean not using tagged release files as the published schema.
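Under this proposal, the compatibility check in a reader could be as small as the following sketch. It assumes, purely for illustration, that each convention stores its version as an integer under a `version` key; the key name, the function name, and the supported set are not defined by any of the conventions.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical: this reader implements only version 1 of some convention.
SUPPORTED_VERSIONS: FrozenSet[int] = frozenset({1})


def check_convention_version(
    attrs: Dict[str, Any],
    supported: FrozenSet[int] = SUPPORTED_VERSIONS,
) -> int:
    """Return the declared integer version, or raise if it cannot be read."""
    version = attrs.get("version")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(version, bool) or not isinstance(version, int):
        raise ValueError(f"expected an integer version, got {version!r}")
    if version not in supported:
        # A new integer signals a breaking change (item 3), so an unknown
        # version means the reader must be updated before it reads the data.
        raise ValueError(f"unsupported convention version: {version}")
    return version
```

Because non-breaking updates do not change the integer, data written before such an update keeps passing this check without any catalog-wide rewrite.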

Alternatives:

  1. Include SemVer, providing recommendations for how it is used (STAC style)
  2. Recommend no versioning, instead stating that breaking changes MUST mean the data do not actually follow the conventions (GeoJSON style)
  3. Use only major + minor versioning in the metadata (CF Style)

Questions for Discussion

  1. Should GeoZarr mandate a versioning approach for its constituent conventions, or provide a recommendation?
  2. What do folks think of this proposal?
