@abarciauskas-bgse abarciauskas-bgse commented Feb 6, 2025

This is still very much a WIP - many tests and implementations still need to be fixed.

A few notes:

  • It was suggested we remove ZArray completely as part of this work, rather than using a conversion function from ZArray to ArrayV3Metadata. So we should be able to remove ZArray as part of this PR.
  • It was suggested not to use zarr's _parse_chunk_encoding_v3 function, since it is private and may change, which is why some of that logic is replicated in convert_to_codec_pipeline.
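
To make the two notes above concrete, here is a rough sketch of what moving from a v2-style `.zarray` dict to a Zarr v3 metadata document involves, including ordering the codecs as array-to-array, then array-to-bytes, then bytes-to-bytes. It uses plain dicts and the standard library only; it is not the code in this PR and does not call zarr-python. The names `zarray_to_v3_metadata_dict`, `numcodecs_to_v3_codec` and `_V3_DATA_TYPES` are hypothetical, and the sketch assumes that every v2 filter is an array-to-array codec (which does not hold for every numcodecs filter), that numcodecs codecs are named with a `numcodecs.` prefix, and that a missing fill value can be replaced by 0.

```python
from typing import Any

# Hypothetical lookup from NumPy-style type codes (as stored in a v2 .zarray)
# to Zarr v3 data type names. Only a handful of types are covered.
_V3_DATA_TYPES = {
    "b1": "bool",
    "i1": "int8", "i2": "int16", "i4": "int32", "i8": "int64",
    "u1": "uint8", "u2": "uint16", "u4": "uint32", "u8": "uint64",
    "f4": "float32", "f8": "float64",
}


def numcodecs_to_v3_codec(config: dict[str, Any]) -> dict[str, Any]:
    """Wrap a numcodecs-style config ({"id": ...}) as a v3 named codec.

    Assumption: the codec is exposed under a "numcodecs." prefix.
    """
    configuration = dict(config)  # copy so the caller's dict is left untouched
    name = "numcodecs." + configuration.pop("id")
    return {"name": name, "configuration": configuration}


def zarray_to_v3_metadata_dict(zarray: dict[str, Any]) -> dict[str, Any]:
    """Build a v3-style array metadata dict from a v2 .zarray dict (sketch)."""
    dtype = zarray["dtype"]
    byte_order, type_code = dtype[0], dtype[1:]
    if type_code not in _V3_DATA_TYPES:
        raise ValueError(f"unsupported dtype in this sketch: {dtype!r}")
    data_type = _V3_DATA_TYPES[type_code]

    # Exactly one array-to-bytes codec; endianness only matters for
    # multi-byte types, so it is omitted when the byte order is "|".
    bytes_codec: dict[str, Any] = {"name": "bytes"}
    if byte_order in ("<", ">"):
        endian = "little" if byte_order == "<" else "big"
        bytes_codec["configuration"] = {"endian": endian}

    # Assumption: v2 filters are array-to-array, the compressor is bytes-to-bytes.
    filters = [numcodecs_to_v3_codec(f) for f in (zarray.get("filters") or [])]
    compressor = zarray.get("compressor")
    compressors = [numcodecs_to_v3_codec(compressor)] if compressor else []

    # v3 requires a fill value; substituting a zero value is this sketch's choice.
    fill_value = zarray.get("fill_value")
    if fill_value is None:
        fill_value = False if data_type == "bool" else 0

    return {
        "zarr_format": 3,
        "node_type": "array",
        "shape": list(zarray["shape"]),
        "data_type": data_type,
        "chunk_grid": {
            "name": "regular",
            "configuration": {"chunk_shape": list(zarray["chunks"])},
        },
        "chunk_key_encoding": {"name": "default", "configuration": {"separator": "/"}},
        "fill_value": fill_value,
        # Pipeline order: array-to-array, array-to-bytes, bytes-to-bytes.
        "codecs": [*filters, bytes_codec, *compressors],
        "attributes": {},
    }
```

For a `.zarray` with dtype `<f4`, a `delta` filter and a `zlib` compressor, the resulting `codecs` list is named `numcodecs.delta`, `bytes`, `numcodecs.zlib`, in that order. A real implementation would need to classify each codec properly instead of relying on the filter/compressor split.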

Checklist

  • Closes ManifestArray should use zarr-python's ArrayV3Metadata #424
  • Manifest tests passing
  • Library (codecs, etc) tests passing
  • Reader tests passing
  • test_integration tests passing
  • test_xarray tests passing
  • Writer tests passing
  • Cleanup dead code
  • Consider reorganizing codecs and zarr modules
  • Full type hint coverage
  • Tests added for new functions
  • Changes are documented in docs/releases.rst
  • New functions/methods are listed in api.rst
  • New functionality has documentation

@TomNicholas TomNicholas added zarr-python Relevant to zarr-python upstream internals labels Feb 6, 2025
A list of xarray variables.
"""
# This chunk determination logic mirrors zarr-python's create
# https://github.com/zarr-developers/zarr-python/blob/main/zarr/creation.py#L62-L66
Collaborator Author

I removed this comment because I think the reference is from a previous version of zarr-python - @sharkinsspatial do you know if we can and should include an updated link?

@abarciauskas-bgse
Collaborator Author

@TomNicholas this PR is ready for re-review and 🤞🏽 hopefully good to merge (to zarr-python-3.0). There are a few outstanding comments that are mostly related to creating new issues:

And then I might also create a ticket to test this branch against all (or most) current examples.

Member

@TomNicholas TomNicholas left a comment

Thank you @abarciauskas-bgse ! Let's just get this merged into a stable branch so that others can build on it.

@abarciauskas-bgse abarciauskas-bgse merged commit e9a4cce into zarr-python-3.0 Feb 18, 2025
2 checks passed
@norlandrhagen
Collaborator

Awesome work @abarciauskas-bgse!

@abarciauskas-bgse abarciauskas-bgse deleted the manifest-arrays-use-arrayv3metadata branch March 3, 2025 22:14
maxrjones pushed a commit that referenced this pull request Mar 7, 2025
* Manifest arrays use arrayv3metadata (#429)

* Added zarray_to_v3metadata and test

* Working on manifest array tests

* Fix test_manifests/test_array#TestConcat tests

* Passing TestStack tests and add fixture

* All test_manifests/test_array tests passing

* Compressors should be list

* Passing dmrpp tests

* Passing test_hdf.py tests

* Start to work on kerchunk tests

* Add method to convert array v3 metadata to v2 metadata for kerchunk (not happy about this)

* Fix fixtures and mark xfail netcdf3

* Test for convert_v3_to_v2_metadata

* Deduplicate fixture for array v3 metadata

* Parse filters and compressors from v3 metadata for v2 metadata

* Rewrite extract_codecs

* Refactor convert_to_codec_pipeline

* Fix hdf integration tests

* Test for convert_to_codec_pipeline

* Refactor get_codecs and its tests

* Fix most integration tests and writer tests

* Fix xarray tests

* Working on integration tests

* Add expected type

* Mark datetime tests xfail

* Upgrade xarray for tests

* xfail some unsupported zarr-python 3 data types

* Require zarr

* Remove zarr dep

* import zarr, explicit dependency

Co-authored-by: Tom Nicholas <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add zarr as a dependency

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Min numcodecs version

* numcodecs>=0.15.1 in environment and upstream.yml conda env files

* Working on mypy errors

* Fix mypy errors and tests

* Remove ZArray class

* Just return metadata's shape

* Create update metadata function

* Fix typing for update_metadata

* Check for regular chunk grid in manifest instantiation

* Remove obsolete codecs code

* Fix chunks function and add docstring

* Remove custom zattrs type

* Move some imports and make update_metadata a private method

* Remove zarr.py

* Add zarr to other ci env files

* Fixture array_v3_metadata uses array_v3_metadata_dict

* No need for union type for CodecPipeline

* Use type alias

* Add comment

* Update virtualizarr/manifests/array_api.py

Co-authored-by: Tom Nicholas <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revised copy_and_replace_metadata to be in utils and called correctly

* Update virtualizarr/translators/kerchunk.py

Co-authored-by: Tom Nicholas <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor create v3 array metadata

* Rename to create_v3_array_metadata

* Fix some codecs fixtures

* Use global vars and simple fixture for creating codec pipelines

* Remove redundant create_codec_pipeline fixture

* Fix docstring

* Use create_v3_array_metadata in from_kerchunk_refs

* Add links to zarr-python 3.0 issues for big endian, datetime and timedelta data types

* Reorganize conftest

* Remove obsolete comment

* Rename function numcodec_config_to_configurable

* Fix parameters in docstring for convert_to_codec_pipeline

* Revert change to pytest mark skipif for astropy

* Remove commented arguments

* Add classes to test_codecs and make zarr_array a fixture

* Add tests for extract_codecs

* Add test for get_codec_config

* Remove obsolete comment

* Add test for copy_and_replace_metadata

* Add release notes

* Attempt to fix rst links

* Move convert_v3_to_v2_metadata to utils

---------

Co-authored-by: Tom Nicholas <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
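
Two of the commits listed above ("Check for regular chunk grid in manifest instantiation" and "Fix chunks function and add docstring") concern reading the chunk shape off the array metadata. A minimal sketch of that kind of check follows. It assumes a plain dict in the shape of a Zarr v3 array metadata document rather than zarr-python's `ArrayV3Metadata` object; the function name `chunk_shape_from_v3_metadata` is hypothetical and this is not the code merged here.

```python
from typing import Any


def chunk_shape_from_v3_metadata(metadata: dict[str, Any]) -> tuple[int, ...]:
    """Return the chunk shape, rejecting anything but a regular chunk grid."""
    chunk_grid = metadata["chunk_grid"]
    grid_name = chunk_grid.get("name")
    if grid_name != "regular":
        # A manifest of chunk references only makes sense here if every
        # chunk has the same shape, so other grid types are refused.
        raise ValueError(f"only regular chunk grids are supported, got {grid_name!r}")
    return tuple(chunk_grid["configuration"]["chunk_shape"])
```

Failing early on a non-regular grid keeps later code free to assume a single chunk shape per array.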
Development

Successfully merging this pull request may close these issues.

ManifestArray should use zarr-python's ArrayV3Metadata

6 participants