Member

@normanrz normanrz commented Dec 31, 2024

Builds upon #2463

  • Deprecates AsyncArray.create and Array.create in favor of zarr.create_array; both now emit deprecation warnings.
  • Adds AsyncArray._create and Array._create, which now contain the code of the unprefixed methods. New code should not use them, and they should be removed once the other APIs have been deprecated and removed.
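
The deprecation pattern described in the two bullets above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the code from this PR: the toy Array class, its shape argument, and the module-level create_array function are hypothetical stand-ins.

```python
import warnings


class Array:
    """Toy stand-in for an array class; not zarr's actual implementation."""

    def __init__(self, shape):
        self.shape = shape

    @classmethod
    def _create(cls, shape):
        # Private constructor that now holds the real creation logic.
        return cls(shape)

    @classmethod
    def create(cls, shape):
        # Deprecated public entry point: warn, then delegate unchanged.
        warnings.warn(
            "Array.create is deprecated; use create_array instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return cls._create(shape)


def create_array(shape):
    # Hypothetical top-level replacement that callers migrate to.
    return Array._create(shape)
```

Keeping the body in the private method means the deprecated wrapper and the new entry point share a single implementation while the old name is phased out.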

I added 2 new features to make create_array usable for all codec configurations I could think of:

  • Adds a dict-notation to the shards kwarg in create_array to specify the index location, e.g. shards={"shape": (32, 32), "index_location": "start"}
  • Adds an optional array_bytes_codec kwarg to create_array to override the array-to-bytes codec for v3 arrays, e.g. to specify the endianness.
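
To illustrate the dict notation for shards described above, here is a rough sketch of how a function could accept either a plain shape tuple or the dict form. The helper normalize_shards is hypothetical and not part of this PR, and the fallback index location of "end" is an assumption made for the example.

```python
from typing import Literal, Tuple, TypedDict, Union


class ShardsDict(TypedDict, total=False):
    # Dict form from the PR description; index_location is optional here.
    shape: Tuple[int, ...]
    index_location: Literal["start", "end"]


ShardsParam = Union[Tuple[int, ...], ShardsDict]


def normalize_shards(shards: ShardsParam, default_index_location: str = "end"):
    """Return (shard_shape, index_location) for either accepted form.

    Hypothetical helper; the default of "end" is an assumption.
    """
    if isinstance(shards, dict):
        shape = tuple(shards["shape"])
        location = shards.get("index_location", default_index_location)
    else:
        # Plain tuple form: only the shard shape is given.
        shape = tuple(shards)
        location = default_index_location
    if location not in ("start", "end"):
        raise ValueError(f"invalid index_location: {location!r}")
    return shape, location
```

With this sketch, normalize_shards((32, 32)) and normalize_shards({"shape": (32, 32), "index_location": "start"}) both produce a shard shape of (32, 32) and differ only in the index location.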

d-v-b added 30 commits November 4, 2024 22:51
@normanrz normanrz marked this pull request as ready for review January 2, 2025 11:39
@normanrz normanrz requested a review from d-v-b January 2, 2025 11:42
shards: ShardsParam | None = None,
filters: FiltersParam | None = "auto",
compressors: CompressorsParam = "auto",
array_bytes_codec: ArrayBytesCodecParam = "auto",
Contributor

I think this parameter is important, but I think we can do better with the name to express a bit more of what it does, and detach the name from the type (which might change). Unfortunately, the only ideas I have are array_serializer and chunk_serializer, but I'm sure there are better options.

Member Author

What about just codec?

Contributor

To me, codec is a category that includes ArrayBytesCodec and BytesBytesCodec; I think the ideal name would somehow convey that this entity is tasked with unpacking (or packing?) an array into bytes. chunk_<verb>+er seems like a good template, but I'm not sure what verbs fit well here. 🤔 chunk_serializer, chunk_packer, chunk_streamer... chunk_serializer seems the best to me, but it's far from great.
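
As a rough illustration of the distinction drawn in this comment, the stdlib-only sketch below separates an array-to-bytes step (which fixes details such as endianness) from a bytes-to-bytes step (compression). The function names and the uint16 element type are made up for the example; this is not how the codecs in this PR are implemented.

```python
import struct
import zlib


def array_to_bytes(values, endian="little"):
    # Array-to-bytes step: choose a byte order and pack each value as uint16.
    prefix = "<" if endian == "little" else ">"
    return struct.pack(f"{prefix}{len(values)}H", *values)


def bytes_to_bytes(data):
    # Bytes-to-bytes step: operates on the serialized buffer only.
    return zlib.compress(data)


values = [1, 2, 3]
little = array_to_bytes(values, "little")  # b"\x01\x00\x02\x00\x03\x00"
big = array_to_bytes(values, "big")        # b"\x00\x01\x00\x02\x00\x03"
compressed = bytes_to_bytes(little)
```

The same values serialize to different byte sequences depending on the byte order, which is the kind of choice an array-to-bytes override controls; the compression step never sees the array structure.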

Member Author

@normanrz normanrz Jan 2, 2025

I think chunk_serializer is fine.
Maybe even just serializer. That would be more in line with filters and compressors.

Contributor

+1 to dropping chunk_ as long as we don't think people will be confused about the scope

Contributor

@d-v-b d-v-b left a comment

This looks great! But I do think we should try to improve the name of the array_bytes_codec kwarg. See #2607 (comment).

Member

@jhamman jhamman left a comment

lgtm, thanks @normanrz

Member Author

normanrz commented Jan 2, 2025

Merged into #2463.

@normanrz normanrz closed this Jan 2, 2025
@normanrz normanrz deleted the feat/no-array-create branch January 9, 2025 19:30

4 participants