Merged
Changes from 10 commits
Commits
111 commits
fd6ecd1
add functions for easy read-only data access
d-v-b Nov 4, 2024
fa343f5
sync funcs
d-v-b Nov 4, 2024
d95eba8
make read-only funcs top-level exports
d-v-b Nov 4, 2024
f6765bc
Merge branch 'main' into feat/read-funcs
d-v-b Nov 5, 2024
5d8445b
add create_array, create_group, and tests
d-v-b Nov 5, 2024
9526571
add top-level imports
d-v-b Nov 5, 2024
90bf421
Merge branch 'feat/read-funcs' of github.com:d-v-b/zarr-python into f…
d-v-b Nov 5, 2024
de280a7
add test for top-level exports
d-v-b Nov 5, 2024
d9878cf
add test for read
d-v-b Nov 5, 2024
e5217ce
add asserts
d-v-b Nov 5, 2024
40cc7af
Apply suggestions from code review
d-v-b Nov 5, 2024
d7ce58b
Merge branch 'main' into feat/read-funcs
d-v-b Nov 5, 2024
a0dfe18
Merge branch 'main' into feat/read-funcs
d-v-b Nov 5, 2024
98bc328
Merge branch 'main' into feat/read-funcs
d-v-b Nov 12, 2024
16f5cc2
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b Nov 29, 2024
4b45ebf
handle sharding in create_array
d-v-b Dec 10, 2024
750a439
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b Dec 10, 2024
7a5cbe7
tweak
d-v-b Dec 10, 2024
215ff96
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b Dec 18, 2024
489e2a2
make logic of _auto_partition better for shard shape
d-v-b Dec 18, 2024
05dd0d8
add dtype parsing, and tweak auto_partitioning func
d-v-b Dec 18, 2024
3fbfc21
sketch of docstring; remove auto chunks / shard shape
d-v-b Dec 19, 2024
b348737
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b Dec 19, 2024
5025ad6
tweak docstring
d-v-b Dec 19, 2024
e204a32
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b Dec 20, 2024
68465db
docstrings
d-v-b Dec 20, 2024
d7bb121
ensure tests pass
d-v-b Dec 20, 2024
99cc8f5
tuple -> list
d-v-b Dec 20, 2024
a39457f
allow data in create_array
d-v-b Dec 20, 2024
3f0a3e0
docstring
d-v-b Dec 20, 2024
26ced00
remove auto_partition
d-v-b Dec 20, 2024
af55ac4
make shape shapelike
d-v-b Dec 20, 2024
07f07ea
use create_array everywhere in group class
d-v-b Dec 20, 2024
bc552ce
remove readers
d-v-b Dec 20, 2024
74f731a
fix dodgy imports
d-v-b Dec 20, 2024
43877c0
compressors -> compression, auto chunking, auto sharding, auto compre…
d-v-b Dec 21, 2024
c693fb4
use sane shard shape when there are too few chunks
d-v-b Dec 21, 2024
4c18aaa
Merge branch 'main' into feat/read-funcs
jhamman Dec 22, 2024
dba2594
fix: allow user-specified filters and compression
d-v-b Dec 22, 2024
669ad72
np.dtype[np.generic] -> np.dtype[Any]
d-v-b Dec 22, 2024
ae1832d
handle singleton compressor / filters input
d-v-b Dec 22, 2024
5cb6dd8
default codec config now uses the full config dict
normanrz Dec 22, 2024
5dcd80b
test for auto sharding
d-v-b Dec 23, 2024
810ff9b
Merge branch 'feat/default-codecs' into feat/read-funcs
normanrz Dec 25, 2024
eab46a2
test
normanrz Dec 25, 2024
bcdc4cc
adds a shards property
normanrz Dec 25, 2024
4e978f9
add (typed) functions for resolving codecs
d-v-b Dec 26, 2024
a9850bf
better codec parsing
d-v-b Dec 26, 2024
2747d69
add warning if auto sharding is used
d-v-b Dec 26, 2024
023c16b
remove read_array
d-v-b Dec 26, 2024
de2c36e
rename compression to compressors, and make the docstring for create_…
d-v-b Dec 26, 2024
74d31ef
compression -> compressors, shard_shape -> shards, chunk_shape -> chunks
d-v-b Dec 26, 2024
470b60f
use typerror instead of valuerror; docstring
d-v-b Dec 27, 2024
e8b1ad1
default order is None
d-v-b Dec 27, 2024
b919483
Merge branch 'feat/chunks-shards' into feat/read-funcs
normanrz Dec 27, 2024
6fcd976
fix circular dep
normanrz Dec 27, 2024
d9c30a3
format
normanrz Dec 27, 2024
0bf4dd0
fix some tests
normanrz Dec 27, 2024
ea3ed0e
use filters=auto and compressors=auto in Group.create_array
normanrz Dec 27, 2024
54fd920
compression -> compressors
normanrz Dec 27, 2024
a4ba7db
Update src/zarr/core/group.py
d-v-b Dec 28, 2024
fb286a7
fix mypy
normanrz Dec 28, 2024
df35d13
narrow type of filters param and compression param
d-v-b Dec 28, 2024
80b5a10
Merge branch 'feat/read-funcs' of github.com:d-v-b/zarr-python into f…
d-v-b Dec 28, 2024
77f40a5
remove data kwarg to create_array
d-v-b Dec 28, 2024
235e246
mypy fixes
normanrz Dec 28, 2024
95348d6
ensure that we accept dict form of compressor in _parse_chunk_encodin…
d-v-b Dec 28, 2024
91a7916
Merge branch 'feat/read-funcs' of github.com:d-v-b/zarr-python into f…
d-v-b Dec 28, 2024
665037e
fix properties test
normanrz Dec 28, 2024
ae76bb3
Merge branch 'feat/read-funcs' of github.com:d-v-b/zarr-python into f…
normanrz Dec 28, 2024
0a983e6
add tests for compressors and filters kwargs to create_array
d-v-b Dec 28, 2024
2182793
add tests for codec inference
d-v-b Dec 28, 2024
c04d7cf
add test for illegal shards kwarg for v2 arrays
d-v-b Dec 28, 2024
144b2b7
remove redundant test function
d-v-b Dec 28, 2024
d407e5d
tests and types
normanrz Dec 29, 2024
1301c5f
rm print
normanrz Dec 29, 2024
31b3ad4
types
normanrz Dec 29, 2024
a0c1c95
merge
normanrz Dec 29, 2024
43b6774
resolve cyclic import
normanrz Dec 29, 2024
e55023a
add create_array to async and sync API
normanrz Dec 30, 2024
e24bdeb
docs for create_array
normanrz Dec 30, 2024
b564ae6
rename (Async)Array.create to _create
normanrz Dec 31, 2024
75b2197
adds array_bytes_codec kwarg
normanrz Dec 31, 2024
2f6f8a0
tests
normanrz Dec 31, 2024
c4330ef
tests for no filters+compressors
normanrz Dec 31, 2024
f926a5a
Merge branch 'main' into feat/read-funcs
d-v-b Jan 1, 2025
95ffadd
widen type of FiltersParam to include single numcodecs codec instances
d-v-b Jan 1, 2025
bbe3a94
don't alias None to default codecs in _create_v2
d-v-b Jan 1, 2025
856b40f
allow single codec instances for filters, and None for filters / comp…
d-v-b Jan 1, 2025
2aa3acc
add docstring for None
normanrz Jan 2, 2025
9fb8a33
single-item tuple for compressors in v2
normanrz Jan 2, 2025
99faa8e
Update src/zarr/core/array.py
normanrz Jan 2, 2025
108daa0
Merge branch 'main' into feat/read-funcs
normanrz Jan 2, 2025
305fdb7
merge
normanrz Jan 2, 2025
947f20e
tweaks
normanrz Jan 2, 2025
2d2af8f
Merge branch 'main' into feat/no-array-create
normanrz Jan 2, 2025
14c45cd
pr feedback 1
normanrz Jan 2, 2025
ff5e5cb
Merge branch 'feat/no-array-create' of github.com:zarr-developers/zar…
normanrz Jan 2, 2025
2afe940
tests
normanrz Jan 2, 2025
e3f1f33
mypy
normanrz Jan 2, 2025
1643983
rename array_bytes_codec to serializer
normanrz Jan 2, 2025
aad8e9d
Update src/zarr/api/asynchronous.py
d-v-b Jan 2, 2025
f29b2d9
docstrings
d-v-b Jan 2, 2025
4654cbd
*params -> *like
d-v-b Jan 2, 2025
5cdb515
*params -> *like, in tests
d-v-b Jan 2, 2025
0dc7dc6
merge
normanrz Jan 2, 2025
c5c761e
Merge remote-tracking branch 'origin/main' into feat/read-funcs
normanrz Jan 2, 2025
be60d73
adds deprecated compressor arg to Group.create_array
normanrz Jan 2, 2025
ae1aa2a
Merge remote-tracking branch 'origin/main' into feat/read-funcs
normanrz Jan 2, 2025
0a8b91c
docs
normanrz Jan 2, 2025
315ba88
Merge branch 'main' into feat/read-funcs
jhamman Jan 2, 2025
10 changes: 10 additions & 0 deletions src/zarr/__init__.py
@@ -6,6 +6,8 @@
     copy_all,
     copy_store,
     create,
+    create_array,
+    create_group,
     empty,
     empty_like,
     full,
@@ -19,6 +21,9 @@
     open_consolidated,
     open_group,
     open_like,
+    read,
+    read_array,
+    read_group,
     save,
     save_array,
     save_group,
@@ -46,6 +51,9 @@
     "copy_all",
     "copy_store",
     "create",
+    "create_array",
+    "create_group",
+    "read_array",
     "empty",
     "empty_like",
     "full",
@@ -55,9 +63,11 @@
     "ones",
     "ones_like",
     "open",
+    "read",
     "open_array",
     "open_consolidated",
     "open_group",
+    "read_group",
     "open_like",
     "save",
     "save_array",
311 changes: 309 additions & 2 deletions src/zarr/api/asynchronous.py
@@ -275,8 +275,8 @@ async def open(
     path : str or None, optional
         The path within the store to open.
     storage_options : dict
-        If using an fsspec URL to create the store, these will be passed to
-        the backend implementation. Ignored otherwise.
+        If the store is backed by an fsspec-based implementation, then this dict will be passed to
+        the Store constructor for that implementation. Ignored otherwise.
     **kwargs
         Additional parameters are passed through to :func:`zarr.creation.open_array` or
         :func:`zarr.hierarchy.open_group`.
@@ -313,6 +313,47 @@ async def open(
return await open_group(store=store_path, zarr_format=zarr_format, **kwargs)


+async def read(
+    *,
+    store: StoreLike | None = None,
+    zarr_format: ZarrFormat | None = None,
+    path: str | None = None,
+    storage_options: dict[str, Any] | None = None,
+    **kwargs: Any,
+) -> AsyncArray[ArrayV2Metadata] | AsyncArray[ArrayV3Metadata] | AsyncGroup:
+    """Convenience function to open a group or array for reading. This function
+    wraps :func:`zarr.api.asynchronous.open`. See the documentation of that function for details.
+
+    Parameters
+    ----------
+    store : Store or str, optional
+        Store or path to directory in file system or name of zip file.
+    zarr_format : {2, 3, None}, optional
+        The zarr format to require. The default value of None will first look for Zarr v3 data,
+        then Zarr v2 data, then fail if neither format is found.
+    path : str or None, optional
+        The path within the store to open.
+    storage_options : dict, optional
+        If using an fsspec URL to create the store, this will be passed to
+        the backend implementation. Ignored otherwise.
+    **kwargs
+        Additional parameters are passed through to :func:`zarr.api.asynchronous.open`.
+
+    Returns
+    -------
+    z : array or group
+        Return type depends on what exists in the given store.
+    """
+    return await open(
+        store=store,
+        mode="r",
+        zarr_format=zarr_format,
+        path=path,
+        storage_options=storage_options,
+        **kwargs,
+    )
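The `read` function above is a thin wrapper: it pins `mode="r"` and forwards every other argument to `open`. A minimal sketch of that pattern follows. It does not use zarr; `open_node` and `read_node` are hypothetical stand-ins, and the dict-backed store is an assumption made so the example runs with the standard library alone.

```python
from __future__ import annotations

import asyncio
from typing import Any


async def open_node(*, store: dict[str, Any], mode: str = "a", path: str = "") -> dict[str, Any]:
    """Hypothetical stand-in for an `open` function; this is not the zarr API."""
    if mode == "r" and path not in store:
        # A read-only open must not create anything that is missing.
        raise FileNotFoundError(path)
    # In other modes, a missing node is created on the fly.
    node = store.setdefault(path, {"kind": "group"})
    return {"node": node, "mode": mode}


async def read_node(*, store: dict[str, Any], path: str = "") -> dict[str, Any]:
    """Thin wrapper that pins the mode to read-only and forwards the rest."""
    return await open_node(store=store, mode="r", path=path)


store = {"a": {"kind": "array"}}
opened = asyncio.run(read_node(store=store, path="a"))
print(opened["mode"])
```

Keeping the wrapper this small means the read-only variant cannot drift from the general function; the only behavior it owns is the fixed mode.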


 async def open_consolidated(
     *args: Any, use_consolidated: Literal[True] = True, **kwargs: Any
 ) -> AsyncGroup:
@@ -615,6 +656,54 @@ async def group(
)


+async def create_group(
+    *,
+    store: StoreLike,
+    path: str | None = None,
+    overwrite: bool = False,
+    zarr_format: ZarrFormat | None = None,
+    attributes: dict[str, Any] | None = None,
+    storage_options: dict[str, Any] | None = None,
+) -> AsyncGroup:
+    """Create a group.
+
+    Parameters
+    ----------
+    store : Store or str
+        Store or path to directory in file system.
+    path : str, optional
+        Group path within store.
+    overwrite : bool, optional
+        If True, pre-existing data at ``path`` will be deleted before
+        creating the group.
+    zarr_format : {2, 3, None}, optional
+        The zarr format to use when saving.
+    storage_options : dict
+        If using an fsspec URL to create the store, these will be passed to
+        the backend implementation. Ignored otherwise.
+
+    Returns
+    -------
+    g : group
+        The new group.
+    """
+
+    if zarr_format is None:
+        zarr_format = _default_zarr_version()
+
+    # TODO: fix this when modes make sense. It should be `w` for overwriting, `w-` otherwise
+    mode: Literal["a"] = "a"
Member: What isn't working here?

Contributor Author: I removed the TODO, as it's from an era before some store mode refactoring.

Member: @d-v-b Did you push your change?


+    store_path = await make_store_path(store, path=path, mode=mode, storage_options=storage_options)
+
+    return await AsyncGroup.from_store(
+        store=store_path,
+        zarr_format=zarr_format,
+        exists_ok=overwrite,
+        attributes=attributes,
+    )
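The TODO comment in `create_group` names the modes it has in mind: `w` when overwriting and `w-` otherwise. As a sketch only, that mapping could look like the following. `mode_for` is a hypothetical helper, not something this PR adds, and the review thread suggests the TODO itself may be out of date.

```python
from typing import Literal

CreationMode = Literal["w", "w-"]


def mode_for(overwrite: bool) -> CreationMode:
    """Pick a creation mode from an `overwrite` flag (hypothetical helper).

    "w" replaces whatever already exists; "w-" fails if anything exists.
    """
    return "w" if overwrite else "w-"


print(mode_for(True), mode_for(False))
```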


 async def open_group(
     store: StoreLike | None = None,
     *,  # Note: this is a change from v2
@@ -721,6 +810,188 @@ async def open_group(
)


+async def read_group(
+    store: StoreLike,
+    *,
+    path: str | None = None,
+    zarr_format: ZarrFormat | None = None,
+    storage_options: dict[str, Any] | None = None,
+    use_consolidated: bool | str | None = None,
+) -> AsyncGroup:
+    """Open a group for reading. This function wraps :func:`zarr.api.asynchronous.open_group`. See
+    the documentation of that function for details.
+
+    Parameters
+    ----------
+    store : Store, str, or mapping
+        Store or path to directory in file system or name of zip file.
+
+        Strings are interpreted as paths on the local file system
+        and used as the ``root`` argument to :class:`zarr.store.LocalStore`.
+
+        Dictionaries are used as the ``store_dict`` argument in
+        :class:`zarr.store.MemoryStore`.
+    path : str, optional
+        Group path within store.
+    zarr_format : {2, 3, None}, optional
+        The zarr format to require. The default value of None will first look for Zarr v3 data,
+        then Zarr v2 data, then fail if neither format is found.
+    storage_options : dict
+        If the store is backed by an fsspec-based implementation, then this dict will be passed to
+        the Store constructor for that implementation. Ignored otherwise.
+    use_consolidated : bool or str, default None
+        Whether to use consolidated metadata.
+
+        By default, consolidated metadata is used if it's present in the
+        store (in the ``zarr.json`` for Zarr v3 and in the ``.zmetadata`` file
+        for Zarr v2).
+
+        To explicitly require consolidated metadata, set ``use_consolidated=True``,
+        which will raise an exception if consolidated metadata is not found.
+
+        To explicitly *not* use consolidated metadata, set ``use_consolidated=False``,
+        which will fall back to using the regular, non-consolidated metadata.
+
+        Zarr v2 allowed configuring the key storing the consolidated metadata
+        (``.zmetadata`` by default). Specify the custom key as ``use_consolidated``
+        to load consolidated metadata from a non-default key.
+
+    Returns
+    -------
+    g : group
+        The opened group.
+    """
+    return await open_group(
+        store=store,
+        mode="r",
+        path=path,
+        storage_options=storage_options,
+        zarr_format=zarr_format,
+        use_consolidated=use_consolidated,
+    )
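The `use_consolidated` docstring above describes four cases: `None`, `True`, `False`, and a custom key. The sketch below models only the Zarr v2 side of that description over a plain mapping of keys. `pick_consolidated_key` is a hypothetical name, and raising when a custom key is missing is an assumption; the docstring only states that `True` raises.

```python
from __future__ import annotations

from typing import Mapping

DEFAULT_KEY = ".zmetadata"  # Zarr v2 default key named in the docstring


def pick_consolidated_key(
    keys: Mapping[str, bytes], use_consolidated: bool | str | None = None
) -> str | None:
    """Return the key to load consolidated metadata from, or None to skip it."""
    if use_consolidated is False:
        return None  # explicit opt-out: fall back to the regular metadata
    # A string selects a custom key; True and None both use the default key.
    key = use_consolidated if isinstance(use_consolidated, str) else DEFAULT_KEY
    if key in keys:
        return key
    if use_consolidated is None:
        return None  # default: use consolidated metadata only when present
    # True (or, by assumption, a custom key): it was explicitly required.
    raise ValueError(f"consolidated metadata not found at {key!r}")


print(pick_consolidated_key({".zmetadata": b"{}"}))
```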


+async def create_array(
+    store: str | StoreLike,
+    *,
+    shape: ChunkCoords,
+    chunks: ChunkCoords | None = None,  # TODO: v2 allowed chunks=True
+    dtype: npt.DTypeLike | None = None,
+    compressor: dict[str, JSON] | None = None,  # TODO: default and type change
+    fill_value: Any | None = 0,  # TODO: need type
+    order: MemoryOrder | None = None,
+    overwrite: bool = False,
+    path: PathLike | None = None,
+    filters: list[dict[str, JSON]] | None = None,  # TODO: type has changed
+    dimension_separator: Literal[".", "/"] | None = None,
+    zarr_format: ZarrFormat | None = None,
+    attributes: dict[str, JSON] | None = None,
+    # v3 only
+    chunk_shape: ChunkCoords | None = None,
+    chunk_key_encoding: (
+        ChunkKeyEncoding
+        | tuple[Literal["default"], Literal[".", "/"]]
+        | tuple[Literal["v2"], Literal[".", "/"]]
+        | None
+    ) = None,
+    codecs: Iterable[Codec | dict[str, JSON]] | None = None,
+    dimension_names: Iterable[str] | None = None,
+    storage_options: dict[str, Any] | None = None,
+    **kwargs: Any,
+) -> AsyncArray[ArrayV2Metadata] | AsyncArray[ArrayV3Metadata]:
+    """Create an array.
+
+    Parameters
+    ----------
+    shape : int or tuple of ints
+        Array shape.
+    chunks : int or tuple of ints, optional
+        Chunk shape. If True, will be guessed from `shape` and `dtype`. If
+        False, will be set to `shape`, i.e., single chunk for the whole array.
+        If an int, the chunk size in each dimension will be given by the value
+        of `chunks`. Default is True.
+    dtype : str or dtype, optional
+        NumPy dtype.
+    compressor : Codec, optional
+        Primary compressor.
+    fill_value : object
+        Default value to use for uninitialized portions of the array.
+    order : {'C', 'F'}, optional
+        Memory layout to be used within each chunk.
+        Default is set in Zarr's config (`array.order`).
+    store : Store or str
+        Store or path to directory in file system or name of zip file.
+    overwrite : bool, optional
+        If True, delete all pre-existing data in `store` at `path` before
+        creating the array.
+    path : str, optional
+        Path under which array is stored.
+    filters : sequence of Codecs, optional
+        Sequence of filters to use to encode chunk data prior to compression.
+    dimension_separator : {'.', '/'}, optional
+        Separator placed between the dimensions of a chunk.
+    zarr_format : {2, 3, None}, optional
+        The zarr format to use when saving.
+    storage_options : dict
+        If using an fsspec URL to create the store, these will be passed to
+        the backend implementation. Ignored otherwise.
+
+    Returns
+    -------
+    z : array
+        The array.
+    """
+
+    if zarr_format is None:
+        zarr_format = _default_zarr_version()
+
+    if zarr_format == 2 and chunks is None:
+        chunks = shape
+    elif zarr_format == 3 and chunk_shape is None:
+        if chunks is not None:
+            chunk_shape = chunks
+            chunks = None
+        else:
+            chunk_shape = shape
+
+    if dimension_separator is not None:
+        if zarr_format == 3:
+            raise ValueError(
+                "dimension_separator is not supported for zarr format 3, use chunk_key_encoding instead"
+            )
+        else:
+            warnings.warn(
+                "dimension_separator is not yet implemented",
+                RuntimeWarning,
+                stacklevel=2,
+            )
+
+    # TODO: fix this when modes make sense. It should be `w` for overwriting, `w-` otherwise
+    mode: Literal["a"] = "a"
+
+    store_path = await make_store_path(store, path=path, mode=mode, storage_options=storage_options)
+
+    return await AsyncArray.create(
+        store_path,
+        shape=shape,
+        chunks=chunks,
+        dtype=dtype,
+        compressor=compressor,
+        fill_value=fill_value,
+        exists_ok=overwrite,
+        filters=filters,
+        dimension_separator=dimension_separator,
+        zarr_format=zarr_format,
+        chunk_shape=chunk_shape,
+        chunk_key_encoding=chunk_key_encoding,
+        codecs=codecs,
+        dimension_names=dimension_names,
+        attributes=attributes,
+        order=order,
+        **kwargs,
+    )
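The defaulting rules that `create_array` applies to `chunks` and `chunk_shape` can be isolated as a pure function, which makes the four cases easier to check. This is a sketch that mirrors the branch logic in the hunk above; `resolve_chunks` is a hypothetical helper name and is not part of this PR.

```python
from __future__ import annotations


def resolve_chunks(
    shape: tuple[int, ...],
    zarr_format: int,
    chunks: tuple[int, ...] | None = None,
    chunk_shape: tuple[int, ...] | None = None,
) -> tuple[tuple[int, ...] | None, tuple[int, ...] | None]:
    """Return (chunks, chunk_shape) after applying the defaulting rules."""
    if zarr_format == 2 and chunks is None:
        # v2 with nothing specified: one chunk spanning the whole array.
        chunks = shape
    elif zarr_format == 3 and chunk_shape is None:
        if chunks is not None:
            # v3 accepts the v2-style `chunks` argument as an alias.
            chunk_shape, chunks = chunks, None
        else:
            # v3 with nothing specified: one chunk spanning the whole array.
            chunk_shape = shape
    return chunks, chunk_shape


print(resolve_chunks((10, 10), 2))
print(resolve_chunks((10, 10), 3, chunks=(5, 5)))
```

When both `chunks` and `chunk_shape` are given for format 3, neither branch fires and both values pass through unchanged, so any conflict between them is left for the callee to handle.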


 async def create(
     shape: ChunkCoords,
     *,  # Note: this is a change from v2
@@ -904,6 +1175,41 @@ async def create(
)


+async def read_array(
+    store: StoreLike,
+    *,
+    path: str | None = None,
+    zarr_format: ZarrFormat | None = None,
+    storage_options: dict[str, Any] | None = None,
+) -> AsyncArray[ArrayV3Metadata] | AsyncArray[ArrayV2Metadata]:
+    """Open an array for reading. Wraps :meth:`AsyncArray.open`.
+    See the documentation of that method for details.
+
+    Parameters
+    ----------
+    store : Store or str
+        Store or path to directory in file system or name of zip file.
+    path : str, optional
+        Path under which the array is stored.
+    zarr_format : {2, 3, None}, optional
+        The zarr format to require. The default value of ``None`` will first look for Zarr v3 data,
+        then Zarr v2 data, then fail if neither format is found.
+    storage_options : dict
+        If using an fsspec URL to create the store, these will be passed to
+        the backend implementation. Ignored otherwise.
+
+    Returns
+    -------
+    z : array
+        The array.
+    """
+    store_path = await make_store_path(store, path=path, mode="r", storage_options=storage_options)
+    return await AsyncArray.open(
+        store=store_path,
+        zarr_format=zarr_format,
+    )
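The `zarr_format` docstring above says that `None` looks for Zarr v3 data first, then Zarr v2 data, and fails if neither is found. A sketch of that lookup order for an array, over a plain mapping of keys, is shown below. `detect_array_format` is a hypothetical helper; the only facts it relies on are the metadata document names (`zarr.json` for v3, `.zarray` for v2).

```python
from __future__ import annotations

from typing import Mapping

# Metadata document that marks an array, per format.
ARRAY_METADATA_KEY = {3: "zarr.json", 2: ".zarray"}


def detect_array_format(keys: Mapping[str, bytes], zarr_format: int | None = None) -> int:
    """Return the format whose array metadata is present, trying v3 before v2."""
    # None means "try both, newest first"; otherwise only the requested format.
    candidates = (3, 2) if zarr_format is None else (zarr_format,)
    for fmt in candidates:
        if ARRAY_METADATA_KEY[fmt] in keys:
            return fmt
    raise FileNotFoundError("no array metadata found for the requested format")


print(detect_array_format({".zarray": b"{}"}))
```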


 async def empty(
     shape: ChunkCoords, **kwargs: Any
 ) -> AsyncArray[ArrayV2Metadata] | AsyncArray[ArrayV3Metadata]:
@@ -1081,6 +1387,7 @@ async def open_array(
                 store=store_path,
                 zarr_format=zarr_format or _default_zarr_version(),
                 overwrite=store_path.store.mode.overwrite,
+                storage_options=storage_options,
                 **kwargs,
             )
         raise