Commit 7885fbd

Author: Martin Durant
Merge branch 'master' into fsspec
2 parents 8419dfd + 5bf61ef

File tree: 6 files changed, +33 −20 lines

docs/index.rst

Lines changed: 5 additions & 5 deletions
@@ -4,8 +4,8 @@
 Zarr
 ====
 
-Zarr is a Python package providing an implementation of chunked,
-compressed, N-dimensional arrays.
+Zarr is a format for the storage of chunked, compressed, N-dimensional arrays.
+These documents describe the Zarr format and its Python implementation.
 
 Highlights
 ----------
@@ -40,16 +40,16 @@ Install Zarr from PyPI::
 Alternatively, install Zarr via conda::
 
     $ conda install -c conda-forge zarr
-
-To install the latest development version of Zarr, you can use pip with the
+
+To install the latest development version of Zarr, you can use pip with the
 latest GitHub master::
 
    $ pip install git+https://github.com/zarr-developers/zarr-python.git
 
 To work with Zarr source code in development, install from GitHub::
 
    $ git clone --recursive https://github.com/zarr-developers/zarr-python.git
-    $ cd zarr
+    $ cd zarr-python
     $ python setup.py install
 
 To verify that Zarr has been fully installed, run the test suite::

docs/release.rst

Lines changed: 7 additions & 1 deletion
@@ -7,12 +7,18 @@ Next release
 
 * Fix minor bug in `N5Store`.
   By :user:`gsakkis`, :issue:`550`.
+
 * Improve error message in Jupyter when trying to use the ``ipytree`` widget
   without ``ipytree`` installed.
-  By :user:`Zain Patel <mzjp2>; :issue:`537`
+  By :user:`Zain Patel <mzjp2>`; :issue:`537`
+
 * Explicitly close stores during testing.
   By :user:`Elliott Sales de Andrade <QuLogic>`; :issue:`442`
 
+* Improve consistency of terminology regarding arrays and datasets in the
+  documentation.
+  By :user:`Josh Moore <joshmoore>`; :issue:`571`.
+
 
 .. _release_2.4.0:

docs/tutorial.rst

Lines changed: 7 additions & 7 deletions
@@ -863,13 +863,13 @@ Consolidating metadata
 
 Since there is a significant overhead for every connection to a cloud object
 store such as S3, the pattern described in the previous section may incur
-significant latency while scanning the metadata of the dataset hierarchy, even
+significant latency while scanning the metadata of the array hierarchy, even
 though each individual metadata object is small. For cases such as these, once
 the data are static and can be regarded as read-only, at least for the
-metadata/structure of the dataset hierarchy, the many metadata objects can be
+metadata/structure of the array hierarchy, the many metadata objects can be
 consolidated into a single one via
 :func:`zarr.convenience.consolidate_metadata`. Doing this can greatly increase
-the speed of reading the dataset metadata, e.g.::
+the speed of reading the array metadata, e.g.::
 
     >>> zarr.consolidate_metadata(store)  # doctest: +SKIP
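The consolidation idea in the paragraph above can be sketched without zarr itself: gather every small metadata object in a store into one JSON document stored under a single key, so later readers need only one request. The key names follow zarr's conventions, but this `consolidate_metadata` is a hypothetical stand-in for illustration, not zarr's implementation.

```python
import json

def consolidate_metadata(store, metadata_key='.zmetadata'):
    """Gather every metadata object (keys ending in .zarray/.zgroup/.zattrs)
    into a single JSON document stored under one key."""
    meta = {k: json.loads(store[k]) for k in store
            if k.rsplit('/', 1)[-1] in ('.zarray', '.zgroup', '.zattrs')}
    store[metadata_key] = json.dumps(
        {'zarr_consolidated_format': 1, 'metadata': meta}).encode()
    return store

# toy store: a root group, one array, and one chunk of data
store = {
    '.zgroup': b'{"zarr_format": 2}',
    'foo/.zarray': b'{"shape": [10], "chunks": [5]}',
    'foo/0': b'\x00' * 40,  # chunk data is not metadata, so it is skipped
}
consolidate_metadata(store)
doc = json.loads(store['.zmetadata'])  # one read yields all metadata
```

On a real cloud store, each key lookup is a network round trip, which is why collapsing many small metadata reads into one pays off.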

@@ -886,7 +886,7 @@ backend storage.
 
 Note that, the hierarchy could still be opened in the normal way and altered,
 causing the consolidated metadata to become out of sync with the real state of
-the dataset hierarchy. In this case,
+the array hierarchy. In this case,
 :func:`zarr.convenience.consolidate_metadata` would need to be called again.
 
 To protect against consolidated metadata accidentally getting out of sync, the
@@ -930,8 +930,8 @@ copying a group named 'foo' from an HDF5 file to a Zarr group::
     └── baz (100,) int64
     >>> source.close()
 
-If rather than copying a single group or dataset you would like to copy all
-groups and datasets, use :func:`zarr.convenience.copy_all`, e.g.::
+If rather than copying a single group or array you would like to copy all
+groups and arrays, use :func:`zarr.convenience.copy_all`, e.g.::
 
     >>> source = h5py.File('data/example.h5', mode='r')
     >>> dest = zarr.open_group('data/example2.zarr', mode='w')
@@ -1004,7 +1004,7 @@ String arrays
 There are several options for storing arrays of strings.
 
 If your strings are all ASCII strings, and you know the maximum length of the string in
-your dataset, then you can use an array with a fixed-length bytes dtype. E.g.::
+your array, then you can use an array with a fixed-length bytes dtype. E.g.::
 
     >>> z = zarr.zeros(10, dtype='S6')
     >>> z
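Zarr arrays use NumPy dtypes, so the behavior of a fixed-length bytes dtype can be seen with NumPy alone. A sketch of the `'S6'` example above, showing the silent truncation that makes knowing the maximum string length important:

```python
import numpy as np

# a fixed-length bytes dtype stores at most 6 bytes per element;
# longer values are silently truncated on assignment
z = np.zeros(10, dtype='S6')
z[0] = b'Hello'
z[1] = b'Hello, world!'   # longer than 6 bytes
print(z[0])               # b'Hello'
print(z[1])               # b'Hello,' (truncated to 6 bytes)
```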

zarr/hierarchy.py

Lines changed: 9 additions & 2 deletions
@@ -746,6 +746,9 @@ def require_groups(self, *names):
     def create_dataset(self, name, **kwargs):
         """Create an array.
 
+        Arrays are known as "datasets" in HDF5 terminology. For compatibility
+        with h5py, Zarr groups also implement the require_dataset() method.
+
         Parameters
         ----------
         name : string
@@ -819,8 +822,12 @@ def _create_dataset_nosync(self, name, data=None, **kwargs):
         return a
 
     def require_dataset(self, name, shape, dtype=None, exact=False, **kwargs):
-        """Obtain an array, creating if it doesn't exist. Other `kwargs` are
-        as per :func:`zarr.hierarchy.Group.create_dataset`.
+        """Obtain an array, creating if it doesn't exist.
+
+        Arrays are known as "datasets" in HDF5 terminology. For compatibility
+        with h5py, Zarr groups also implement the create_dataset() method.
+
+        Other `kwargs` are as per :func:`zarr.hierarchy.Group.create_dataset`.
 
         Parameters
         ----------
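The create/require distinction these docstrings describe can be sketched with a toy group that records `(shape, dtype)` per array: `create_dataset` fails if the name exists, while `require_dataset` returns the existing array after a compatibility check. The `TinyGroup` class is a hypothetical illustration, not zarr's `Group`.

```python
class TinyGroup:
    """Toy group storing (shape, dtype) records instead of real arrays."""
    def __init__(self):
        self._arrays = {}

    def create_dataset(self, name, shape, dtype='f8'):
        if name in self._arrays:
            raise ValueError('array already exists: %s' % name)
        self._arrays[name] = {'shape': shape, 'dtype': dtype}
        return self._arrays[name]

    def require_dataset(self, name, shape, dtype='f8'):
        # obtain if present (checking compatibility), create otherwise
        if name in self._arrays:
            a = self._arrays[name]
            if a['shape'] != shape:
                raise TypeError('shape mismatch for %s' % name)
            return a
        return self.create_dataset(name, shape, dtype)

g = TinyGroup()
a = g.require_dataset('foo', shape=(10,))   # created
b = g.require_dataset('foo', shape=(10,))   # obtained, same record
```

This obtain-or-create pattern is what makes `require_dataset` safe to call repeatedly in scripts, mirroring h5py's method of the same name.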

zarr/storage.py

Lines changed: 4 additions & 4 deletions
@@ -1779,7 +1779,7 @@ def __init__(self, path, buffers=True, **kwargs):
         import lmdb
 
         # set default memory map size to something larger than the lmdb default, which is
-        # very likely to be too small for any moderate dataset (logic copied from zict)
+        # very likely to be too small for any moderate array (logic copied from zict)
         map_size = (2**40 if sys.maxsize >= 2**32 else 2**28)
         kwargs.setdefault('map_size', map_size)

@@ -2587,14 +2587,14 @@ class ConsolidatedMetadataStore(MutableMapping):
     a single key.
 
     The purpose of this class, is to be able to get all of the metadata for
-    a given dataset in a single read operation from the underlying storage.
+    a given array in a single read operation from the underlying storage.
     See :func:`zarr.convenience.consolidate_metadata` for how to create this
     single metadata key.
 
     This class loads from the one key, and stores the data in a dict, so that
     accessing the keys no longer requires operations on the backend store.
 
-    This class is read-only, and attempts to change the dataset metadata will
+    This class is read-only, and attempts to change the array metadata will
     fail, but changing the data is possible. If the backend storage is changed
     directly, then the metadata stored here could become obsolete, and
     :func:`zarr.convenience.consolidate_metadata` should be called again and the class
@@ -2607,7 +2607,7 @@ class ConsolidatedMetadataStore(MutableMapping):
     Parameters
     ----------
     store: MutableMapping
-        Containing the zarr dataset.
+        Containing the zarr array.
     metadata_key: str
         The target in the store where all of the metadata are stored. We
         assume JSON encoding.
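The reader side of this design, a mapping that performs one read of the consolidated key and then serves all metadata from memory, can be sketched in a few lines. `TinyConsolidatedStore` is an illustrative stand-in for `ConsolidatedMetadataStore` (which raises its own error type on writes, not `PermissionError`).

```python
import json
from collections.abc import MutableMapping

class TinyConsolidatedStore(MutableMapping):
    """Read-only metadata view: one read of `metadata_key` loads every
    metadata object; later lookups never touch the backend store."""
    def __init__(self, store, metadata_key='.zmetadata'):
        doc = json.loads(store[metadata_key])   # the single backend read
        self.meta = doc['metadata']
    def __getitem__(self, key):
        return self.meta[key]
    def __setitem__(self, key, value):
        raise PermissionError('metadata is read-only')
    def __delitem__(self, key):
        raise PermissionError('metadata is read-only')
    def __iter__(self):
        return iter(self.meta)
    def __len__(self):
        return len(self.meta)

backend = {'.zmetadata': json.dumps(
    {'metadata': {'.zgroup': {'zarr_format': 2}}}).encode()}
cms = TinyConsolidatedStore(backend)   # only touches backend once
```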

zarr/util.py

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ def normalize_shape(shape):
 
 def guess_chunks(shape, typesize):
     """
-    Guess an appropriate chunk layout for a dataset, given its shape and
+    Guess an appropriate chunk layout for an array, given its shape and
     the size of each element in bytes. Will allocate chunks only as large
     as MAX_SIZE. Chunks are generally close to some power-of-2 fraction of
     each axis, slightly favoring bigger values for the last index.
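A much-simplified sketch of the chunk-guessing idea the docstring describes: repeatedly halve the largest axis until the chunk fits under a byte cap. This is not zarr's exact algorithm (which also favors the last axis and uses a different cap); `MAX_SIZE` here is an assumed value for illustration.

```python
from functools import reduce

MAX_SIZE = 64 * 1024 * 1024  # 64 MiB cap, an assumed value

def guess_chunks(shape, typesize):
    """Halve the biggest axis until the chunk's byte size fits MAX_SIZE."""
    chunks = list(shape)
    def nbytes():
        return reduce(lambda a, b: a * b, chunks, 1) * typesize
    while nbytes() > MAX_SIZE:
        i = chunks.index(max(chunks))   # shrink the biggest axis first
        chunks[i] = (chunks[i] + 1) // 2
        if all(c == 1 for c in chunks):
            break
    return tuple(chunks)

print(guess_chunks((10000, 10000), 8))   # prints (2500, 2500)
```

Small arrays are left unchunked: for a shape of `(10,)` with 8-byte elements, the whole array already fits under the cap and is returned as a single chunk.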
