@@ -778,9 +778,11 @@ chunk size, which will reduce the number of chunks and thus reduce the number of
round-trips required to retrieve data for an array (and thus reduce the impact of network
latency). Another option is to try to increase the compression ratio by changing
compression options or trying a different compressor (which will reduce the impact of
- limited network bandwidth). As of version 2.2, Zarr also provides the
- :class:`zarr.storage.LRUStoreCache` which can be used to implement a local in-memory cache
- layer over a remote store. E.g.::
+ limited network bandwidth).
+
+ As of version 2.2, Zarr also provides the :class:`zarr.storage.LRUStoreCache`
+ which can be used to implement a local in-memory cache layer over a remote
+ store. E.g.::

    >>> s3 = s3fs.S3FileSystem(anon=True, client_kwargs=dict(region_name='eu-west-2'))
    >>> store = s3fs.S3Map(root='zarr-demo/store', s3=s3, check=False)
@@ -797,10 +799,10 @@ layer over a remote store. E.g.::
    b'Hello from the cloud!'
    0.0009490990014455747
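
The caching idea described in this hunk can be illustrated without any cloud dependency. The following is a minimal sketch, not Zarr's implementation: `LRUCacheSketch`, `remote`, and the `hits`/`misses` counters are hypothetical names, a plain dict stands in for the remote store, and the cache is bounded by item count purely to keep the example short.

```python
from collections import OrderedDict


class LRUCacheSketch:
    """Read-through cache that keeps at most max_items values from a slower store."""

    def __init__(self, store, max_items):
        self._store = store            # stand-in for a remote mapping (assumption)
        self._max_items = max_items
        self._cache = OrderedDict()    # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def __getitem__(self, key):
        if key in self._cache:
            # Cache hit: mark the key as most recently used.
            self._cache.move_to_end(key)
            self.hits += 1
            return self._cache[key]
        # Cache miss: fall through to the slow store and remember the value.
        value = self._store[key]
        self.misses += 1
        self._cache[key] = value
        if len(self._cache) > self._max_items:
            # Evict the least recently used entry.
            self._cache.popitem(last=False)
        return value


remote = {"a": b"1", "b": b"2", "c": b"3"}
cached = LRUCacheSketch(remote, max_items=2)
for key in ["a", "a", "b", "c", "a"]:
    cached[key]
```

Only the second read of `"a"` is served from memory here; by the time `"a"` is read a third time it has been evicted, so it goes back to the underlying store.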

- If you are still experiencing poor performance with distributed/cloud storage, please
- raise an issue on the GitHub issue tracker with any profiling data you can provide, as
- there may be opportunities to optimise further either within Zarr or within the mapping
- interface to the storage.
+ If you are still experiencing poor performance with distributed/cloud storage,
+ please raise an issue on the GitHub issue tracker with any profiling data you
+ can provide, as there may be opportunities to optimise further either within
+ Zarr or within the mapping interface to the storage.

.. _tutorial_copy:

@@ -809,27 +811,38 @@ Consolidating metadata

(This is an experimental feature.)

- Since there is a significant overhead for every connection to s3, the pattern described in
- the previous section may incur significant latency while scanning the metadata of the data-set
- hierarchy, even though each individual file is small. For cases such as these, once the file
- is static and can be regarded as read-only, at least for the metadata/structure of the
- data-set, the many metadata files can be consolidated into a single one.
- Doing this can greatly increase the speed of reading the data-set hierarchy::
+ Since there is a significant overhead for every connection to a cloud object
+ store such as S3, the pattern described in the previous section may incur
+ significant latency while scanning the metadata of the dataset hierarchy, even
+ though each individual metadata object is small. For cases such as these, once
+ the data are static and can be regarded as read-only, at least for the
+ metadata/structure of the dataset hierarchy, the many metadata objects can be
+ consolidated into a single one via
+ :func:`zarr.convenience.consolidate_metadata`. Doing this can greatly increase
+ the speed of reading the dataset metadata, e.g.::
+
+     >>> zarr.consolidate_metadata(store)  # doctest: +SKIP
+
+ This creates a special key with a copy of all of the metadata from all of the
+ metadata objects in the store.

-     >>> zarr.consolidate_metadata(store)
+ Later, to open a Zarr store with consolidated metadata, use
+ :func:`zarr.convenience.open_consolidated`, e.g.::

- Creates a special key with a copy of all of the metadata from the many files.
- Later::
+     >>> root = zarr.open_consolidated(store)  # doctest: +SKIP

-     >>> root = zarr.open_consolidated(store)
+ This uses the special key to read all of the metadata in a single call to the
+ backend storage.

- Uses this special key to read all of the metadata in a single call to the backend storage.
+ Note that the hierarchy could still be opened in the normal way and altered,
+ causing the consolidated metadata to become out of sync with the real state of
+ the dataset hierarchy. In this case,
+ :func:`zarr.convenience.consolidate_metadata` would need to be called again.

- Note that, the data-set could still be opened in the normal way and altered, causing the
- consolidated metadata to become out of sync with the real state of the data-set. In this
- case, :func:`zarr.consolidate_metadata` would need to be called again. The data-set
- returned by :func:`zarr.open_consolidated` is read-only for the metadata, but the data
- values can still be updated.
+ To protect against consolidated metadata accidentally getting out of sync, the
+ root group returned by :func:`zarr.convenience.open_consolidated` is read-only
+ for the metadata, meaning that no new groups or arrays can be created, and
+ arrays cannot be resized. However, data values within arrays can still be updated.
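
The consolidation behaviour described in this hunk, including how the consolidated copy can go stale, can be sketched with a plain dict standing in for the store. This is an illustration under stated assumptions rather than Zarr's own code: `consolidate`, `read_all_metadata`, the key name `.zmetadata`, and the JSON layout are all chosen for the example.

```python
import json

# Hypothetical stand-in for a remote store: keys map to raw bytes.
store = {
    ".zgroup": json.dumps({"zarr_format": 2}).encode(),
    "foo/.zgroup": json.dumps({"zarr_format": 2}).encode(),
    "foo/bar/.zarray": json.dumps({"shape": [100], "chunks": [10]}).encode(),
    "foo/bar/0": b"\x00" * 10,  # a chunk, not a metadata object
}

METADATA_NAMES = (".zgroup", ".zarray", ".zattrs")
CONSOLIDATED_KEY = ".zmetadata"  # assumed name for the special key


def consolidate(store):
    """Copy every metadata object into one JSON document under a single key."""
    metadata = {
        key: json.loads(value)
        for key, value in store.items()
        if key.rsplit("/", 1)[-1] in METADATA_NAMES
    }
    store[CONSOLIDATED_KEY] = json.dumps({"metadata": metadata}).encode()


def read_all_metadata(store):
    """Recover all metadata with a single read of the consolidated key."""
    return json.loads(store[CONSOLIDATED_KEY])["metadata"]


consolidate(store)
meta = read_all_metadata(store)

# Altering the hierarchy afterwards leaves the consolidated copy out of date...
store["baz/.zarray"] = json.dumps({"shape": [5], "chunks": [5]}).encode()
stale = read_all_metadata(store)

# ...until consolidation is run again.
consolidate(store)
fresh = read_all_metadata(store)
```

After the first `consolidate` call, one read of the special key replaces three separate metadata reads. The `stale` and `fresh` results differ only in whether `baz/.zarray` is present, which is the out-of-sync situation the note above warns about.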

Copying/migrating data
----------------------