Commit f1ab32a

doc: tutorial updates for v3
1 parent 4d663cc commit f1ab32a

1 file changed: +37 -50 lines changed

docs/tutorial.rst

Lines changed: 37 additions & 50 deletions
@@ -18,13 +18,13 @@ Zarr has several functions for creating arrays. For example::
 >>> import zarr
 >>> z = zarr.zeros((10000, 10000), chunks=(1000, 1000), dtype='i4')
 >>> z
-<zarr.Array (10000, 10000) int32>
+<Array memory://4344739840 shape=(10000, 10000) dtype=int32>

 The code above creates a 2-dimensional array of 32-bit integers with 10000 rows
 and 10000 columns, divided into chunks where each chunk has 1000 rows and 1000
 columns (and so there will be 100 chunks in total).

-For a complete list of array creation routines see the :mod:`zarr.creation`
+For a complete list of array creation routines see the :mod:`zarr.api.synchronous`
 module documentation.

 .. _tutorial_array:
@@ -47,21 +47,21 @@ The contents of the array can be retrieved by slicing, which will load the
 requested region into memory as a NumPy array, e.g.::

 >>> z[0, 0]
-0
+array(0, dtype=int32)
 >>> z[-1, -1]
-42
+array(42, dtype=int32)
 >>> z[0, :]
 array([ 0, 1, 2, ..., 9997, 9998, 9999], dtype=int32)
 >>> z[:, 0]
 array([ 0, 1, 2, ..., 9997, 9998, 9999], dtype=int32)
 >>> z[:]
 array([[ 0, 1, 2, ..., 9997, 9998, 9999],
-[ 1, 42, 42, ..., 42, 42, 42],
-[ 2, 42, 42, ..., 42, 42, 42],
-...,
-[9997, 42, 42, ..., 42, 42, 42],
-[9998, 42, 42, ..., 42, 42, 42],
-[9999, 42, 42, ..., 42, 42, 42]], dtype=int32)
+[ 1, 42, 42, ..., 42, 42, 42],
+[ 2, 42, 42, ..., 42, 42, 42],
+...,
+[9997, 42, 42, ..., 42, 42, 42],
+[9998, 42, 42, ..., 42, 42, 42],
+[9999, 42, 42, ..., 42, 42, 42]], dtype=int32)

 .. _tutorial_persist:

@@ -77,7 +77,7 @@ persistence of data between sessions. For example::

 The array above will store its configuration metadata and all compressed chunk
 data in a directory called 'data/example.zarr' relative to the current working
-directory. The :func:`zarr.convenience.open` function provides a convenient way
+directory. The :func:`zarr.api.synchronous.open` function provides a convenient way
 to create a new persistent array or continue working with an existing
 array. Note that although the function is called "open", there is no need to
 close an array: data are automatically flushed to disk, and files are
@@ -98,11 +98,11 @@ Check that the data have been written and can be read again::

 If you are just looking for a fast and convenient way to save NumPy arrays to
 disk then load back into memory later, the functions
-:func:`zarr.convenience.save` and :func:`zarr.convenience.load` may be
+:func:`zarr.api.synchronous.save` and :func:`zarr.api.synchronous.load` may be
 useful. E.g.::

 >>> a = np.arange(10)
->>> zarr.save('data/example.zarr', a)
+>>> zarr.save('data/example.zarr', a, mode='w')
 >>> zarr.load('data/example.zarr')
 array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

@@ -155,7 +155,7 @@ argument accepted by all array creation functions. For example::
 >>> from numcodecs import Blosc
 >>> compressor = Blosc(cname='zstd', clevel=3, shuffle=Blosc.BITSHUFFLE)
 >>> data = np.arange(100000000, dtype='i4').reshape(10000, 10000)
->>> z = zarr.array(data, chunks=(1000, 1000), compressor=compressor)
+>>> z = zarr.array(data, chunks=(1000, 1000), compressor=compressor, zarr_format=2)
 >>> z.compressor
 Blosc(cname='zstd', clevel=3, shuffle=BITSHUFFLE, blocksize=0)

@@ -193,7 +193,7 @@ libraries available within Blosc can be obtained via::

 >>> from numcodecs import blosc
 >>> blosc.list_compressors()
-['blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', 'zstd']
+['blosclz', 'lz4', 'lz4hc', 'zlib', 'zstd']

 In addition to Blosc, other compression libraries can also be used. For example,
 here is an array using Zstandard compression, level 1::
@@ -290,7 +290,7 @@ To create a group, use the :func:`zarr.group` function::

 >>> root = zarr.group()
 >>> root
-<zarr.hierarchy.Group '/'>
+<Group memory://4640618752>

 Groups have a similar API to the Group class from `h5py
 <https://www.h5py.org/>`_. For example, groups can contain other groups::
@@ -300,32 +300,30 @@ Groups have a similar API to the Group class from `h5py

 Groups can also contain arrays, e.g.::

->>> z1 = bar.zeros('baz', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')
+>>> z1 = bar.zeros(name='baz', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')
 >>> z1
-<zarr.Array '/foo/bar/baz' (10000, 10000) int32>
+<Array memory://4640612800/foo/bar/baz shape=(10000, 10000) dtype=int32>

-Arrays are known as "datasets" in HDF5 terminology. For compatibility with h5py,
-Zarr groups also implement the ``create_dataset()`` and ``require_dataset()``
-methods, e.g.::
+Arrays can also be created with the ``create_array()`` and ``require_array()`` methods, e.g.::

->>> z = bar.create_dataset('quux', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')
+>>> z = bar.create_array(name='quux', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')
 >>> z
-<zarr.Array '/foo/bar/quux' (10000, 10000) int32>
+<Array memory://4640612800/foo/bar/quux shape=(10000, 10000) dtype=int32>

 Members of a group can be accessed via the suffix notation, e.g.::

 >>> root['foo']
-<zarr.hierarchy.Group '/foo'>
+<Group memory://4640612800/foo>

 The '/' character can be used to access multiple levels of the hierarchy in one
 call, e.g.::

 >>> root['foo/bar']
-<zarr.hierarchy.Group '/foo/bar'>
+<Group memory://4640612800/foo/bar>
 >>> root['foo/bar/baz']
-<zarr.Array '/foo/bar/baz' (10000, 10000) int32>
+<Array memory://4640612800/foo/bar/baz shape=(10000, 10000) dtype=int32>

-The :func:`zarr.hierarchy.Group.tree` method can be used to print a tree
+The :func:`zarr.core.group.Group.tree` method can be used to print a tree
 representation of the hierarchy, e.g.::

 >>> root.tree()
@@ -335,16 +333,16 @@ representation of the hierarchy, e.g.::
 ├── baz (10000, 10000) int32
 └── quux (10000, 10000) int32

-The :func:`zarr.convenience.open` function provides a convenient way to create or
+The :func:`zarr.api.synchronous.open` function provides a convenient way to create or
 re-open a group stored in a directory on the file-system, with sub-groups stored in
 sub-directories, e.g.::

 >>> root = zarr.open('data/group.zarr', mode='w')
 >>> root
-<zarr.hierarchy.Group '/'>
->>> z = root.zeros('foo/bar/baz', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')
+<Group file://data/group.zarr>
+>>> z = root.zeros(name='foo/bar/baz', shape=(10000, 10000), chunks=(1000, 1000), dtype='i4')
 >>> z
-<zarr.Array '/foo/bar/baz' (10000, 10000) int32>
+<Array file://data/group.zarr/foo/bar/baz shape=(10000, 10000) dtype=int32>

 Groups can be used as context managers (in a ``with`` statement).
 If the underlying store has a ``close`` method, it will be called on exit.
@@ -362,9 +360,9 @@ property. E.g.::

 >>> root = zarr.group()
 >>> foo = root.create_group('foo')
->>> bar = foo.zeros('bar', shape=1000000, chunks=100000, dtype='i8')
+>>> bar = foo.zeros(name='bar', shape=1000000, chunks=100000, dtype='i8')
 >>> bar[:] = 42
->>> baz = foo.zeros('baz', shape=(1000, 1000), chunks=(100, 100), dtype='f4')
+>>> baz = foo.zeros(name='baz', shape=(1000, 1000), chunks=(100, 100), dtype='f4')
 >>> baz[:] = 4.2
 >>> root.info
 Name : /
@@ -416,7 +414,7 @@ property. E.g.::
 Storage ratio : 167.1
 Chunks initialized : 100/100

-Groups also have the :func:`zarr.hierarchy.Group.tree` method, e.g.::
+Groups also have the :func:`zarr.core.group.Group.tree` method, e.g.::

 >>> root.tree()
 /
@@ -440,7 +438,7 @@ storing application-specific metadata. For example::

 >>> root = zarr.group()
 >>> root.attrs['foo'] = 'bar'
->>> z = root.zeros('zzz', shape=(10000, 10000))
+>>> z = root.zeros(name='zzz', shape=(10000, 10000))
 >>> z.attrs['baz'] = 42
 >>> z.attrs['qux'] = [1, 4, 7, 12]
 >>> sorted(root.attrs)
@@ -638,7 +636,7 @@ If the index contains at most one iterable, and otherwise contains only slices a
 orthogonal indexing is also available directly on the array:

 >>> z = zarr.array(np.arange(15).reshape(3, 5))
->>> all(z.oindex[[0, 2], :] == z[[0, 2], :])
+>>> np.all(z.oindex[[0, 2], :] == z[[0, 2], :])
 True

 Block Indexing
@@ -649,8 +647,6 @@ selections of whole chunks based on their logical indices along each dimension
 of an array. For example, this allows selecting a subset of chunk aligned rows and/or
 columns from a 2-dimensional array. E.g.::

->>> import zarr
->>> import numpy as np
 >>> z = zarr.array(np.arange(100).reshape(10, 10), chunks=(3, 3))

 Retrieve items by specifying their block coordinates::
@@ -686,8 +682,6 @@ For example::

 Data can also be modified. Let's start by a simple 2D array::

->>> import zarr
->>> import numpy as np
 >>> z = zarr.zeros((6, 6), dtype=int, chunks=2)

 Set data for a selection of items::
@@ -874,7 +868,6 @@ can be used with Zarr.
 Here is an example using S3Map to read an array created previously::

 >>> import s3fs
->>> import zarr
 >>> s3 = s3fs.S3FileSystem(anon=True, client_kwargs=dict(region_name='eu-west-2'))
 >>> store = s3fs.S3Map(root='zarr-demo/store', s3=s3, check=False)
 >>> root = zarr.group(store=store)
@@ -1071,8 +1064,6 @@ into a Zarr group, or vice-versa, the :func:`zarr.convenience.copy` and
 copying a group named 'foo' from an HDF5 file to a Zarr group::

 >>> import h5py
->>> import zarr
->>> import numpy as np
 >>> source = h5py.File('data/example.h5', mode='w')
 >>> foo = source.create_group('foo')
 >>> baz = foo.create_dataset('bar/baz', data=np.arange(100), chunks=(50,))
@@ -1125,8 +1116,6 @@ the :func:`zarr.convenience.copy_store` function can be used. This function
 copies data directly between the underlying stores, without any decompression or
 re-compression, and so should be faster. E.g.::

->>> import zarr
->>> import numpy as np
 >>> store1 = zarr.DirectoryStore('data/example.zarr')
 >>> root = zarr.group(store1, overwrite=True)
 >>> baz = root.create_dataset('foo/bar/baz', data=np.arange(100), chunks=(50,))
@@ -1176,7 +1165,7 @@ your array, then you can use an array with a fixed-length bytes dtype. E.g.::

 >>> z = zarr.zeros(10, dtype='S6')
 >>> z
-<zarr.Array (10,) |S6>
+<Array memory://4645496064 shape=(10,) dtype=object>
 >>> z[0] = b'Hello'
 >>> z[1] = b'world!'
 >>> z[:]
@@ -1447,8 +1436,6 @@ In this case, creating an array with ``write_empty_chunks=True`` (the default) w
 The following example illustrates the effect of the ``write_empty_chunks`` flag on
 the time required to write an array with different values.::

->>> import zarr
->>> import numpy as np
 >>> import time
 >>> from tempfile import TemporaryDirectory
 >>> def timed_write(write_empty_chunks):
@@ -1655,9 +1642,9 @@ Datetimes and timedeltas
 NumPy's ``datetime64`` ('M8') and ``timedelta64`` ('m8') dtypes are supported for Zarr
 arrays, as long as the units are specified. E.g.::

->>> z = zarr.array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='M8[D]')
+>>> z = zarr.array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='M8[D]', zarr_format=2)
 >>> z
-<zarr.Array (3,) datetime64[D]>
+<Array memory://4686989376 shape=(3,) dtype=datetime64[D]>
 >>> z[:]
 array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='datetime64[D]')
 >>> z[0]
