
Commit b4a05ba

Merge branch 'docs/dtype-docs' of github.com:d-v-b/zarr-python into docs/dtype-docs
2 parents 432d975 + 0c20603 commit b4a05ba

11 files changed

+135
-242
lines changed

docs/conf.py

Lines changed: 2 additions & 2 deletions
@@ -38,14 +38,14 @@
 extensions = [
     "sphinx.ext.autodoc",
     "sphinx.ext.autosummary",
-    "sphinx.ext.viewcode",
     "sphinx.ext.intersphinx",
     'autoapi.extension',
     "numpydoc",
     "sphinx_issues",
     "sphinx_copybutton",
     "sphinx_design",
     'sphinx_reredirects',
+    "sphinx.ext.viewcode",
 ]
 
 issues_github_path = "zarr-developers/zarr-python"
@@ -124,7 +124,7 @@ def skip_submodules(
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This patterns also effect to html_static_path and html_extra_path
-exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "talks", "api"]
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "talks"]
 
 # The reST default role (used for this markup: `text`) to use for all
 # documents.

docs/user-guide/data_types.rst

Lines changed: 39 additions & 21 deletions
@@ -1,4 +1,4 @@
-Array Data Types
+Array data types
 ================
 
 Zarr's Data Type Model
@@ -74,7 +74,7 @@ V2 JSON identifier for a data type is just the NumPy ``str`` attribute of that d
 However, Zarr version 3 data types do not store endianness information.
 
 There are two special cases to consider: `"structured" data types <#structured-data-type>`_, and
-`"object" <#object-data-type>` data types.
+`"object" <#object-data-type>`_ data types.
 
 Structured Data Type
 ^^^^^^^^^^^^^^^^^^^^
@@ -118,8 +118,7 @@ objects has a consistent type, then we can use a special encoding procedure to s
 is how Zarr Python stores variable-length UTF-8 strings, or variable-length byte strings.
 
 Although these are separate data types in this library, they are both "object" arrays in NumPy, which means
-they have the same Zarr V2 string representation: ``"|O"``. Clearly in this case the string
-representation of the data type is ambiguous in this case.
+they have the *same* Zarr V2 string representation: ``"|O"``.
 
 So for Zarr V2 we have to disambiguate different "object" data type arrays on the basis of their
 encoding procedure, i.e., the codecs declared in the ``filters`` and ``compressor`` attributes of array
@@ -195,14 +194,19 @@ API for the following operations:
 - Encoding and decoding a scalar value to and from Zarr V2 and Zarr V3 array metadata
 - Casting a Python object to a scalar value consistent with the data type
 
-The following section lists the data types built into Zarr Python.
+List of data types
+^^^^^^^^^^^^^^^^^^
 
-Boolean Types
-^^^^^^^^^^^^^
+The following section lists the data types built in to Zarr Python. With a few exceptions, Zarr
+Python supports nearly all of the data types in NumPy. If you need a data type that is not listed
+here, it's possible to create it yourself: see :ref:`adding-new-data-types`.
+
+Boolean
+"""""""
 - `Boolean <../api/zarr/dtype/index.html#zarr.dtype.Bool>`_
 
-Integral Types
-^^^^^^^^^^^^^^
+Integral
+""""""""
 - `Signed 8-bit integer <../api/zarr/dtype/index.html#zarr.dtype.Int8>`_
 - `Signed 16-bit integer <../api/zarr/dtype/index.html#zarr.dtype.Int16>`_
 - `Signed 32-bit integer <../api/zarr/dtype/index.html#zarr.dtype.Int32>`_
@@ -212,27 +216,38 @@ Integral Types
 - `Unsigned 32-bit integer <../api/zarr/dtype/index.html#zarr.dtype.UInt32>`_
 - `Unsigned 64-bit integer <../api/zarr/dtype/index.html#zarr.dtype.UInt64>`_
 
-Floating-Point Types
-^^^^^^^^^^^^^^^^^^^^
+Floating-point
+""""""""""""""
 - `16-bit floating-point <../api/zarr/dtype/index.html#zarr.dtype.Float16>`_
 - `32-bit floating-point <../api/zarr/dtype/index.html#zarr.dtype.Float32>`_
 - `64-bit floating-point <../api/zarr/dtype/index.html#zarr.dtype.Float64>`_
 - `64-bit complex floating-point <../api/zarr/dtype/index.html#zarr.dtype.Complex64>`_
 - `128-bit complex floating-point <../api/zarr/dtype/index.html#zarr.dtype.Complex128>`_
 
-String Types
-^^^^^^^^^^^^
+String
+""""""
 - `Fixed-length UTF-32 string <../api/zarr/dtype/index.html#zarr.dtype.FixedLengthUTF32>`_
 - `Variable-length UTF-8 string <../api/zarr/dtype/index.html#zarr.dtype.VariableLengthUTF8>`_
 
-Byte String Types
-^^^^^^^^^^^^^^^^^
+Bytes
+"""""
 - `Fixed-length null-terminated bytes <../api/zarr/dtype/index.html#zarr.dtype.NullTerminatedBytes>`_
 - `Fixed-length raw bytes <../api/zarr/dtype/index.html#zarr.dtype.RawBytes>`_
 - `Variable-length bytes <../api/zarr/dtype/index.html#zarr.dtype.VariableLengthBytes>`_
 
+Temporal
+""""""""
+- `DateTime64 <../api/zarr/dtype/index.html#zarr.dtype.DateTime64>`_
+- `TimeDelta64 <../api/zarr/dtype/index.html#zarr.dtype.TimeDelta64>`_
+
+Struct-like
+"""""""""""
+- `Structured <../api/zarr/dtype/index.html#zarr.dtype.Structured>`_
+
 Example Usage
-~~~~~~~~~~~~~
+^^^^^^^^^^^^^
+
+This section will demonstrates the basic usage of Zarr data types.
 
 Create a ``ZDType`` from a native data type:
 
@@ -267,7 +282,7 @@ Serialize to JSON for Zarr V2:
 .. note::
 
   The representation returned by ``to_json`` is more abstract than the literal contents of Zarr V2
-  array metadata, because the JSON representation used by the `ZDType` classes must be distinct across
+  array metadata, because the JSON representation used by the ``ZDType`` classes must be distinct across
   different data types. Zarr V2 identifies multiple distinct data types with the "object" data type
   identifier ``"|O"``, which means extra information is needed to disambiguate these data types from
   one another. That's the reason for the ``object_codec_id`` field you see here. See the
@@ -296,8 +311,10 @@ Deserialize a scalar value from JSON:
   >>> scalar_value = int8.from_json_scalar(42, zarr_format=3)
   >>> assert scalar_value == np.int8(42)
 
+.. _adding-new-data-types:
+
 Adding New Data Types
-~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^
 
 Each Zarr data type is a separate Python class that inherits from
 `ZDType <../api/zarr/dtype/index.html#zarr.dtype.ZDType>`_. You can define a custom data type by
@@ -311,7 +328,7 @@ Python project directory.
   :language: python
 
 Data Type Resolution
-~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^
 
 Although Zarr Python uses a different data type model from NumPy, you can still define a Zarr array
 with a NumPy data type object:
@@ -379,8 +396,9 @@ a static lookup table, Zarr Python relies on a dynamic approach to data type res
 Zarr Python defines a collection of Zarr data types. This collection, called a "data type registry,"
 is essentially a dictionary where the keys are strings (a canonical name for each data type), and the
 values are the data type classes themselves. Dynamic data type resolution entails iterating over
-these data type classes, invoking a special class constructor defined on each one, and returning a
-concrete data type instance if and only if exactly one of those constructor invocations is successful.
+these data type classes, invoking that class' `from_native_dtype <#api/dtype/ZDType.from_native_dtype>`_
+method, and returning a concrete data type instance if and only if exactly one of those constructor
+invocations is successful.
 
 In plain language, we take some user input, like a NumPy data type, offer it to all the
 known data type classes, and return an instance of the one data type class that can accept that user input.
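
Note on the hunk above: the registry-based resolution it describes can be illustrated with a minimal, standard-library-only sketch. This is not Zarr Python's implementation. The classes, the `DataTypeValidationError` exception, the `REGISTRY` dictionary, and the `resolve` function are all hypothetical, and native data types are represented as plain strings rather than NumPy dtype objects to keep the example self-contained. Only the method name `from_native_dtype` and the "exactly one match" rule come from the documentation text.

```python
from dataclasses import dataclass


class DataTypeValidationError(ValueError):
    """Hypothetical error: a class cannot represent the given native type."""


@dataclass(frozen=True)
class Int8:
    @classmethod
    def from_native_dtype(cls, native: str) -> "Int8":
        # Accept only the native type this class models.
        if native != "int8":
            raise DataTypeValidationError(native)
        return cls()


@dataclass(frozen=True)
class Bool:
    @classmethod
    def from_native_dtype(cls, native: str) -> "Bool":
        if native != "bool":
            raise DataTypeValidationError(native)
        return cls()


# Hypothetical registry: canonical name -> data type class.
REGISTRY = {"int8": Int8, "bool": Bool}


def resolve(native, registry=REGISTRY):
    """Offer `native` to every registered class; succeed on exactly one match."""
    matches = []
    for cls in registry.values():
        try:
            matches.append(cls.from_native_dtype(native))
        except DataTypeValidationError:
            continue  # this class rejected the input; try the next one
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one match for {native!r}, found {len(matches)}"
        )
    return matches[0]
```

In this sketch, `resolve("bool")` returns a `Bool` instance, while an input that no class accepts (or that more than one class accepts) raises `ValueError` instead of silently picking a candidate.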

pyproject.toml

Lines changed: 2 additions & 1 deletion
@@ -109,7 +109,8 @@ docs = [
     'numcodecs[msgpack]',
     'rich',
     's3fs>=2023.10.0',
-    'astroid<4'
+    'astroid<4',
+    'pytest'
 ]
 
 

src/zarr/core/dtype/npy/bool.py

Lines changed: 11 additions & 8 deletions
@@ -21,14 +21,17 @@
 @dataclass(frozen=True, kw_only=True, slots=True)
 class Bool(ZDType[np.dtypes.BoolDType, np.bool_], HasItemSize):
     """
-    A Zarr data type for arrays containing booleans. Wraps the NumPy
-    ``np.dtypes.BoolDType`` data type. Scalars for this data type are instances of ``np.bool_``.
+    A Zarr data type for arrays containing booleans.
+
+    Wraps the ``np.dtypes.BoolDType`` data type. Scalars for this data type are instances of
+    ``np.bool_``.
 
     Attributes
     ----------
+
     _zarr_v3_name : Literal["bool"] = "bool"
         The Zarr v3 name of the dtype.
-    _zarr_v2_name : Literal["|b1"] = "|b1"
+    _zarr_v2_name : ``Literal["|b1"]`` = ``"|b1"``
         The Zarr v2 name of the dtype, which is also a string representation
         of the boolean dtype used by NumPy.
     dtype_cls : ClassVar[type[np.dtypes.BoolDType]] = np.dtypes.BoolDType
@@ -97,7 +100,7 @@ def _check_json_v2(
 
         Returns
         -------
-        TypeGuard[DTypeConfig_V2[Literal["|b1"], None]]
+        ``TypeGuard[DTypeConfig_V2[Literal["|b1"], None]]``
             True if the input is a valid JSON representation, False otherwise.
         """
         return (
@@ -192,7 +195,7 @@ def to_json(
 
         Returns
         -------
-        DTypeConfig_V2[Literal["|b1"], None] or Literal["bool"]
+        ``DTypeConfig_V2[Literal["|b1"], None] | Literal["bool"]``
             The JSON representation of the Bool instance.
 
         Raises
@@ -233,7 +236,7 @@ def cast_scalar(self, data: object) -> np.bool_:
 
         Returns
         -------
-        np.bool_
+        ``np.bool_``
             The numpy boolean scalar.
 
         Raises
@@ -252,7 +255,7 @@ def default_scalar(self) -> np.bool_:
 
         Returns
         -------
-        np.bool_
+        ``np.bool_``
             The default value.
         """
         return np.False_
@@ -288,7 +291,7 @@ def from_json_scalar(self, data: JSON, *, zarr_format: ZarrFormat) -> np.bool_:
 
         Returns
         -------
-        np.bool_
+        ``np.bool_``
             The numpy boolean scalar.
 
         Raises
