|
| 1 | +=================================== |
| 2 | + Object data types (version 1.0) |
| 3 | +=================================== |
| 4 | +----------------------------- |
| 5 | + Editor's draft 2 March 2022 |
| 6 | +----------------------------- |
| 7 | + |
| 8 | +Specification URI: |
| 9 | + http://purl.org/zarr/spec/protocol/extensions/object-dtypes/1.0 |
| 10 | +Issue tracking: |
| 11 | + `GitHub issues <https://github.com/zarr-developers/zarr-specs/labels/object-dtypes-v1.0>`_ |
| 12 | +Suggest an edit for this spec: |
| 13 | + `GitHub editor <https://github.com/zarr-developers/zarr-specs/blob/core-protocol-v3.0-dev/docs/protocol/extension/object-dtypes/v1.0.rst>`_ |
| 14 | + |
| 15 | +Copyright 2022 `Zarr core development |
| 16 | +team <https://github.com/orgs/zarr-developers/teams/core-devs>`_ (@@TODO |
| 17 | +list institutions?). This work is licensed under a `Creative Commons |
| 18 | +Attribution 3.0 Unported |
| 19 | +License <https://creativecommons.org/licenses/by/3.0/>`_. |
| 20 | + |
| 21 | +---- |
| 22 | + |
| 23 | + |
| 24 | +Abstract |
| 25 | +======== |
| 26 | + |
| 27 | +This specification is a Zarr protocol extension defining a data type where each |
| 28 | +element is an arbitrary Python object. |
| 29 | + |
| 30 | + |
| 31 | +Status of this document |
| 32 | +======================= |
| 33 | + |
| 34 | +This document is a **Work in Progress**. It may be updated, replaced |
| 35 | +or obsoleted by other documents at any time. It is inappropriate to |
| 36 | +cite this document as other than work in progress. |
| 37 | + |
| 38 | +Comments, questions or contributions to this document are very |
| 39 | +welcome. Comments and questions should be raised via `GitHub issues |
| 40 | +<https://github.com/zarr-developers/zarr-specs/labels/object-dtypes-v1.0>`_. When |
| 41 | +raising an issue, please add the label "object-dtypes-v1.0". |
| 42 | + |
| 43 | +This document was produced by the `Zarr core development team |
| 44 | +<https://github.com/orgs/zarr-developers/teams/core-devs>`_. |
| 45 | + |
| 46 | + |
| 47 | +Document conventions |
| 48 | +==================== |
| 49 | + |
| 50 | +Conformance requirements are expressed with a combination of |
| 51 | +descriptive assertions and [RFC2119]_ terminology. The key words |
| 52 | +"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", |
| 53 | +"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative |
| 54 | +parts of this document are to be interpreted as described in |
| 55 | +[RFC2119]_. However, for readability, these words do not appear in all |
| 56 | +uppercase letters in this specification. |
| 57 | + |
| 58 | +All of the text of this specification is normative except sections |
| 59 | +explicitly marked as non-normative, examples, and notes. Examples in |
| 60 | +this specification are introduced with the words "for example". |
| 61 | + |
| 62 | + |
| 63 | +Object data types |
| 64 | +================= |
| 65 | +NumPy's object arrays are arrays where each element is an arbitrary Python |
| 66 | +object. The array elements correspond to the |
| 67 | +`numpy.object_ <https://numpy.org/doc/1.22/reference/arrays.scalars.html#numpy.object_>`_ |
| 68 | +type, which has the character code ``'O'``. A common concrete use case for this type |
| 69 | +is an array where each element is another array (and each array can |
| 70 | +have a different length). Another use case is storing an array of variable-length |
| 71 | +strings. Such an array stores only references to the Python objects, not the objects themselves. Accessing |
| 72 | +an element of the array returns the Python object it refers to. |
| 73 | + |
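For example, the two use cases above and the reference semantics can be sketched with NumPy as follows. This is an illustrative sketch only, not part of the normative text; the variable names (``ragged``, ``words``, ``holder``, ``payload``) are made up for this example.

```python
import numpy as np

# Allocate a one-dimensional object array with three empty slots.
ragged = np.empty(3, dtype=object)  # dtype=object is the same as dtype="O"

# Each element is itself a NumPy array, and the lengths differ.
ragged[0] = np.array([1, 2, 3])
ragged[1] = np.array([4])
ragged[2] = np.array([5, 6])
lengths = [len(item) for item in ragged]

# Variable-length strings are stored the same way, as Python str objects.
words = np.array(["a", "longer string", ""], dtype=object)

# The array holds references: reading an element back returns the very
# same Python object that was stored, not a copy of it.
payload = [1, 2]
holder = np.empty(1, dtype=object)
holder[0] = payload
same_object = holder[0] is payload
```

Because only references are stored, mutating ``payload`` after assignment would also be visible through ``holder[0]``.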
| 74 | +Data types added by this extension |
| 75 | +================================== |
| 76 | + |
| 77 | +.. list-table:: Data types |
| 78 | + :header-rows: 1 |
| 79 | + |
| 80 | + * - Identifier |
| 81 | + - Numerical type |
| 82 | + - Size (no. bytes) |
| 83 | + - Byte order |
| 84 | + * - ``O`` (uppercase letter o) |
| 85 | + - address of a Python object |
| 86 | + - 8 (TODO: I assume this is actually a hardware-dependent memory address size?) |
| 87 | + - None |
| 88 | + |
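For example, the size and byte-order entries in the table can be checked against a local NumPy installation. This is a hedged sketch that assumes a CPython build, where an object array element is a pointer: the item size is then expected to match the platform pointer size (8 bytes on typical 64-bit builds, 4 bytes on 32-bit builds) rather than being fixed by this specification.

```python
import struct

import numpy as np

object_dtype = np.dtype("O")

# Size in bytes of one array element, i.e. one stored reference.
item_size = object_dtype.itemsize

# Size in bytes of a C pointer on this platform.
pointer_size = struct.calcsize("P")

# NumPy reports "|" when byte order is not applicable to a data type.
byte_order = object_dtype.byteorder
```

On such a build, ``item_size`` and ``pointer_size`` agree, which supports reading the size column as a hardware-dependent address size.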
| 89 | + |
| 90 | +References |
| 91 | +========== |
| 92 | + |
| 93 | +.. [NumPy] NumPy Data type objects. NumPy version 1.22.0 |
| 94 | + documentation. URL: |
| 95 | + https://numpy.org/doc/1.22/reference/arrays.dtypes.html |
| 96 | + |
| 97 | +.. [H5Py variable length strings] Variable length strings |
| 98 | + documentation. URL: |
| 99 | + https://docs.h5py.org/en/stable/special.html#variable-length-strings |
| 100 | + |
| 101 | +Change log |
| 102 | +========== |
| 103 | + |
| 104 | +@@TODO |
| 105 | + |