Can not create Metadata for structured dtype containing subarray dtype #3583

@sehoffmann

Zarr version

v3.1.3

Numcodecs version

Python Version

3.12

Operating System

Linux

Installation

uv

Description

Output:

Case #1: no fill_value passed

 Traceback (most recent call last):
  File "bug.py", line 19, in <module>
    arr = zarr.create_array(store, name='test', shape=(10,), dtype=DTYPE)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "zarr/api/synchronous.py", line 962, in create_array
    sync(
  File "zarr/core/sync.py", line 159, in sync
    raise return_result
  File "zarr/core/sync.py", line 119, in _runner
    return await coro
           ^^^^^^^^^^
  File "zarr/core/array.py", line 4933, in create_array
    return await init_array(
           ^^^^^^^^^^^^^^^^^
  File "zarr/core/array.py", line 4747, in init_array
    meta = AsyncArray._create_metadata_v3(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "zarr/core/array.py", line 772, in _create_metadata_v3
    fill_value_parsed = dtype.default_scalar()
                        ^^^^^^^^^^^^^^^^^^^^^^
  File "zarr/core/dtype/npy/structured.py", line 419, in default_scalar
    return self._cast_scalar_unchecked(0)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "zarr/core/dtype/npy/structured.py", line 373, in _cast_scalar_unchecked
    res = np.array([data], dtype=na_dtype)[0]
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: a bytes-like object is required, not 'int'

Case #2: fill_value=(0, np.nan)

  File "zarr/core/dtype/npy/structured.py", line 371, in _cast_scalar_unchecked
    res = np.array([tuple(data)], dtype=na_dtype)[0]
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: a bytes-like object is required, not 'float'

Case #3: fill_value=0
TypeError: a bytes-like object is required, not 'int'

Case #4: fill_value=None
TypeError: a bytes-like object is required, not 'int'

Case #5: fill_value={}

  File "zarr/core/metadata/v3.py", line 236, in __init__
    fill_value_parsed = data_type.cast_scalar(fill_value)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "zarr/core/dtype/npy/structured.py", line 406, in cast_scalar
    raise TypeError(msg)
TypeError: Cannot convert object {} with type <class 'dict'> to a scalar compatible with the data type Structured(fields=(('a', Int32(endianness='little')), ('b', RawBytes(length=1600)))).
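The `RawBytes(length=1600)` entry in the message above suggests that the subarray field `b` is being handled as an opaque 1600-byte blob. The following NumPy-only check shows where that number would come from; it says nothing about zarr internals and only inspects the dtype from the reproduction script.

```python
import numpy as np

# Same dtype as in the reproduction script.
DTYPE = np.dtype([('a', 'i4'), ('b', 'f4', (20, 20))])

# dtype.fields maps each field name to (field dtype, byte offset[, title]).
b_dtype, b_offset = DTYPE.fields['b'][:2]

# For a subarray dtype, .subdtype is (base dtype, shape).
base, shape = b_dtype.subdtype

print(base, shape)        # element type and subarray shape of field 'b'
print(b_dtype.itemsize)   # bytes occupied by field 'b'
print(DTYPE.itemsize)     # bytes per record: field 'a' plus field 'b'
```

A `(20, 20)` block of 4-byte floats occupies 20 * 20 * 4 = 1600 bytes, which matches the length reported in the error message.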

Case #6: fill_value={'a': 1, 'b': np.nan}
This was possible in zarr v2.x.

  File "zarr/core/dtype/npy/structured.py", line 406, in cast_scalar
    raise TypeError(msg)
TypeError: Cannot convert object {'a': 1, 'b': nan} with type <class 'dict'> to a scalar compatible with the data type Structured(fields=(('a', Int32(endianness='little')), ('b', RawBytes(length=1600)))).
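For reference, a dict such as the one in Case #6 can be turned into a structured scalar with plain NumPy. The helper below, `scalar_from_dict`, is a hypothetical name used for illustration only; it is not part of zarr or NumPy, and it is a sketch of one possible mapping rather than a description of how zarr v2.x handled this.

```python
import numpy as np

DTYPE = np.dtype([('a', 'i4'), ('b', 'f4', (20, 20))])


def scalar_from_dict(dtype, values):
    """Hypothetical helper: build a structured scalar from a {field: value} dict.

    Fields missing from `values` stay zero. Scalar values are broadcast
    across subarray fields by NumPy's field assignment.
    """
    out = np.zeros((), dtype=dtype)  # 0-d structured array, all fields zeroed
    for name, value in values.items():
        if name not in dtype.names:
            raise KeyError(f"unknown field {name!r}")
        out[name] = value            # broadcasts np.nan over the (20, 20) block
    return out[()]                   # extract the np.void scalar


fill = scalar_from_dict(DTYPE, {'a': 1, 'b': np.nan})
print(fill['a'], fill['b'].shape)
```

Whether such a scalar would then be accepted as `fill_value` by `zarr.create_array` is not established here; the tracebacks above suggest the subarray field is currently the blocking point.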

Steps to reproduce

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# ]
# ///
#
# This script automatically imports the development branch of zarr to check for issues

import zarr
from zarr.storage import LocalStore
import numpy as np

DTYPE = np.dtype([('a', 'i4'), ('b', 'f4', (20,20))])

store = LocalStore('bug.zarr')
arr = zarr.create_array(store, name='test', shape=(10,), dtype=DTYPE)

Expected behavior: same as np.empty (preferred) or np.zeros when using a structured dtype
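To make the expected behavior concrete, this is what NumPy itself does with the same dtype. It is a NumPy-only sketch and does not call zarr; it simply illustrates the `np.empty` / `np.zeros` behavior referred to above.

```python
import numpy as np

DTYPE = np.dtype([('a', 'i4'), ('b', 'f4', (20, 20))])

# np.zeros accepts the structured dtype and zero-fills every field,
# including the (20, 20) subarray field.
zeros = np.zeros((10,), dtype=DTYPE)
print(zeros['a'].shape)   # one int32 per record
print(zeros['b'].shape)   # the subarray shape is appended to the array shape

# np.empty also accepts the dtype; its contents are uninitialized.
empty = np.empty((10,), dtype=DTYPE)
print(empty.shape, empty.dtype == DTYPE)

# A zero "default scalar" for the dtype can be taken from a 0-d array.
zero_scalar = np.zeros((), dtype=DTYPE)[()]
print(type(zero_scalar).__name__, zero_scalar['b'].shape)
```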

Additional output

No response

Metadata

Labels: bug (potential issues with the zarr-python library)
