writing to a sharded array is 10x slower #3560

@nbren12

Zarr version

3.1.3

Numcodecs version

n/a

Python Version

3.12

Operating System

Linux

Installation

uv

Description

The attached script shows that writing multiple chunks to an uncompressed sharded array is about 10x slower than writing to a non-sharded array. You can reproduce this benchmark with this command:

uv run benchmark_zarr_sharding.py --num-iterations 2 --shape 1000 100000 --chunks 1 100000 --shard-chunks 100 1000000

Profiling with cProfile shows that the runtime is dominated by repeated calls to Buffer.__add__ from _ShardWriter. Buffer.__add__ concatenates the arrays manually, which results in many repeated allocations and copies:

            np.concatenate((np.asanyarray(self._data), np.asanyarray(other_array)))

It would probably be much faster to coalesce these concatenations into a single operation. A similar pattern is used for GPU buffers, so it seems a new API is required to support this use case.
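
To illustrate the cost difference, here is a minimal NumPy-only sketch. It does not use zarr's Buffer class. The helper names (concat_incrementally, concat_once, bytes_copied_incrementally) are hypothetical and only mimic the pattern described above, assuming each addition copies the whole accumulated buffer.

```python
import numpy as np


def concat_incrementally(parts):
    """Mimic repeated `buf = buf + part`.

    Every step allocates a new array and copies everything accumulated
    so far, so the total work grows quadratically with the number of parts.
    """
    out = np.empty(0, dtype=np.uint8)
    for part in parts:
        out = np.concatenate((out, part))
    return out


def concat_once(parts):
    """Collect the pieces first, then allocate and copy exactly once."""
    if not parts:
        return np.empty(0, dtype=np.uint8)
    return np.concatenate(parts)


def bytes_copied_incrementally(sizes):
    """Count the bytes copied by the incremental strategy for given part sizes."""
    total = 0
    running = 0
    for size in sizes:
        running += size   # size of the buffer after this step
        total += running  # the whole new buffer is written at each step
    return total


# 100 hypothetical "chunks" of 1000 bytes each.
parts = [np.full(1000, i % 256, dtype=np.uint8) for i in range(100)]
incremental = concat_incrementally(parts)
single = concat_once(parts)
```

Both strategies produce the same bytes, but for 100 parts of 1000 bytes the incremental version copies 5,050,000 bytes in total versus 100,000 for the single concatenation. The gap widens as the number of chunks per shard grows.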

Metadata

    Labels

    bug: Potential issues with the zarr-python library
