Description
As discussed in zarr-developers/zarr-python#877, it would be great to have sharding support in zarr. The current prototype translates chunk access into partial reads and writes of shards, where a shard is one unit in a store. This requires translation logic between an array chunk and a storage unit, and that translation has to be efficient, e.g. by bundling multiple chunks into one access request. At the moment there are prototypes for this logic to be either
- part of the array class (Sharding Prototype II: implementation on Array, zarr-python#947), or
- added to the array class via a translation store, since a store already provides the exact API needed for the translation (Sharding Prototype I: implementation as translating Store, zarr-python#876; Sharding Prototype, alimanfoo/zarrita#40).
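To make the translation step concrete, here is a minimal sketch of mapping chunk coordinates to shards and bundling requests per shard. It is for illustration only and is not taken from either prototype: the names `chunk_to_shard`, `group_by_shard`, and `chunks_per_shard` are hypothetical, and it assumes shards form a regular grid that holds a fixed number of chunks along each dimension.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Coord = Tuple[int, ...]


def chunk_to_shard(chunk: Coord, chunks_per_shard: Coord) -> Tuple[Coord, Coord]:
    """Split a chunk coordinate into (shard coordinate, position inside that shard)."""
    # Assumption: every shard holds chunks_per_shard[d] chunks along dimension d.
    shard = tuple(c // n for c, n in zip(chunk, chunks_per_shard))
    local = tuple(c % n for c, n in zip(chunk, chunks_per_shard))
    return shard, local


def group_by_shard(
    chunks: Iterable[Coord], chunks_per_shard: Coord
) -> Dict[Coord, List[Coord]]:
    """Bundle requested chunks so that each shard needs only one access request."""
    grouped: Dict[Coord, List[Coord]] = defaultdict(list)
    for chunk in chunks:
        shard, local = chunk_to_shard(chunk, chunks_per_shard)
        grouped[shard].append(local)
    return dict(grouped)


# Four requested chunks, with 2 x 2 chunks per shard (hypothetical values).
requested = [(0, 0), (0, 1), (1, 3), (2, 2)]
plan = group_by_shard(requested, chunks_per_shard=(2, 2))
```

Here the four requested chunks fall into three shards, so a store that supports partial reads would need three access requests instead of four. How the position inside a shard maps to a byte range is left open, since that depends on the shard layout the spec would define.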
I think it would be great to extend the specification to allow sharding, by either
- adding a specific sharding spec and terminology, or
- adding a general "translation" interface, with the specific extension for sharding.
 
The second approach would hopefully lead to a standard way to include such extensions. They would still need broad compatibility and support across implementations, but having common terminology and specs for similar extensions might be worthwhile. Related issues seem to be #62, #115, #82, #49, #76.
I'm not very familiar with the process around the specs. Besides opinions about the approaches and terminology, I'd appreciate suggestions for practical next steps to update the specs for sharding support.