Commit e98cbc0

committed
add api implementation protocol
1 parent 720febb commit e98cbc0

3 files changed

+94
-5
lines changed

docs/protocol/core/v3.0.rst

Lines changed: 4 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -1200,7 +1200,7 @@ available data until the end of the referenced value. For example
12001200
support partial access can still answer the requests using cutouts
12011201
of full values. It is recommended that the implementation of the
12021202
``get_partial_values``, ``set_partial_values`` and
1203-
``remove_partial_values`` methods is made optional, providing fallbacks
1203+
``erase_values`` methods is made optional, providing fallbacks
12041204
for them by default. However, it is recommended to supply those operations
12051205
where possible for efficiency. Also, the ``get``, ``set`` and ``erase``
12061206
can easily be mapped onto their `partial_values` counterparts.
@@ -1230,7 +1230,8 @@ A **readable store** supports the following operation:
12301230

12311231
| Parameters: `key_ranges`: ordered set of `key`, `range` pairs,
12321232
| a `key` may occur multiple times with different `ranges`
1233-
| Output: list of `values`, in the order of the `key_ranges`
1233+
| Output: list of `values`, in the order of the `key_ranges`, may contain none
1234+
| for missing keys
12341235
12351236
A **writeable store** supports the following operations:
12361237

@@ -1253,7 +1254,7 @@ A **writeable store** supports the following operations:
12531254
| Parameters: `key`
12541255
| Output: none
12551256
1256-
``remove_partial_values`` - Erase the given key/value pair from the store.
1257+
``erase_values`` - Erase the given key/value pairs from the store.
12571258

12581259
| Parameters: `keys`: set of `keys`
12591260
| Output: none
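
The fallback behaviour described in this file can be sketched as follows. This is a minimal illustration, not part of the commit: the ``DictStore`` class, the ``(start, length)`` range tuple with a non-negative ``start``, and the use of ``None`` for "read to the end" are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# ASSUMPTION: a range is (start, length) with start >= 0;
# length None means "until the end of the value".
Range = Tuple[int, Optional[int]]


class DictStore:
    """Hypothetical in-memory store that natively supports only get/set/erase."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def erase(self, key: str) -> None:
        self._data.pop(key, None)

    def get_partial_values(
        self, key_ranges: Iterable[Tuple[str, Range]]
    ) -> List[Optional[bytes]]:
        """Fallback: answer each request with a cutout of the full value."""
        values: List[Optional[bytes]] = []
        for key, (start, length) in key_ranges:
            full = self.get(key)
            if full is None:
                # Missing keys yield None, keeping the order of key_ranges.
                values.append(None)
            else:
                end = None if length is None else start + length
                values.append(full[start:end])
        return values

    def erase_values(self, keys: Iterable[str]) -> None:
        """Fallback: erase each key individually."""
        for key in keys:
            self.erase(key)
```

A store with native range reads would override ``get_partial_values`` for efficiency. The fallback only guarantees the same results.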

docs/storage_transformers/sharding/v1.0.rst

Lines changed: 88 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -177,6 +177,94 @@ Any configuration parameters for the write strategy must not be part of the meta
177177
they need to be configured at runtime, as this is implementation specific.
178178

179179

180+
API implementation
181+
------------------
182+
183+
The section below defines an implementation of the
184+
:ref:`abstract-store-interface` in terms of the operations of this
185+
storage transformer as a ``StoreWithPartialAccess``.
186+
The term `underlying store` references either the next storage transformer
187+
in the stack or the actual store if this transformer is the last one in the
188+
stack. Any operations with keys not starting with ``data/root`` are simply
189+
relayed to the underlying store and not described explicitly.
190+
191+
* ``get_partial_values(key_ranges) -> values``:
192+
For each referenced key, request the indices from the underlying store using
193+
``get_partial_values``. For each `key`, `range` pair in `key_ranges`,
194+
check if the chunk exists by checking that the index offset and length
195+
are not both ``2^64 - 1``. For existing keys, request the actual chunks by
196+
their ranges as read from the index using ``get_partial_values``.
197+
This operation should be implemented using two ``get_partial_values``
198+
operations on the underlying store, one for retrieving the indices and
199+
one for retrieving existing chunks.
200+
201+
* ``set_partial_values(key_start_values)`` :
202+
For each referenced key, check if all available chunks in a shard are
203+
referenced. In this case a shard can be constructed according to the
204+
`Binary shard format`_ directly.
205+
For all other keys, request the indices from the underlying store using
206+
``get_partial_values``. All chunks that are not updated completely and
207+
exist according to the index (index offset and length are not both
208+
``2^64 - 1``) need to be read via ``get_partial_values`` from the
209+
underlying store. For simplification purposes a shard may also be read
210+
completely, combining the previous two `get` operations into one.
211+
Based on the existing chunks and value ranges that need to be updated
212+
new shards are constructed according to the `Binary shard format`_.
213+
All shards that need to be updated must now be set via ``set`` or
214+
``set_partial_values(key_start_values)``, depending on the chosen
215+
writing strategy provided by the implementation.
216+
Specialized store implementations that allow appending to a storage
217+
object may only need to read the index to update it.
218+
219+
* ``erase_values(keys)`` :
220+
For each referenced key, check if all available chunks in a shard are
221+
referenced. In this case the full shard is removed using ``erase_values``
222+
on the underlying store.
223+
For all other keys, request the indices from the underlying
224+
store using ``get_partial_values``. Update the index using an offset and
225+
length of ``2^64 - 1`` to mark missing chunks. The updated index may be
226+
written in-place using ``set_partial_values(key_start_values)``,
227+
or a larger rewrite of the shard may be done including the index update,
228+
but also removing value ranges corresponding to the erased chunks.
229+
230+
* ``erase_prefix()`` : If the prefix contains a part of the chunk-grid
231+
key, this part is translated to the referenced shard and contained chunks.
232+
For affected shards where all contained chunks are erased the prefix is
233+
rewritten to the corresponding shard key and the operation is relayed to
234+
the underlying store.
235+
For all shards where only some chunks are erased the affected chunks
236+
are removed by invoking the operation ``erase_values`` on this
237+
storage transformer with the respective chunk keys.
238+
239+
* ``list()``: See ``list_prefix`` with the prefix ``/``.
240+
241+
* ``list_prefix(prefix)`` : If the prefix contains a part of the chunk-grid
242+
key, this part is translated to the referenced shard and contained chunks.
243+
Then, ``list_prefix`` is called on the underlying store with the translated
244+
prefix. For all listed shards request the indices from the underlying store
245+
using ``get_partial_values``. Existing chunks, where the index offset or
246+
length is not ``2^64 - 1``, are then listed by their original key.
247+
248+
* ``list_dir(prefix)`` : If the prefix contains a part of the chunk-grid
249+
key, this part is translated to the referenced shard and contained chunks.
250+
Then, ``list_dir`` is called on the underlying store with the translated
251+
prefix. For all *retrieved prefixes* (not full keys) with partial shard keys,
252+
the corresponding original prefixes covering all possible chunks in the shard
253+
are listed. For *retrieved full keys* the indices from the underlying store
254+
are requested using ``get_partial_values``. Existing chunks, where the index
255+
offset or length is not ``2^64 - 1``, are then listed by their original key.
256+
257+
.. note::
258+
Not all listed prefixes must necessarily contain keys, as shard prefixes with
259+
partially available chunks return prefixes for all possible chunks without
260+
verifying their existence for performance reasons. Listing those prefixes
261+
is still safe as some chunks in their corresponding shard exist, but not
262+
necessarily in the requested prefix, possibly leading to empty responses.
263+
Please note, this only applies for returned prefixes, *not* for full keys
264+
referencing storage objects. Returned full keys always reflect the actually
265+
available chunks and are safe to request.
266+
267+
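
The index handling that the operations above rely on can be sketched as follows. This is an illustration under stated assumptions, not part of the commit: it assumes the index is a flat sequence of little-endian unsigned 64-bit ``(offset, length)`` pairs, one per chunk slot, and the helper names ``parse_index``, ``chunk_requests`` and ``mark_erased`` are hypothetical.

```python
import struct
from typing import List, Optional, Sequence, Tuple

MISSING = 2**64 - 1  # sentinel: offset and length both 2^64 - 1 mean "no chunk"

# (offset, length) for an existing chunk, or None for a missing one.
IndexEntry = Optional[Tuple[int, int]]


def parse_index(index_bytes: bytes) -> List[IndexEntry]:
    """Decode a shard index.

    ASSUMPTION: little-endian uint64 (offset, length) pairs, one per slot.
    """
    entries: List[IndexEntry] = []
    for offset, length in struct.iter_unpack("<QQ", index_bytes):
        if offset == MISSING and length == MISSING:
            entries.append(None)
        else:
            entries.append((offset, length))
    return entries


def chunk_requests(
    shard_key: str, entries: Sequence[IndexEntry], slots: Sequence[int]
) -> List[Tuple[str, Tuple[int, int]]]:
    """Build the key_ranges for the second get_partial_values call.

    Slots whose chunk is missing are skipped.
    """
    requests = []
    for slot in slots:
        entry = entries[slot]
        if entry is not None:
            requests.append((shard_key, entry))
    return requests


def mark_erased(index_bytes: bytes, slot: int) -> bytes:
    """Return a copy of the index with one slot marked as missing."""
    updated = bytearray(index_bytes)
    struct.pack_into("<QQ", updated, slot * 16, MISSING, MISSING)
    return bytes(updated)
```

With these helpers, ``get_partial_values`` on the transformer amounts to one read for the indices, a call to ``chunk_requests``, and one read for the resulting ranges. ``erase_values`` for a partially erased shard can write back the result of ``mark_erased``.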
180268
References
181269
==========
182270

docs/stores/filesystem/v1.0.rst

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -149,8 +149,8 @@ directory path is "C:\\data", then the file system path
149149
Store API implementation
150150
========================
151151

152-
The section below defines an implementation of the Zarr abstract store
153-
interface (@@TODO link) in terms of the native operations of this
152+
The section below defines an implementation of the Zarr
153+
:ref:`abstract-store-interface` in terms of the native operations of this
154154
storage system. Below ``fspath_to_key()`` is a function that
155155
translates file system paths to store keys, and ``key_to_fspath()`` is
156156
a function that translates store keys to file system paths, as defined
