|
1 | | -# The webKNOSSOS Wrapper Format |
2 | | -webKNOSSOS wrapper is a file format designed for large-scale, |
3 | | -three-dimensional voxel datasets. It was optimized for high-speed access |
4 | | -to data subvolumes, and supports multi-channel data and dataset |
5 | | -compression. |
| 1 | +# MATLAB-Zarr |
| 2 | +A Zarr implementation based on [zarrs](https://zarrs.dev) for MATLAB. |
6 | 3 |
|
7 | | -## Implementations |
8 | | -This repository contains reference implementations for the webKNOSSOS wrapper |
9 | | -format. Code is available for |
10 | | - |
11 | | -* C/C++ |
12 | | -* [Scala](https://github.com/scalableminds/webknossos-wrap/tree/master/scala) |
13 | | -* [MATLAB](https://github.com/scalableminds/webknossos-wrap/tree/master/matlab#webknossos-wrapper-for-matlab) |
14 | | -* [Rust](https://github.com/scalableminds/webknossos-wrap/tree/master/rust#webknossos-wrapper-core-library) |
15 | | -* [Python](https://github.com/scalableminds/webknossos-wrap/tree/master/python#webknossos-wrapper-for-python) |
16 | | - |
17 | | -The Python implementation is a binding around the C library and demonstrates how |
18 | | -wk-wrap files can be read and written from within other programming languages. |
19 | | - |
20 | | -## High-level description |
21 | | -Each file contains the data for a cube with side-length (CLEN) of FILE_CLEN |
22 | | -(MUST be a power of two; e.g., 1024) voxels. Within each file, the data is split |
23 | | -into smaller, non-overlapping cubes (called "blocks") with a side-length of |
24 | | -BLOCK_CLEN (MUST be a power of two; e.g., 32) voxels. |
25 | | - |
26 | | -To enable fast access to subvolumes of the voxel cube, blocks are stored in |
27 | | -Morton order. That is, |
28 | | -``` |
29 | | - block index 0 1 2 3 4 5 |
30 | | - block coordinates (0, 0, 0) (1, 0, 0) (0, 1, 0) (1, 1, 0) (0, 0, 1) (1, 0, 1) |
31 | | - 6 7 8 9 10 11 12 ... |
32 | | - (0, 1, 1) (1, 1, 1) (2, 0, 0) (3, 0, 0) (2, 1, 0) (3, 1, 0) (2, 0, 1) ... |
33 | | -``` |
34 | | - |
35 | | -For further information, see the Wikipedia entry on the [Z-order curve]( |
36 | | -https://en.wikipedia.org/wiki/Z-order_curve). |
37 | | - |
38 | | -## File format |
39 | | -Each wk-wrap file begins with a file header. Depending on the content of this |
40 | | -header, additional meta data MAY follow. The content of the file header and the |
41 | | -optional meta data MUST be sufficient to determine the offset and size (in |
42 | | -bytes) of each encoded block. |
43 | | - |
44 | | -### File header |
45 | | -Each wk-wrap file MUST begin with the following header: |
46 | | - |
47 | | -| | +0x00 | +0x01 | +0x02 | +0x03 | |
48 | | -|------|:-----------:|:-----------:|:-----------:|:-----------:| |
49 | | -| 0x00 | 'W' (0x57) | 'K' (0x4B) | 'W' (0x57) | version | |
50 | | -| 0x04 | perDimLog2 | blockType | voxelType | voxelSize | |
51 | | -| 0x08 | dataOffset | dataOffset | dataOffset | dataOffset | |
52 | | -| 0x0C | dataOffset | dataOffset | dataOffset | dataOffset | |
53 | | - |
54 | | -#### Header fields |
55 | | -* __version__ contains the wk-wrap format version as unsigned byte. At the time |
56 | | - of writing, the only valid version number is 0x01. |
57 | | -* __perDimLog2__ contains two 4-bit values (nibbles). The lower nibble |
58 | | - (`perDimLog2 & 0x0F`) contains __voxelsPerBlockDimLog2__, i.e., the |
59 | | - log2 of the number of voxels per block dimension. The higher nibble |
60 | | - (`(perDimLog2 & 0xF0) >> 4`) contains __blocksPerFileDimLog2__, i.e., |
61 | | - the log2 of the number of blocks per file dimension. Files and blocks |
62 | | - are three-dimensional. |
63 | | -* __blockType__ determines how the individual blocks were encoded. Valid values |
64 | | - are: 0x01 for RAW encoding, 0x02 for LZ4 compressed, and 0x03 for the high- |
65 | | - compression version of LZ4. |
66 | | -* __voxelType__ encodes the data type of the voxel values. Valid values are |
67 | | - |
68 | | - | value of voxelType | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | |
69 | | - |-----------------------|-------|--------|--------|--------|-------|--------| |
70 | | - | data type | uint8 | uint16 | uint32 | uint64 | float | double | |
71 | | - |
72 | | -* __voxelSize__ is an uint8 of the number of bytes per voxel. If the wk-wrap |
73 | | - file contains a single value per voxel, then voxelSize is equal to the byte |
74 | | - size of the data type. If the wk-wrap file, however, contains multiple |
75 | | - channels (e.g., three 8-bit values for RGB), voxelSize is a multiple of the |
76 | | - data type size (specified as byte count). In the example of three 8-bit values, |
77 | | - voxelSize would be 3. |
78 | | -* __dataOffset__ contains the absolute address of the first byte of the first |
79 | | - block (relative to the beginning of the file) as unsigned 64-bit integer. |
80 | | - |
81 | | -### Byte order |
82 | | -Except when noted otherwise, multi-byte voxel values are stored in little-endian |
83 | | -order. That is, bytes are stored in order of increasing significance. |
84 | | - |
85 | | -### Raw blocks |
86 | | -Within raw blocks, the voxel values are stored in Fortran order. That is, |
87 | | -``` |
88 | | - voxel index X + Y * BLOCK_CLEN + Z * BLOCK_CLEN * BLOCK_CLEN |
89 | | - voxel coordinates (X, Y, Z) |
90 | | -``` |
91 | | - |
92 | | -In wk-wrap version 0x01, the block with index 0 begins immediately after the |
93 | | -fixed header. The bytes of subsequent blocks are immediately following each |
94 | | -other (i.e., no padding). |
95 | | - |
96 | | -### LZ4 compressed blocks |
97 | | -If the file header indicates that the blocks were compressed using LZ4 (by |
98 | | -having a blockType value of either 0x02 or 0x03), the file header is immediately |
99 | | -followed by the jump table. |
100 | | - |
101 | | -The jump table is an array of N unsigned 64-bit integers, where N is the number |
102 | | -of blocks in the file. The n-th entry of the jump table contains the absolute |
103 | | -address (relative to the beginning of the file) of the first byte after the data |
104 | | -of block n. |
105 | | - |
106 | | -Note that |
107 | | -* the data of block n begins at address jumpTable[n - 1] |
108 | | -* the data of block n is jumpTable[n] - jumpTable[n - 1] bytes long |
109 | | - |
110 | | -The value of jumpTable[-1] is defined as dataOffset. For this reason, it is |
111 | | -convenient to build an extended jump table with the N + 1 unsigned 64-bit |
112 | | -integers starting at the position of the dataOffset field. |
113 | | - |
114 | | -Decompression is identical for the standard and the high-compression variants of |
115 | | -LZ4. For a wk-wrap reader, the difference between blockType 0x02 and 0x03 is |
116 | | -only semantic. |
| 4 | +## Credits |
117 | 5 |
|
118 | | -Decompression must produce valid raw blocks. |
| 6 | +Uses the Rust-based [zarrs](https://zarrs.dev) library for the Zarr IO. |
119 | 7 |
|
120 | | -## Credits |
121 | | -* [Max Planck Institute for Brain Research](https://brain.mpg.de/) |
122 | | - - Alessandro Motta |
123 | | - - Manuel Berning |
124 | | -* [scalable minds](https://scm.io/) |
125 | | - - Johannes Frohnhofen |
126 | | - - Tom Bocklisch |
127 | | - - Norman Rzepka |
| 8 | +Uses the Rust-MATLAB binding originally developed for the [WKW format](https://github.com/scalableminds/webknossos-wrap). Developed by Alessandro Motta at the [Max Planck Institute for Brain Research](https://brain.mpg.de/) |
128 | 9 |
|
129 | 10 | # License |
130 | 11 | MIT |