
High-level dataset API overview

R Schwanhold edited this page Jan 22, 2020 · 2 revisions

This wiki page gives a basic overview of the high-level dataset API.
For more detailed examples, see webknossos-cuber/tests/test_dataset.py.

There are two dataset types, WKDataset and TiffDataset, which support very similar interfaces.

The essential dataset operations are creating, opening, reading data, and writing data. The datasource-properties.json file is updated automatically, so it does not need to be edited by hand.
Here are some examples of working with the high-level dataset API:

Creating a WKDataset:

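from os import path

# Import paths assumed from the wkcuber package layout; adjust them if your version differs.
from wkcuber.api.Dataset import WKDataset, TiffDataset
from wkcuber.api.Layer import Layer

ds = WKDataset.create("path_to_dataset/wk_dataset", scale=(1, 1, 1))
ds.add_layer("color", "color")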

ds.get_layer("color").add_mag("1")
ds.get_layer("color").add_mag("2-2-1")

# The directories are created automatically
assert path.exists("path_to_dataset/wk_dataset/color/1")
assert path.exists("path_to_dataset/wk_dataset/color/2-2-1")

assert len(ds.properties.data_layers) == 1
assert len(ds.properties.data_layers["color"].wkw_magnifications) == 2

The same steps work for a TiffDataset:

ds = TiffDataset.create("path_to_dataset/tiff_dataset", scale=(1, 1, 1))
ds.add_layer("color", Layer.COLOR_TYPE)

ds.get_layer("color").add_mag("1")
ds.get_layer("color").add_mag("2-2-1")

# The directories are created automatically
assert path.exists("path_to_dataset/tiff_dataset/color/1")
assert path.exists("path_to_dataset/tiff_dataset/color/2-2-1")

assert len(ds.properties.data_layers) == 1
assert len(ds.properties.data_layers["color"].wkw_magnifications) == 2
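
The magnification names used above ("1" and "2-2-1") double as directory names. As a rough illustration of how such a name can be read as per-axis downsampling factors, here is a minimal sketch. The helpers parse_mag and mag_dir_name are hypothetical and not part of wkcuber, and the sketch assumes that a single number means the same factor on all three axes.

```python
from typing import Tuple


def parse_mag(name: str) -> Tuple[int, int, int]:
    """Turn a magnification name into per-axis (x, y, z) factors.

    Hypothetical helper for illustration; not part of wkcuber.
    """
    parts = [int(p) for p in name.split("-")]
    if len(parts) == 1:
        # Assumption: a single number means the same factor on every axis.
        parts = parts * 3
    if len(parts) != 3 or any(p < 1 for p in parts):
        raise ValueError(f"not a valid magnification name: {name!r}")
    return (parts[0], parts[1], parts[2])


def mag_dir_name(mag: Tuple[int, int, int]) -> str:
    """Inverse of parse_mag: isotropic factors collapse to a single number."""
    x, y, z = mag
    return str(x) if x == y == z else f"{x}-{y}-{z}"


print(parse_mag("1"))
print(parse_mag("2-2-1"))
print(mag_dir_name((2, 2, 1)))
```

Under this reading, "2-2-1" would describe data downsampled by a factor of 2 in x and y while z keeps its original resolution.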

Opening datasets:

wk_ds = WKDataset("path_to_dataset/wk_dataset")
...

tiff_ds = TiffDataset("path_to_dataset/tiff_dataset")
...

Reading and writing data (this also works for the TiffDataset):

import numpy as np

wk_ds = WKDataset("path_to_dataset/wk_dataset")
mag = wk_ds.add_layer("another_layer", Layer.COLOR_TYPE, num_channels=3).add_mag("1")

# Data is channel-first: (channels, x, y, z)
data = (np.random.rand(3, 250, 250, 10) * 255).astype(np.uint8)
mag.write(data)

# Read back the same box that was written
assert np.array_equal(data, mag.read(size=(250, 250, 10)))
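
The arrays in this example carry the channel axis first, followed by x, y, and z, while the size argument only names the three spatial extents. The following plain NumPy sketch shows how the two relate. It makes no wkcuber calls, and read_box is a hypothetical stand-in for a read that crops a box and keeps all channels.

```python
import numpy as np

# Channel-first layout assumed here: (channels, x, y, z).
num_channels = 3
volume = np.zeros((num_channels, 64, 64, 64), dtype=np.uint8)


def read_box(volume, offset, size):
    """Hypothetical stand-in for a read: crop a box, keep all channels."""
    x, y, z = offset
    w, h, d = size
    return volume[:, x:x + w, y:y + h, z:z + d]


box = read_box(volume, offset=(0, 0, 0), size=(64, 64, 10))
print(box.shape)  # (3, 64, 64, 10)
```

An array compared against the result of a read therefore needs the same spatial extents as the requested size, plus the leading channel axis.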

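The high-level dataset API also introduces the concept of a View. A View is a handle to a specific bounding box in the dataset, defined by an offset and a size, and it can be used to read and write data within that box. The advantage is that a View can be passed around, for example to a function that should only work on one region of the dataset.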

wk_view = WKDataset("path_to_dataset/wk_dataset").get_view(
    "another_layer",
    "1",
    size=(32, 32, 32),
    offset=(10, 10, 10),
)

data = (np.random.rand(3, 20, 20, 20) * 255).astype(np.uint8)
wk_view.write(data)
...
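
To make the idea of a handle to a bounding box more concrete, here is a minimal in-memory sketch. ToyView and fill_with are hypothetical names, not wkcuber's View class. The sketch assumes that writes are positioned relative to the view's offset and that any access outside the bounding box is rejected. The real implementation may behave differently.

```python
import numpy as np


class ToyView:
    """In-memory stand-in for a View; hypothetical, not wkcuber's class."""

    def __init__(self, backing, offset, size):
        self.backing = backing       # shape: (channels, x, y, z)
        self.offset = tuple(offset)  # absolute position of the box
        self.size = tuple(size)      # spatial extent of the box

    def _slices(self, rel_offset, shape):
        # Reject anything that would leave the view's bounding box.
        for rel, extent, limit in zip(rel_offset, shape, self.size):
            if rel < 0 or rel + extent > limit:
                raise ValueError("access outside of the view's bounding box")
        starts = [o + r for o, r in zip(self.offset, rel_offset)]
        return (slice(None),) + tuple(
            slice(s, s + e) for s, e in zip(starts, shape)
        )

    def write(self, data, rel_offset=(0, 0, 0)):
        # data is channel-first; only its spatial shape is bounds-checked.
        self.backing[self._slices(rel_offset, data.shape[1:])] = data

    def read(self, size=None, rel_offset=(0, 0, 0)):
        return self.backing[self._slices(rel_offset, size or self.size)]


def fill_with(view, value):
    """A function that only needs the view, not the whole dataset."""
    view.write(np.full((3,) + view.size, value, dtype=np.uint8))


backing = np.zeros((3, 64, 64, 64), dtype=np.uint8)
view = ToyView(backing, offset=(10, 10, 10), size=(32, 32, 32))
fill_with(view, 7)
```

Because fill_with only receives the view, it cannot touch voxels outside the 32×32×32 box starting at (10, 10, 10), which is the practical benefit of passing views instead of whole datasets.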
