Commit ac1843c

Binary writer, improved tilecutting, documentation update (#21)
* Added packages
* cleaned up example
* Commit test
* Added pre-commit hooks
* Formatting
* Updated example
* Update package
* Added multilevel metadata test
* Bump version 0.1.8 -> 0.1.9
* Added mkdocstrings pip
* Added trigger
* Added handler
* Added provenance
* Added provenance
* Added provenance
* Added provenance
* Added documentation and tests for provenance
* Corrected string
* Added TileJSON for MicroJSON
* Added examples for tiling and description
* Removed garbage files
* Restructured tile examples
* Updated TileJSON fields, added relative file URL
* Bump version 0.1.9 -> 0.1.10
* Initial protobuf prototype
* Initial roadmap
* Updated naming and documentation
* Harmonized with OME NGFF multiscale
* Reverted indexing in utils.py
* Added hierarchies
* Bump version 0.1.10 -> 0.1.11
* Updated packages
* Corrected roadmap markdown
* Added module examples
* Moved module examples
* Corrected model reference
* Revised roadmap
* Replaced GeoJSON objects to geojson-pydantic
* Altered documentation after replacing GeoJSON
* Initial protobuf encoder
* Corrected version
* Bump version 0.2.0 -> 0.3.0
* Corrected version
* updated gitignore
* cleanup
* Updated structure, writer
* Added doc link
* Bump version 0.3.0 -> 0.3.1
* Corrected bumpver
* Update dependencies, improve code formatting, and enhance documentation
* Reduce grid size and enhance coordinate simplification logic to ensure minimum vertex requirements are met
* Moved multiscale object to TileJSON
* Bump version 0.3.1 -> 0.3.2
* Corrected bump
* Removed reference to multiscale object
* Update field range for num_vertices in example
* Updated tiling example
* Add field extraction functionality to tiling example and tilewriter
* Removed debugging
* Remove multiscale metadata from test
* Modified gitignore
* Bump version 0.3.2 -> 0.3.3
* Corrected dependency
* Added tiled square option
* Added reader example
* Add doc for TileReader
* Bump version 0.3.3 -> 0.4.0
* Corrected bfio version
* Added project urls
* Bump version 0.4.0 -> 0.4.1
* Updated bfio version

---------

Co-authored-by: Nicholas-Schaub <[email protected]>
1 parent 901f765 commit ac1843c

70 files changed

+1857
-447837
lines changed

.bumpversion.cfg

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
11
[bumpversion]
2-
current_version = 0.3.0
2+
current_version = 0.4.1
33
commit = True
44
tag = False
55
parse = (?P<major>\\d+)\\.(?P<minor>\\d+)\\.(?P<patch>\\d+)(\\-(?P<release>[a-z]+)(?P<dev>\\d+))?
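To illustrate what the `parse` setting above matches, here is a minimal Python sketch. It assumes the doubled backslashes in the listing are escaping residue and that the intended pattern uses single backslashes. The helper name `parse_version` is hypothetical, and the use of `fullmatch` is a choice made for this sketch, not necessarily how the bumpversion tool applies the pattern.

```python
import re

# Pattern from the `parse` setting, written with single backslashes
# (assumption: the doubled backslashes in the listing are escaping residue).
VERSION_PATTERN = re.compile(
    r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(\-(?P<release>[a-z]+)(?P<dev>\d+))?"
)


def parse_version(text):
    """Split a version string into its named parts (hypothetical helper)."""
    match = VERSION_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised version string: {text!r}")
    # Optional groups that did not take part in the match are reported as None.
    return match.groupdict()


print(parse_version("0.4.1"))
print(parse_version("0.4.1-rc2"))
```

The optional trailing group means that a plain release such as `0.4.1` and a pre-release such as `0.4.1-rc2` both match. In the first case, `release` and `dev` are `None`.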

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -9,4 +9,9 @@ tiles/
99
*/__pycache__/
1010
*.pyc
1111
.vscode/
12+
dist/
13+
.mypy_cache/
14+
.hintrc
15+
*.zip
16+
testdata/
1217

docs/about.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,3 @@
1-
**About MicroJSON**
1+
# About MicroJSON
22

3-
MicroJSON is a lightweight, human-readable, and easy-to-parse data format for representing geospatial and multidimensional data. It is designed to be simple and efficient, making it suitable for a wide range of applications, from web mapping to scientific data analysis.
3+
MicroJSON is a lightweight, human-readable, and easy-to-parse data format for representing geospatial and multidimensional data. It is designed to be simple and efficient, making it suitable for a wide range of applications, from web mapping to scientific data analysis.

docs/example.md

Lines changed: 4 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
11
# MicroJSON Examples
2+
23
## Basic MicroJSON
4+
35
This JSON file demonstrates how MicroJSON can be used to define and describe different structures related to imaging, such as cells and their nuclei, including their spatial relationships, identifiers, labels, and color representations.
46

57
```json
@@ -30,40 +32,7 @@ This JSON file demonstrates how MicroJSON can be used to define and describe dif
3032
"ratioInfectivity": [0.1, 0.2, 0.3, 0.4, 0.5]
3133
}
3234
}
33-
],
34-
"multiscale": {
35-
"axes": [
36-
{
37-
"name": "x",
38-
"unit": "micrometer",
39-
"type": "space",
40-
"description": "x-axis"
41-
},
42-
{
43-
"name": "y",
44-
"unit": "micrometer",
45-
"type": "space",
46-
"description": "y-axis"
47-
}
48-
],
49-
"transformationMatrix": [
50-
[
51-
1.0,
52-
0.0,
53-
0.0
54-
],
55-
[
56-
0.0,
57-
1.0,
58-
0.0
59-
],
60-
[
61-
0.0,
62-
0.0,
63-
0.0
64-
]
65-
]
66-
}
35+
]
6736
}
6837

69-
```
38+
```

docs/index.md

Lines changed: 3 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -8,33 +8,24 @@ MicroJSON is a format, inspired by [GeoJSON](https://geojson.org), for encoding
88

99
### MicroJSON Object
1010

11-
A MicroJSON object is a JSON object that represents a geometry, feature, or collection of features, or more precisely, be either of type (having value of top level field `type` as) `"Geometry"`, `"Feature"`, or `"Featurecollection"`, that is, the same as for GeoJSON. What separates MicroJSON from GeoJSON is that it may have a member `"multiscale"` in a Feature or FeatureCollection object:
12-
13-
- `"multiscale"`: (Optional) A multiscale object as defined in the section [Multiscale object](#multiscale-object). If this property is not present, the default coordinate system is assumed to be the same as the image coordinate system, using cartesian coordinates and pixels as units. It is recommended to define this property at the top level of the MicroJSON object, but it may also be defined at the level of a Feature or Geometry object, in which case it overrides the top level coordinate system.
11+
A MicroJSON object is a JSON object that represents a geometry, feature, or collection of features, or more precisely, be either of type (having value of top level field `type` as) `"Geometry"`, `"Feature"`, or `"Featurecollection"`, that is, the same as for GeoJSON.
1412

1513
A MicroJSON object may have a `"bbox"` property":
1614

1715
- `"bbox"`: (Optional) Bounding Box of the feature represented as an array of length 4 (2D) or length 6 (3D).
1816

19-
2017
### Geometry Object
2118

2219
A geometry object is a JSON object where the `type` member's value is one of the following strings: `"Point"`, `"MultiPoint"`, `"LineString"`, `"MultiLineString"`, `"Polygon"`, `"Rectangle"`, `"MultiPolygon"`, or `"GeometryCollection"`.
2320

24-
Each geometry object MUST have a `"coordinates"` member with an array value. The structure of the coordinates array varies with the geometry type. The innermost point coordinates array MUST contain two or three (if 3D) numbers representing the X and Y (and Z) coordinates of the point in the image. These coordinates follow the same order as the axes in [Multiscale object](#multiscale-object). Please note that these coordinates differ from the GeoJSON specification, where the order is longitude, latitude, and optionally altitude. If no multiscale object is defined, the default coordinate system is assumed to be the same as the image coordinate system, using cartesian coordinates and pixels as units, with the origin at the top left corner of the image, and the x-axis pointing to the right and the y-axis pointing down. The z-axis points into the image, with the origin at the top left corner of the image.
21+
Each geometry object MUST have a `"coordinates"` member with an array value. The structure of the coordinates array varies with the geometry type. The innermost point coordinates array MUST contain two or three (if 3D) numbers representing the X and Y (and Z) coordinates of the point in the image. These coordinates follow the same order as the axes in [Multiscale object](#multiscale-object). Please note that these coordinates differ from the GeoJSON specification, where the order is longitude, latitude, and optionally altitude. If no multiscale object is defined, the default coordinate system is assumed to be the same as the image coordinate system, using cartesian coordinates and pixels as units, with the origin at the top left corner of the image, and the x-axis pointing to the right and the y-axis pointing down. The z-axis points into the image, with the origin at the top left corner of the image.
2522

2623
- **Point**: Must be a single set of point coordinates. A “Point” Geometry may have a radius, if representing a circular object, with the value in pixels, specified as a member `“radius”` of the Geometry object.
27-
2824
- **MultiPoint**: The coordinates array must be an array of point coordinates.
29-
3025
- **LineString**: The coordinates array must be an array of two or more point coordinates forming a continuous line. A “LineString” Geometry may have a radius, with the value in pixels, specified as a member “radius” of the Geometry object.
31-
3226
- **MultiLineString**: The coordinates array must be an array of LineString coordinate arrays.
33-
3427
- **Polygon**: The coordinates array must be an array of linear ring point coordinate arrays, where the first linear ring represents the outer boundary and any additional rings represent holes within the polygon.
35-
3628
- A subtype of “Polygon” is the “Rectangle” geometry: A polygon with an array of four 2D point coordinates representing the corners of the rectangle in a counterclockwise order. It has the property subtype with the value `“Rectangle”`.
37-
3829
- **MultiPolygon**: The coordinates array must be an array of Polygon coordinate arrays.
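The coordinate rules in the list above can be sketched in Python with plain dictionaries. This is only an illustration under stated assumptions: the two geometries are made-up sample data, the helper names `point_dimensions` and `has_valid_points` are hypothetical, and the MicroJSON package's own models are not used here.

```python
# Hypothetical sample geometries; coordinates are in (x, y) order, in pixels.
nucleus = {
    "type": "Point",
    "coordinates": [25.0, 30.0],
    "radius": 5.0,  # optional radius for a circular object
}

cell_outline = {
    "type": "Polygon",
    # One linear ring (the outer boundary); as in GeoJSON, the ring is
    # closed by repeating the first point at the end.
    "coordinates": [
        [[10.0, 10.0], [60.0, 10.0], [60.0, 40.0], [10.0, 40.0], [10.0, 10.0]]
    ],
}


def point_dimensions(geometry):
    """Collect the lengths of all innermost coordinate arrays of a geometry."""

    def walk(node):
        if node and all(isinstance(value, (int, float)) for value in node):
            # A list of numbers is a single point.
            yield len(node)
        else:
            for child in node:
                yield from walk(child)

    return set(walk(geometry["coordinates"]))


def has_valid_points(geometry):
    """True if every point is consistently 2D (x, y) or 3D (x, y, z)."""
    return point_dimensions(geometry) in ({2}, {3})


print(has_valid_points(nucleus), has_valid_points(cell_outline))
```

The check covers only the "two or three numbers per point" requirement. It does not verify ring closure or the number of points per geometry type.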
3930

4031
### GeometryCollection
@@ -53,50 +44,24 @@ A feature object represents a spatially bounded entity associated with propertie
5344
- `"parentId"`: (Optional) A reference to the parent feature, e.g. the id of the feature that this feature is a part of.
5445
- `"feeatureClass"`: (Optional) A string indicating the class of the feature, e.g. "cell", "nucleus", "mitochondria", etc.
5546

56-
5747
#### Special Feature Objects
5848

5949
- **Image**: An image MUST have the following key-value pairs in its “properties” object:
6050

6151
- `"type"`: A string with the value “Image”
62-
6352
- `"URI"`: A string with the image URI, e.g. “./image_1.tif"
64-
6553
An Image MUST also have a geometry object (as its “geometry” member) of type "Polygon", subtype “Rectangle”, indicating the shape of the image. An Image may have the following additional key-value pairs in its “properties” object:
6654
- `"correction"`: A list of coordinates indicating the relative correction of the image, e.g. `[1, 2]` indicating a correction of 1 units in the x direction and 2 units in the y direction, with units as defined by the coordinate system. If the coordinate system is not defined, the units are pixels.
6755

6856
### FeatureCollection Object
6957

7058
A FeatureCollection object is a JSON object representing a collection of feature objects. A FeatureCollection object has a member with the name `"features"`. The value of `"features"` is a JSON array. Each element of the array is a Feature object as defined above. It is possible for this array to be empty. Additionally, it may have the following members:
59+
7160
- `"properties"`: (Optional) A JSON object containing properties and metadata specific to the feature collection, and which apply to all features of the collection, or a JSON null value. It has the same structure as the `"properties"` member of a Feature object.
7261

7362
#### Special FeatureCollection Objects
7463

7564
- **StitchingVector**: Represents a stitching vector, and MUST have the following key-value pairs in its “properties” object:
76-
7765
- `"type"`: A string with the value “StitchingVector”
7866

7967
Any object of a StitchingVector “features” array MUST be an “Image” special type of features object.
80-
81-
### Multiscale Object
82-
83-
A multiscale object represents the choice of axes (2-5D) and potentially their transformations that should be applied to the numerical data in order to arrive to the actual size of the object described. It MUST have the following properties:
84-
85-
- `"axes"`: Representing the choice of axes as an array of Axis objects.
86-
87-
It may contain either of, but NOT both of the following properties:
88-
- `"coordinateTransformations"`: Representing the set of coordinate transformations that should be applied to the numerical data in order to arrive to the actual size of the object described. It MUST be an array of objects, each object representing a coordinate transformation. Each object MUST have properties as follows:
89-
- `"type"`: Representing the type of the coordinate transformation. Currently supported types are `"identity"`, `"scale"`, and `"translate"`. If the type is `"scale"`, the object MUST have the property `"scale"`, representing the scaling factor. It MUST be an array of numbers, with the number of elements equal to the number of axes in the coordinate system. If the type is `"translate"`, the object MUST have the property `"translate"`, representing the translation vector. It MUST be an array of numbers, with the number of elements equal to the number of axes in the coordinate system. If the type is `"identity"`, the object MUST NOT have any other properties.
90-
- `"transformationMatrix"`: Representing the transformation matrix from the coordinate system of the image to the coordinate system of the MicroJSON object. It MUST be an array of arrays of numbers, with the number of rows equal to the number of axes in the coordinate system, and the number of columns equal to the number of axes in the image coordinate system. The transformation matrix MUST be invertible.
91-
92-
93-
### Axis Object
94-
95-
Together with the other axes in the axes array, an axis object represents the coordinate system of the MicroJSON object (2D-5D)
96-
It MUST have the following properties:
97-
- `"name"`: Representing the name of the axis. It MUST be a string.
98-
It may contain the following properties:
99-
- `"unit"`: Representing the units of the corresponding axis of the geometries in the MicroJSON object. It MUST be an array with the elements having any of the following values: `[“angstrom", "attometer", "centimeter", "decimeter", "exameter", "femtometer", "foot", "gigameter", "hectometer", "inch", "kilometer", "megameter", "meter", "micrometer", "mile", "millimeter", "nanometer", "parsec", "petameter", "picometer", "terameter", "yard", "yoctometer", "yottameter", "zeptometer", "zettameter“]`
100-
- `"description"`: A string describing the axis.
101-
102-

docs/license.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,5 @@
1-
**MIT License**
1+
# MIT License
22

3-
```
43
Copyright (c) 2024 PolusAI
54

65
Permission is hereby granted, free of charge, to any person obtaining a copy
@@ -20,4 +19,3 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
2019
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
2120
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
2221
SOFTWARE.
23-
```

docs/metadata_example.md

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -2,13 +2,9 @@
22

33
This guide demonstrates how to designate metadata in MicroJSON using the `properties` field in the `Feature` class. The `properties` field is used to store metadata related to a feature. This guide provides examples of how to populate these fields in both JSON and Python.
44

5-
## Properties Class Overview
6-
7-
In MicroJSON, metadata related to a feature is stored in the `Properties` class. This class has
8-
95
Now, let's explore an example to understand how these fields can be populated in both JSON and Python.
106

11-
### JSON Example
7+
## JSON Example
128

139
```json
1410
{
@@ -40,7 +36,7 @@ Now, let's explore an example to understand how these fields can be populated in
4036
}
4137
```
4238

43-
### Python Example
39+
## Python Example
4440

4541
```python
4642
from microjson.model import MicroFeature, Properties

docs/ome.md

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
# Extending OME-NGFF with Tiled Data Models: Integrating TileJSON, MicroJSON, and Vector Tiling
2+
3+
## Introduction
4+
5+
[OME Next-Generation File Format](https://ngff.openmicroscopy.org/latest/) is a format for storing bioimaging data in the cloud, for which there is a growing need to integrate with established vector tiling formats widely used in geospatial applications. Vector tiling formats like [Mapbox Vector Tiles (MVT)](https://github.com/mapbox/vector-tile-spec) and [MicroJSON](https://polusai.github.io/microjson/) and tiling descriptors like [TileJSON](https://github.com/mapbox/tilejson-spec/tree/master/3.0.0), which has been [adapted to MicroJSON](https://polusai.github.io/microjson/tiling/) provide a standardized way to access and visualize large geospatial datasets. This document explores how these tiling models can be integrated with OME-NGFF.
6+
7+
**TileJSON** serves as an endpoint mapping that can bridge NGFF’s chunked multiscale data structures and other tiling models. Tiles may then be of different formats:
8+
9+
- **MicroJSON**: MicroJSON may be used for each tile, using its intrinsic coordinate system to annotate the features of the tile.
10+
- **JSON Vector tiles**: JSON representations of vector data, like Mapbox Vector Tiles (MVT) in JSON format.
11+
- **Binary tile formats** encoded either like Mapbox Vector Tiles (MVT, protobuf-encoded tiles), or GeoParquet (Apache Parquet-based vector tiles).
12+
13+
## Background: OME-NGFF
14+
15+
**OME-NGFF** provides a standardized way to store large image datasets (in for example Zarr format) with metadata describing scale transformations, coordinate systems, and multiple resolution levels. Each resolution level consists of a set of chunked arrays, allowing efficient partial retrieval and processing of large images by tiled raster data.
16+
17+
Key features of NGFF:
18+
19+
- **Multidimensional data** (2-5D): time (t), channel (c), z-depth (z), and spatial dimensions (y, x).
20+
- **Multiscale pyramids**, each level providing a different resolution.
21+
- **Flexible coordinate transformations** that can place data in a variety of coordinate reference systems (CRS).
22+
23+
## TileJSON as an endpoint mapping Layer
24+
25+
**TileJSON** is a well-established specification within the geospatial community that describes tiled data sources via a simple JSON schema. It is commonly used in web mapping to:
26+
27+
- Reference a set of tiled resources (raster or vector) defined by zoom levels and tile coordinates (z/x/y). The standard order differs from NGFF’s (z/y/x).
28+
- Provide metadata such as bounding boxes, attribution, min/max zoom levels, and tile endpoints.
29+
30+
While TileJSON has traditionally been used in 2D applications, there is nothing that hinders it from being used with higher dimensions, including 5D as with NGFF. The MicroJSON implementation of TileJSON outlines such usage. It thus can be used to map endpoints for a coordinate system that is not strictly geospatial, given the following:
31+
32+
1. NGFF’s multiscale pyramids could be mapped directly to TileJSON’s `zoom` levels. However, TileJSON assumes a zoom factor of 2, while NGFF may have arbitrary scale factors. Supporting arbitrary factors may require adapting the TileJSON schema, for example by defining a new `scale_factor` field. Consequently, the `Multiscale` class in MicroJSON should be moved to the TileJSON schema.
33+
2. Associate NGFF array chunks with tile indices (z/x/y) derived from spatial transformations, given the NGFF multiscale metadata and the corresponding TileJSON metadata.
34+
3. A practical observation is that if the multiscale pyramids differ, additional transformations between the two systems are needed beyond those described above. This can be avoided by using the same multiscale pyramid structure in both systems. The same applies to the tile size (expressed in global coordinates), which should be identical in both systems for a given zoom level.
35+
4. Layering of data is supported in TileJSON, as the array `vector_layers` in its schema. Layers are thus stored together for each tile, as a contrast to the NGFF, where the layers are stored in separate arrays as labeled images.
36+
5. The NGFF raster data hierarchy could be expressed with a TileJSON that has no vector layer but instead just maps the raster data endpoints, formatted as in the `tiles` field of the TileJSON schema.
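The mapping from a position in global coordinates to a tile index, with a scale factor that is not fixed at 2, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `tile_index`, the `origin` and `base_tile_size` parameters, and the convention that zoom level 0 consists of tiles of `base_tile_size` are all hypothetical. Only the idea of a `scale_factor` value comes from the list above.

```python
import math


def tile_index(x, y, zoom, origin, base_tile_size, scale_factor=2.0):
    """Return (z, x, y) tile indices for a point in global coordinates.

    Assumptions for this sketch: tiles at zoom 0 are `base_tile_size`
    wide and high, and each further zoom level shrinks the tile size
    (in global coordinates) by `scale_factor`.
    """
    origin_x, origin_y = origin
    tile_size = base_tile_size / scale_factor**zoom
    col = math.floor((x - origin_x) / tile_size)
    row = math.floor((y - origin_y) / tile_size)
    # TileJSON orders tile coordinates as z/x/y, unlike NGFF's z/y/x.
    return zoom, col, row


# Two zoom steps with a factor of 2 give the same tile size (256) as
# one zoom step with a factor of 4, so the point lands in the same tile.
print(tile_index(300, 700, zoom=2, origin=(0, 0), base_tile_size=1024))
print(tile_index(300, 700, zoom=1, origin=(0, 0), base_tile_size=1024, scale_factor=4.0))
```

If both systems share the same pyramid structure and tile size per level, as point 3 recommends, this index computation is the only transformation needed between them.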
37+
38+
## Incorporating MicroJSON and Vector Tiles
39+
40+
**MicroJSON** is a valid GeoJSON but with a few extra additions, including for example feature classes and parent-relations between features. It aligns well with TileJSON as individual tiles could be specified in MicroJSON, but with the same agnostic coordinate system as for the binary vector tiles, and the intermediate vector tile JSON.
41+
42+
**Vector Tiles** is a binary format for encoding vector data in tiles, either using protobuf, or directly as vector tile JSON, or some other format, like GeoParquet. Protobuf is widely used in geospatial applications to express them in compact form. The MVT format can be used to represent vector data in a tile-based system, with each tile containing a subset of the vector data. The MVT format is well-suited for representing vector data in a tile-based system, as it allows for efficient storage and retrieval of vector data.
43+
44+
## NGFF Labels and Vector Tiles
45+
46+
NGFF labels can be represented as vector tiles, where each tile contains a subset of the labels. This allows for efficient storage and retrieval of labels, as well as the ability to overlay labels on top of raster data. If desired, the labels can be stored both as raster data and as vector tiles, allowing for flexibility in how the labels are displayed for different tools and applications. It is strongly suggested to use the same identifier for the labels in the NGFF and the vector tiles, to allow for easy integration between the two systems.
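One way to picture this is to bucket label features by the tiles their bounding boxes overlap, carrying the NGFF label value through as the feature identifier. The sketch below is an assumption-laden illustration: the function name `assign_to_tiles`, the sample features, and the `[min_x, min_y, max_x, max_y]` bounding box order are hypothetical, and the sketch does no geometry clipping.

```python
from collections import defaultdict


def assign_to_tiles(features, tile_size):
    """Group feature ids by the (column, row) tiles their bbox overlaps.

    Assumes each feature carries an `id` equal to its NGFF label value and
    a 2D `bbox` ordered as [min_x, min_y, max_x, max_y].
    """
    tiles = defaultdict(list)
    for feature in features:
        min_x, min_y, max_x, max_y = feature["bbox"]
        first_col, last_col = int(min_x // tile_size), int(max_x // tile_size)
        first_row, last_row = int(min_y // tile_size), int(max_y // tile_size)
        for col in range(first_col, last_col + 1):
            for row in range(first_row, last_row + 1):
                tiles[(col, row)].append(feature["id"])
    return dict(tiles)


# Hypothetical labels: label 2 straddles the boundary between two tiles.
labels = [
    {"id": 1, "bbox": [10, 10, 50, 50]},
    {"id": 2, "bbox": [200, 30, 300, 90]},
]
print(assign_to_tiles(labels, tile_size=256))
```

A label that spans several tiles appears in each of them under the same identifier, which is what lets a viewer match the vector feature to the corresponding raster label.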
47+
48+
## Conclusion
49+
50+
TileJSON may be used as a bridge between OME-NGFF and individual vector tiles, expressed in different formats. For practical use, it is optimal if zoom levels and tile sizes are consistent between the two systems.
51+
The integration of these tiling models, such as MicroJSON and vector tiles, can provide a standardized way to access and visualize large geospatial datasets, while also allowing for the efficient storage and retrieval of vector data. By aligning the coordinate reference systems and handling higher-dimensional data through slicing or conventions, it is possible to create a seamless integration between OME-NGFF and other tiling models. Vector tiles may replace the NGFF labels, although they can be stored in parallel for flexibility.
52+
53+
The current version of MicroJSON must be adapted to support the NGFF data model, and the TileJSON schema should be extended to support arbitrary scale factors.
