Description
Context / mental model
In the development of `ethology` we roughly consider three types of data: annotations, detections and tracks. All three can consist of bounding boxes, keypoints or segmentation masks. Currently `ethology` focuses on bounding boxes, so the following paragraphs focus mostly on them. However, the definitions apply equally to keypoints and segmentation masks.
- **Annotations** are labels drawn manually by a human. They are usually considered ground truth. They refer to images, and are often defined on non-consecutive frames of a video (but not necessarily; we may have annotations for consecutive frames too). Annotations may or may not have an "identity" associated with them. If they do, annotations with the same identity across images refer to the same individual. In `ethology` we consider annotations to have no identities associated with them (at least for now).
- **Detections** are predictions generated by a trained computer vision model on images. As such, they have a confidence value associated with them (unlike annotations). They can refer to consecutive or non-consecutive frames of a video. In `ethology` we consider detections to have no identities associated with them (at least for now).
- **Tracks** are detections that refer to consecutive frames in a video, with identities associated with them. This is the type of data that `movement` deals with (see here).
Current status
Currently `ethology` supports loading and saving bounding box annotations datasets (see this example for a realistic use case). An `ethology` bounding box annotations dataset is defined as follows:
- it has `image_id`, `space`, `id` as dimensions;
- it has `position` and `shape` as data variables / arrays.
This definition is captured in the `ethology.io.annotations.validate.ValidBboxesDataset` class. An annotations dataset may have more dimensions or arrays, but these are the minimum requirements to consider it a "bounding box annotations dataset". The dimensions in the annotations dataset roughly correspond to the time, space and individual dimensions in a `movement` dataset.
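For concreteness, here is a minimal sketch of a dataset matching this schema, assuming `ethology` datasets are `xarray`-based (as `movement` datasets are); the sizes and values are hypothetical:

```python
import numpy as np
import xarray as xr

n_images, n_ids = 5, 3  # hypothetical: 5 annotated images, up to 3 boxes each

bboxes_ds = xr.Dataset(
    data_vars={
        # centre of each bounding box, in pixels (assumed convention)
        "position": (("image_id", "space", "id"), np.zeros((n_images, 2, n_ids))),
        # width and height of each bounding box, in pixels
        "shape": (("image_id", "space", "id"), np.ones((n_images, 2, n_ids))),
    },
    coords={
        "image_id": np.arange(n_images),
        "space": ["x", "y"],
        "id": np.arange(n_ids),
    },
)
```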
As mentioned, `ethology` currently considers that annotations have no identities associated with them. The `id` dimension in the annotations dataset stores an ID for each annotation in an image, but this is not consistent across frames. The `id` dimension ranges from 0 to the maximum number of annotations per image in the full dataset.
We would like to support loading and saving keypoint annotations in `ethology`.
Describe the solution you'd like
One way to support keypoint annotations in `ethology` could be as follows:
- **Define a `ValidKeypointsDataset` class**

  We define an `ethology` keypoint annotations dataset to represent the data. Following the bounding boxes case, it could be defined as having:

  - `image_id`, `space`, `keypoint` and `id` as required dimensions;
  - `position` as the required data variable / array.

  This definition could be captured in a `ValidKeypointsDataset` class in `ethology/io/annotations/validate.py`, which would closely follow `ValidBboxesDataset`. A sketch of this schema is included after this list.
- **Define loaders and exporters**

  We would maybe then add two modules under `ethology.io.annotations` called `load_keypoints.py` and `save_keypoints.py` (maybe `kpts`?).

  In `load_keypoints.py` we would have functionality to read keypoint annotations files (say `.slp`) as an `ethology` keypoint annotations dataset. It is probably a good idea to use `sleap-io` under the hood, since it supports a large variety of files and is a well-maintained and nicely written repo. We can use its `load_file` function to read a variety of keypoint files as `Labels` objects, and then implement the transform from this `Labels` object into an `ethology` annotations dataset (see the loader sketch after this list).

  Similarly, in `save_keypoints.py` we can use `sleap-io`'s `Labels` object as an intermediate representation, and then use its `save_file` function to export to a variety of keypoint file formats.
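As referenced in the first step, here is a minimal sketch of the proposed keypoint annotations schema. It assumes an `xarray`-based dataset (as in the bounding boxes case); the sizes and keypoint names are hypothetical:

```python
import numpy as np
import xarray as xr

n_images, n_keypoints, n_ids = 5, 4, 3  # hypothetical sizes

keypoints_ds = xr.Dataset(
    data_vars={
        "position": (
            ("image_id", "space", "keypoint", "id"),
            np.zeros((n_images, 2, n_keypoints, n_ids)),
        ),
    },
    coords={
        "image_id": np.arange(n_images),
        "space": ["x", "y"],
        "keypoint": ["nose", "left_ear", "right_ear", "tail_base"],
        "id": np.arange(n_ids),
    },
)
```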
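And here is a rough sketch of what the loading function in `load_keypoints.py` could look like. The `sleap-io` calls (`load_file`, `Labels.labeled_frames`, `Instance.numpy()`) exist in that library, but the function name and the NaN-padding over the `id` dimension are hypothetical design choices, not existing `ethology` API:

```python
import numpy as np
import sleap_io as sio
import xarray as xr


def load_keypoints(file_path: str) -> xr.Dataset:
    """Read a keypoint annotations file as an ethology-style dataset."""
    labels = sio.load_file(file_path)  # sleap-io infers the file format

    node_names = labels.skeleton.node_names  # assumes a single skeleton
    n_images = len(labels.labeled_frames)
    max_ids = max(len(lf.instances) for lf in labels.labeled_frames)

    # (image_id, space, keypoint, id); NaN where no annotation exists
    position = np.full((n_images, 2, len(node_names), max_ids), np.nan)
    for i, lf in enumerate(labels.labeled_frames):
        for j, instance in enumerate(lf.instances):
            # Instance.numpy() returns an (n_nodes, 2) array of x, y coords
            position[i, :, :, j] = instance.numpy().T

    return xr.Dataset(
        data_vars={
            "position": (("image_id", "space", "keypoint", "id"), position),
        },
        coords={
            "image_id": [lf.frame_idx for lf in labels.labeled_frames],
            "space": ["x", "y"],
            "keypoint": node_names,
            "id": np.arange(max_ids),
        },
    )
```

For export, `save_keypoints.py` would do the inverse transform into a `Labels` object and then call `sio.save_file(labels, path)`.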
Describe alternatives you've considered
Suggestions are more than welcome.
Additional context
It would be nice to also include a usage example for the gallery. Maybe a workflow of loading a keypoint annotations file and running some sanity checks to detect erroneous labels (e.g., expressing all keypoints in an egocentric coordinate system to quickly identify outliers; see here for an example).
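For instance, the outlier check could look something like this hypothetical sketch, which centres each instance's keypoints on its centroid and flags points that are unusually far from it (the dataset and threshold are made up for illustration):

```python
import numpy as np
import xarray as xr

# Stand-in for a loaded keypoint annotations dataset
rng = np.random.default_rng(0)
keypoints_ds = xr.Dataset(
    data_vars={
        "position": (
            ("image_id", "space", "keypoint", "id"),
            rng.normal(size=(10, 2, 4, 3)),
        ),
    },
    coords={"space": ["x", "y"]},
)

# Express keypoints in an egocentric frame: centre each instance's
# keypoints on that instance's centroid.
centroid = keypoints_ds["position"].mean(dim="keypoint", skipna=True)
egocentric = keypoints_ds["position"] - centroid

# Keypoints much farther from the centroid than is typical for that
# keypoint are candidates for mislabelled annotations.
dist = np.sqrt((egocentric**2).sum(dim="space"))
threshold = dist.mean(dim="image_id") + 3 * dist.std(dim="image_id")
outliers = dist > threshold  # boolean (image_id, keypoint, id) array
```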