ROBUST-MIPS: A Combined Skeletal Pose and Instance Segmentation Dataset for Laparoscopic Surgical Instruments
This repository provides an implementation of surgical-tool pose estimation based on ROBUST-MIPS, built on top of MMPose v1.3.0.
Localisation of surgical tools constitutes a foundational building block for computer-assisted interventional technologies. Works in this field typically focus on training deep learning models to perform segmentation tasks. Performance of learning-based approaches is limited by the availability of diverse annotated data. We argue that skeletal pose annotations are a more efficient annotation approach for surgical tools, striking a balance between richness of semantic information and ease of annotation, thus allowing for accelerated growth of available annotated data. To encourage adoption of this annotation style, we present ROBUST-MIPS, a combined tool pose and tool instance segmentation dataset derived from the existing ROBUST-MIS dataset. Our enriched dataset facilitates the joint study of these two annotation styles and allows head-to-head comparison on various downstream tasks. To demonstrate the adequacy of pose annotations for surgical tool localisation, we set up a simple benchmark using popular pose estimation methods and observe high-quality results. To ease adoption, together with the dataset, we release our benchmark models and custom tool pose annotation software.
🔗 Download ROBUST_MIPS_toolpose
🔗 Download tool-pose-annotation-gui
🔗 Download ROBUST_MIPS Dataset
The configuration files for the benchmarked models are located in:
ROBUST_MIPS_toolpose/surgicaltool_bm/configs
Since this project is built upon MMPose (which defaults to human pose estimation), specific overrides are required to adapt it for the ROBUST-MIPS dataset.
Taking the ROBUST-MIPS dataset as an example: keypoint_info defines the keypoints, skeleton_info defines the connections between keypoints, and sigmas gives the per-keypoint localization uncertainty used in the OKS calculation.
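As a rough illustration, a dataset meta-information definition in the MMPose dataset_info layout for a four-keypoint tool skeleton might look like the sketch below. The keypoint names (apart from the two tips), colours, skeleton links, and sigma values are placeholders chosen for illustration, not the values shipped with ROBUST-MIPS.

```python
# Hypothetical dataset meta information in the MMPose dataset_info layout.
# Names, colours, links and sigmas are illustrative placeholders only.
dataset_info = dict(
    dataset_name='robust_mips_sketch',
    keypoint_info={
        0: dict(name='kp_a', id=0, color=[255, 128, 0], type='', swap=''),
        1: dict(name='kp_b', id=1, color=[255, 128, 0], type='', swap=''),
        2: dict(name='tip1', id=2, color=[0, 255, 0], type='', swap=''),
        3: dict(name='tip2', id=3, color=[0, 255, 0], type='', swap=''),
    },
    # Each link connects two keypoints by name.
    skeleton_info={
        0: dict(link=('kp_a', 'kp_b'), id=0, color=[51, 153, 255]),
        1: dict(link=('kp_b', 'tip1'), id=1, color=[51, 153, 255]),
        2: dict(link=('kp_b', 'tip2'), id=2, color=[51, 153, 255]),
    },
    # One weight and one sigma per keypoint, in keypoint-id order.
    joint_weights=[1.0, 1.0, 1.0, 1.0],
    sigmas=[0.05, 0.05, 0.05, 0.05],
)
```

The lists joint_weights and sigmas must have one entry per keypoint, and every skeleton link must refer to a name defined in keypoint_info.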
Set NUM_KEYPOINTS=4 and update it in the model configuration.
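In stock MMPose configs the keypoint count usually reaches the model through the head's out_channels field. A minimal sketch of the override, assuming the remaining model settings are defined elsewhere in the config:

```python
NUM_KEYPOINTS = 4  # ROBUST-MIPS tools are annotated with four keypoints

# Partial override: only the field that depends on the keypoint count is shown.
# The rest of the model definition (backbone, head type, losses) is assumed to
# come from the base config.
model = dict(
    head=dict(
        out_channels=NUM_KEYPOINTS,
    ),
)
```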
Since the Top-Down paradigm is a two-stage approach, the first stage requires generating bounding box detection results. In this implementation, mmdetection was used to provide the bbox detection results. The corresponding result files should be set in the val_dataloader and test_dataloader.
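In MMPose 1.x, top-down datasets accept a bbox_file argument pointing at the detection results. The sketch below shows where that argument would go. The data_root value is an assumption, and the file names follow the dataset layout described in this README; the dataset type and pipeline are omitted.

```python
# Assumed location of the dataset folder; adjust to your setup.
data_root = 'dataset/'

val_dataloader = dict(
    dataset=dict(
        data_root=data_root,
        ann_file='cocoformat_val.json',
        # Detection results produced by the first-stage detector.
        bbox_file=data_root + 'converted_detections_val.json',
        test_mode=True,
    ),
)

test_dataloader = dict(
    dataset=dict(
        data_root=data_root,
        ann_file='cocoformat_test.json',
        bbox_file=data_root + 'converted_detections_test.json',
        test_mode=True,
    ),
)
```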
Given the unordered nature of the surgical tool endpoints, when evaluating prediction results, we should select the optimal match for the tip pairs (considering both the standard and swapped assignments of Tip1 and Tip2). This metric is implemented via swap_coco_metric (surgicaltool_bm/custom_src/evaluation/metric/swap_coco_metric.py), which needs to be imported in the config file using custom_imports.
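The idea of scoring both tip assignments and keeping the better one can be sketched with a COCO-style OKS in NumPy. This is an illustration of the principle only, not the code in swap_coco_metric.py; the function names and the assumption that the tips sit at indices 2 and 3 are hypothetical.

```python
import numpy as np


def oks(pred, gt, visible, area, sigmas):
    """COCO-style object keypoint similarity for a single instance.

    pred, gt: (K, 2) arrays of x, y coordinates.
    visible:  (K,) array, > 0 where the ground-truth keypoint is labelled.
    """
    variances = (2.0 * sigmas) ** 2
    d2 = np.sum((pred - gt) ** 2, axis=1)
    e = d2 / variances / (area + np.spacing(1)) / 2.0
    mask = visible > 0
    if not mask.any():
        return 0.0
    return float(np.mean(np.exp(-e[mask])))


def swap_invariant_oks(pred, gt, visible, area, sigmas, tip_ids=(2, 3)):
    """Best OKS over the standard and the swapped tip assignment.

    tip_ids are the (assumed) indices of Tip1 and Tip2.
    """
    a, b = tip_ids
    swapped = pred.copy()
    swapped[[a, b]] = pred[[b, a]]
    return max(oks(pred, gt, visible, area, sigmas),
               oks(swapped, gt, visible, area, sigmas))
```

With this scoring, a prediction that localises both tips correctly but labels them in the opposite order is not penalised.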
Since our annotation software (tool-pose-annotation-gui) generates a separate JSON annotation file for each image, we need to aggregate these independent, single-image annotation files into three unified annotation files corresponding to the training, validation, and testing sets. The aggregated format should align with COCO.
Importantly, when generating these annotation files we modified the formula for calculating area. Instead of storing the standard area, we store a modified value (see the utilities folder).
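The aggregation step can be sketched as follows. The per-image JSON schema used here (file_name, width, height, instances with bbox and keypoints) is a hypothetical stand-in, since the actual output format of the annotation GUI is not described in this README, and the area field is filled with a plain bounding-box area as a placeholder rather than the repository's modified value.

```python
import json
from pathlib import Path

NUM_KEYPOINTS = 4


def aggregate_to_coco(json_dir, out_file):
    """Merge per-image JSON files into one COCO-style keypoint file.

    Assumed (hypothetical) per-image schema:
      {"file_name": str, "width": int, "height": int,
       "instances": [{"bbox": [x, y, w, h],
                      "keypoints": [x1, y1, v1, ...]}]}
    """
    images, annotations = [], []
    ann_id = 1
    paths = sorted(Path(json_dir).glob('*.json'))
    for img_id, path in enumerate(paths, start=1):
        record = json.loads(path.read_text())
        images.append(dict(id=img_id, file_name=record['file_name'],
                           width=record['width'], height=record['height']))
        for inst in record['instances']:
            kpts = inst['keypoints']
            assert len(kpts) == 3 * NUM_KEYPOINTS, path
            x, y, w, h = inst['bbox']
            annotations.append(dict(
                id=ann_id, image_id=img_id, category_id=1, iscrowd=0,
                bbox=[x, y, w, h],
                # Placeholder: plain bbox area, not the repository's value.
                area=w * h,
                keypoints=kpts,
                num_keypoints=sum(1 for v in kpts[2::3] if v > 0),
            ))
            ann_id += 1
    coco = dict(images=images, annotations=annotations,
                categories=[dict(id=1, name='tool')])
    Path(out_file).write_text(json.dumps(coco))
    return coco
```

Write the output file outside json_dir so that a second run does not pick it up as an input.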
This project extends the MMPose v1.3.0 framework to support:
- Custom dataset: ROBUST-MIPS for laparoscopic frames
- Custom evaluation metrics: custom OKS metric parameters and swap_coco_metric
- Plug-and-play interface: uses MMPose’s tools/train.py & tools/test.py
We provide a Dockerfile to lock in your Python/CUDA environment and install all dependencies.
Before you begin, ensure you have met the following requirements:
- Docker image: build the image from the provided Dockerfile
- MMPose v1.3.0 installed (fork your copy and install in editable mode)
ROBUST_MIPS_toolpose/
├── surgicaltool_bm/ # Main package for surgical tool pose estimation
│ ├── configs/ # Configuration files
│ ├── custom_src/ # Custom source code and extensions
│ ├── tools/ # Utility scripts and training/testing tools
│ └── setup.py # Package setup script
│
├── dataset/ # ROBUST-MIPS
├── mmpose/ # MMPose lib
├── utilities/ # Processing tools for ROBUST-MIPS datasets
│
├── .dockerignore # Files ignored by Docker build
├── .gitignore # Files ignored by Git
├── build.sh # Shell script to build the environment
├── Dockerfile # Docker build file
└── README.md # Project documentation (this file)
The reorganizedata.py script restructures the dataset, originally organized in a hierarchical format as Training/Testing -> Surgery type -> Procedure -> Frame -> Images and annotations. It consolidates images and JSON annotation files into a simplified format:
- rename_training/img and rename_training/json for training images and annotations
- rename_val/img and rename_val/json for validation images and annotations
- rename_testing/img and rename_testing/json for testing images and annotations
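The kind of flattening described above can be sketched with pathlib and shutil. This is not the code of reorganizedata.py: the function name, the .png image suffix, and the scheme of joining folder names with underscores are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def flatten_split(src_root, dst_root, image_suffix='.png'):
    """Copy images and per-image JSON files from a nested tree into
    dst_root/img and dst_root/json. Returns the number of files copied."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    img_dir, json_dir = dst_root / 'img', dst_root / 'json'
    img_dir.mkdir(parents=True, exist_ok=True)
    json_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in sorted(src_root.rglob('*')):
        if not path.is_file() or path.suffix not in (image_suffix, '.json'):
            continue
        # Encode the folder hierarchy in the file name so names stay unique.
        flat_name = '_'.join(path.relative_to(src_root).parts)
        target = img_dir if path.suffix == image_suffix else json_dir
        shutil.copy2(path, target / flat_name)
        copied += 1
    return copied
```

For example, flatten_split('Training', 'rename_training') would collect every frame under Training into rename_training/img and rename_training/json. Keep dst_root outside src_root to avoid copying already flattened files again.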
Converts all JSON files from training/val/testing into COCO format JSON files
Verifies contents of cocoformat_train/val/test.json and visualizes annotations
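A basic consistency check of a COCO-format keypoint file can be sketched as below. It covers only structural checks (no visualisation), and the function name is hypothetical rather than taken from the repository's utilities.

```python
def check_coco_keypoints(coco, num_keypoints=4):
    """Return a list of human-readable problems found in a COCO keypoint dict."""
    problems = []
    image_ids = {img['id'] for img in coco.get('images', [])}
    seen_ann_ids = set()
    for ann in coco.get('annotations', []):
        ann_id = ann['id']
        if ann_id in seen_ann_ids:
            problems.append(f'duplicate annotation id {ann_id}')
        seen_ann_ids.add(ann_id)
        if ann['image_id'] not in image_ids:
            problems.append(f'annotation {ann_id} refers to a missing image')
        # COCO stores keypoints as a flat list of x, y, visibility triplets.
        if len(ann.get('keypoints', [])) != 3 * num_keypoints:
            problems.append(f'annotation {ann_id} has a wrong keypoint count')
    return problems
```

Loading dataset/cocoformat_val.json with json.load and passing the result to this function should return an empty list for a well-formed file.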
The ROBUST-MIPS dataset and our trained model weights are publicly available for download:
Your dataset directory should look like this:
dataset/
├── training/
│ ├── img/ # raw training images
│ └── json/ # any original per-image metadata
├── val/
│ ├── img/ # raw validation images
│ └── json/ # any original per-image metadata
├── testing/
│ ├── img/ # raw test images
│ └── json/ # any original per-image metadata
├── converted_detections_val.json # results of mmdetection model on val set
├── converted_detections_test.json # results of mmdetection model on testing set
├── cocoformat_train.json # COCO‐style train annotations
├── cocoformat_val.json # COCO‐style val annotations
└── cocoformat_test.json # COCO‐style test annotations
Training model:
cd /workspace/surgicaltool_bm
python /workspace/surgicaltool_bm/tools/train.py \
/workspace/surgicaltool_bm/configs/rtmpose-l_8xb256-420e_coco-256x192.py \
--resume
Testing:
cd /workspace/surgicaltool_bm
python /workspace/surgicaltool_bm/tools/test.py \
/workspace/surgicaltool_bm/configs/rtmpose-l_8xb256-420e_coco-256x192.py \
/workspace/surgicaltool_bm/work_dirs/best_coco_AP_epoch_285.pth \
--work-dir /workspace/surgicaltool_bm/work_dirs/test/rtmpose-l_8xb256-420e_coco-256x192 \
--show-dir outputs \
--out /workspace/surgicaltool_bm/work_dirs/test/rtmpose-l_8xb256-420e_coco-256x192/metric_results.json \
--cfg-options='model.test_cfg.output_heatmaps=True'
@misc{han2025robustmipscombinedskeletalpose,
title={ROBUST-MIPS: A Combined Skeletal Pose and Instance Segmentation Dataset for Laparoscopic Surgical Instruments},
author={Zhe Han and Charlie Budd and Gongyu Zhang and Huanyu Tian and Christos Bergeles and Tom Vercauteren},
year={2025},
eprint={2508.21096},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.21096},
}
ROBUST-MIPS is released under a Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) license, which means that it is publicly available for non-commercial use.