Marko Mihajlovic edited this page Oct 5, 2025 · 1 revision

Spatial Mask Merging (SMM)

Welcome to the Spatial Mask Merging (SMM) wiki!
This space provides extended documentation, background, and usage examples for the SMM framework.


🧩 Overview

Spatial Mask Merging (SMM) implements a paper-faithful ILP-based correlation clustering algorithm for merging overlapping instance segmentation masks across image tiles.
It was developed for research in large-scale aerial/remote-sensing imagery and general dense instance detection.
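
For orientation, correlation clustering is commonly written as the integer linear program below. This is the textbook formulation, not necessarily the exact model built in smm/smm.py. The symbols are introduced here for illustration only: x_ij = 1 means masks i and j are assigned to the same merged instance, and w_ij is a signed affinity (positive values favor merging, negative values favor keeping the masks apart). How SMM derives its weights from box and mask overlap is not specified on this page.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{i<j} w_{ij}\, x_{ij} \\
\text{s.t.}\quad & x_{ij} + x_{jk} - x_{ik} \le 1 && \text{for all distinct } i, j, k, \\
& x_{ij} = x_{ji}, \quad x_{ij} \in \{0, 1\} && \text{for all } i \ne j.
\end{aligned}
```

The first constraint enforces transitivity: if i is merged with j and j with k, then i must be merged with k. In this full formulation the number of such constraints grows cubically with the number of candidate masks, which is one likely reason exact solving becomes slow on dense scenes.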


📚 Contents

  1. Overview
  2. Installation
  3. Running SMM
  4. Visualization
  5. Evaluation
  6. Optimization
  7. Known Limitations
  8. Citation & License

⚙️ Installation

Clone the repository and install dependencies:

git clone https://github.com/mihajlov39547/spatial-mask-merging.git
cd spatial-mask-merging
pip install -r requirements.txt
# Optional extras
pip install torch pulp optuna rtree opencv-python-headless matplotlib

🚀 Running SMM

To merge overlapping instance masks:

python smm/smm.py --pred_dir /path/to/predictions --out_dir ./merged_jsons

Notes:

  • --iou_thresh sets the box-level IoU threshold.
  • --mask_iou_thresh sets the mask-level IoU threshold used when merging overlapping masks.
  • --method ilp selects exact correlation clustering (this is the default).
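
To make the two thresholds concrete, here is a minimal sketch of box-level and mask-level IoU. It is not taken from smm/smm.py: the function names, the (x1, y1, x2, y2) box layout, the pixel-set mask representation, the 0.5 default values, and the idea of using the box test as a cheap prefilter before the mask test are all assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_iou(a, b):
    """IoU of two masks given as sets of (row, col) pixel coordinates."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def is_merge_candidate(box_a, mask_a, box_b, mask_b,
                       iou_thresh=0.5, mask_iou_thresh=0.5):
    """Hypothetical two-stage test; the defaults are NOT SMM's defaults."""
    # Stage 1 (assumed): cheap box overlap check to discard distant pairs.
    if box_iou(box_a, box_b) < iou_thresh:
        return False
    # Stage 2 (assumed): pixel-level overlap check on the surviving pairs.
    return mask_iou(mask_a, mask_b) >= mask_iou_thresh
```

Under these assumptions, raising --iou_thresh prunes more pairs before any pixel-level comparison, while --mask_iou_thresh decides which of the remaining pairs overlap enough to be considered the same object.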

🖼️ Visualization

Visualize merged predictions or ground truth annotations:

python tools/visualization.py --pred_dir /path/to/pred_jsons --image_dir /path/to/images

The script writes one *_visualization.pdf overlay per image for qualitative inspection.


📊 Evaluation

Evaluate merged results against ground truth:

python tools/evaluation.py --pred_dir ./merged_jsons --gt_dir ./gt_json --img_dir ./images --out_csv ./results/eval.csv

Outputs include:

  • Precision, Recall, F1, Dice, PQ
  • Average Fragments, Count Error, Mean Error
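
For readers unfamiliar with the first group of metrics, the sketch below computes them from their standard definitions. It assumes predictions have already been matched to ground truth (how tools/evaluation.py performs that matching is not described here), and the function names are hypothetical. The fragment and count metrics are left out because their definitions are not given on this page.

```python
def detection_scores(tp, fp, fn):
    """Precision, recall and F1 from matched/unmatched instance counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def panoptic_quality(matched_ious, fp, fn):
    """PQ = sum of IoUs over matched pairs / (TP + FP/2 + FN/2)."""
    tp = len(matched_ious)
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(matched_ious) / denom if denom else 0.0


def dice(a, b):
    """Dice coefficient of two masks given as sets of pixels."""
    total = len(a) + len(b)
    # Returning 0.0 for two empty masks is a convention chosen here.
    return 2 * len(a & b) / total if total else 0.0


precision, recall, f1 = detection_scores(tp=8, fp=2, fn=2)
pq = panoptic_quality([0.9, 0.7], fp=1, fn=1)
```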

🔧 Optimization

Use the built-in Optuna optimizer to tune merging thresholds:

python tools/optimize_smm.py --pred_dir ./preds --gt_dir ./gt_json --n_trials 50

The optimizer runs correlation clustering for multiple parameter sets and reports the best-performing configuration.
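
The loop behind such a tuner can be sketched as: sample thresholds, score the merged result, keep the best configuration. The example below substitutes plain random search, using only the standard library, for Optuna's sampler, and replaces the real "merge then evaluate" step with a toy score function. Everything in it (function names, search ranges, the toy score) is a hypothetical stand-in, not the contents of tools/optimize_smm.py.

```python
import random


def toy_score(iou_thresh, mask_iou_thresh):
    """Stand-in for 'run SMM, then evaluate'; peaks at (0.4, 0.6)."""
    return 1.0 - (iou_thresh - 0.4) ** 2 - (mask_iou_thresh - 0.6) ** 2


def random_search(score_fn, n_trials=50, seed=0):
    """Sample thresholds uniformly and keep the best-scoring pair."""
    rng = random.Random(seed)  # fixed seed makes the search reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "iou_thresh": rng.uniform(0.1, 0.9),       # assumed range
            "mask_iou_thresh": rng.uniform(0.1, 0.9),  # assumed range
        }
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_score, n_trials=50)
print(best_params, best_score)
```

In the real tool each trial is far more expensive than this toy score, since every parameter set requires a full clustering run, so --n_trials directly trades tuning quality against runtime.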


⚠️ Known Limitations

  • Testing on large-scale datasets (iSAID, DIOR) has been limited.
  • The ILP backend (PuLP) can be slow on high-resolution images.
  • The visualization scripts assume that all annotation files share a consistent format.

🧠 Citation & License

If you use this code in your research, please cite:

Marko Mihajlović, Spatial Mask Merging (SMM), 2025.
Copyright (c) 2025 Marko Mihajlović, with contributions from Marina Marjanović.
Licensed under the MIT License.

