I got interested in computer vision for architectural floor plans. CubiCasa5k is a great public dataset for this — 5,000 floor plan images with structured annotations. Here's what I learned training YOLOv8 on it.
- Dataset exploration: What CubiCasa5k actually looks like, class distributions, annotation quality
- Object detection training: Converting SVG annotations to YOLO format, training YOLOv8 from scratch
- Honest evaluation: Per-class metrics, failure cases, and what surprised me
- Reusable utilities: Dataset loader, visualization tools, evaluation helpers
- Python 3.9+
- GPU recommended (training took ~2 hours on an RTX 3080)
- ~3 GB disk space for the dataset
# Clone this repo
git clone https://github.com/DevontiaW/floorplan-detection-lab.git
cd floorplan-detection-lab
# Install dependencies
pip install -r requirements.txt
# Download CubiCasa5k (see docs/DATASET_GUIDE.md for details)
# Then run the notebooks in order:
# 1. notebooks/01_dataset_exploration.ipynb
# 2. notebooks/02_train_yolov8.ipynb
# 3. notebooks/03_evaluation.ipynb

After 50 epochs of training on CubiCasa5k:
| Class | mAP@0.5 | Notes |
|---|---|---|
| Wall | ~0.75 | High contrast, consistent appearance — the "easy" win |
| Door | ~0.60 | Decent, but swing arcs cause confusion |
| Window | ~0.55 | Often thin and low-contrast |
| Room | ~0.45 | Large, overlapping regions are tricky for bbox detection |
| Stair | ~0.40 | Rare class, high variance in representation |
These numbers are approximate and will vary with your hardware and random seeds. The notebooks walk through the full story — including the parts that didn't work well.
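
To help read the table: at the 0.5 threshold, a predicted box counts as a match only when its intersection-over-union (IoU) with a ground-truth box of the same class is at least 0.5. Below is a minimal sketch of that overlap test, assuming boxes are `(x1, y1, x2, y2)` pixel corners. The name `box_iou` is illustrative and is not taken from this repo's evaluation helpers.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# A thin 100x4 px box, predicted 2 px too low.
truth = (0, 0, 100, 4)
pred = (0, 2, 100, 6)
print(box_iou(truth, pred))  # about 0.33, below the 0.5 threshold
```

A 2 px offset is enough to push a 4 px tall box under the threshold, which is one plausible reason thin classes such as windows score lower than their visual simplicity would suggest.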
Key challenges:
- Scale variation across floor plans is massive (studio apartments vs. commercial buildings)
- Small fixtures are nearly invisible at 640px input resolution
- SVG-to-bbox conversion is lossy — CubiCasa's annotations are polygon-based, not box-based
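
The lossy polygon-to-box step can be sketched as follows. This is a simplified illustration, not the conversion code from the notebooks: it assumes the polygon vertices have already been extracted from the SVG as `(x, y)` pixel pairs, and the function names and the class id used here are hypothetical. The output follows the standard YOLO label layout of `class cx cy w h`, normalized to the image size.

```python
def polygon_to_yolo(points, img_w, img_h, class_id):
    """Convert polygon vertices (pixel coords) to one YOLO label line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Tightest axis-aligned box around the polygon, clamped to the image.
    x_min, x_max = max(0.0, min(xs)), min(float(img_w), max(xs))
    y_min, y_max = max(0.0, min(ys)), min(float(img_h), max(ys))
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def fill_ratio(points):
    """Fraction of the bounding box actually covered by the polygon."""
    # Shoelace formula for the polygon area.
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1]
        - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    bbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return abs(twice_area) / 2 / bbox_area if bbox_area > 0 else 0.0


# An L-shaped room in a 400x400 px image (class id 3 is a placeholder).
l_room = [(0, 0), (200, 0), (200, 100), (100, 100), (100, 200), (0, 200)]
print(polygon_to_yolo(l_room, 400, 400, class_id=3))
print(fill_ratio(l_room))
```

For this L-shaped room, a quarter of the resulting box lies outside the room itself. That is the kind of information the box format discards, and it is also why boxes for adjacent rooms tend to overlap.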
@inproceedings{kalervo2019cubicasa5k,
title={CubiCasa5K: A Dataset and an Improved Multi-task Model for Floorplan Image Analysis},
author={Kalervo, Ahti and Ylioinas, Juha and H{\"a}iki{\"o}, Markus and Karhu, Antti and Kannala, Juho},
booktitle={Scandinavian Conference on Image Analysis},
pages={475--486},
year={2019},
organization={Springer}
}

This project is licensed under the Apache License 2.0; see LICENSE for details.
The CubiCasa5k dataset has its own license — please check their repository for terms.