Real-time calibrated multi-camera 3D coordinate capture system featuring intrinsic/extrinsic camera calibration, multithreaded parallel processing, and robust markerless multi-person detection and tracking using YOLOv11, DeepSort, and MediaPipe Holistic for precise 3D landmark extraction across synchronized camera views.

NeuroLabsIITH/Set-up-calibrated-multi-cam-3D-coordinate-capture-environment


Set up Calibrated Multi-Cam 3D Coordinate Capture Environment for Clinical Scoring

Executive Summary

This repository presents a comprehensive, production-grade framework for real-time, markerless, multi-camera, and multi-person 3D coordinate capture, specifically tailored for quantitative clinical scoring environments. The system is meticulously designed to support established clinical protocols such as ARAT (Action Research Arm Test), UPDRS (Unified Parkinson's Disease Rating Scale), and Fugl-Meyer, with robust handling of scenarios involving both patients and clinical scorers within the same field of view.

Leveraging state-of-the-art computer vision and deep learning techniques, including YOLOv11 for person detection, DeepSort for multi-person tracking, and MediaPipe Holistic for detailed landmark extraction, this pipeline enables precise, synchronized, and scalable kinematic data acquisition across multiple calibrated cameras. Extensive statistical validation and visualization modules ensure the reliability and interpretability required for clinical and research-grade deployments.


Table of Contents

  • Features
  • System Architecture
  • Installation
  • Directory Structure
  • Usage
  • Command-Line Arguments
  • Data Output
  • Visualization and Statistical Analysis
  • Clinical Application Notes
  • Troubleshooting
  • Acknowledgements
  • License
  • Author


Features

  • Multi-Camera Calibration: Robust routines for intrinsic and extrinsic calibration with chessboard patterns, supporting high-precision 3D triangulation and undistortion.
  • Real-Time, Multi-Person Detection and Tracking: Seamless integration of YOLOv11 and DeepSort enables reliable detection and persistent tracking of multiple individuals (patients, scorers) even during bounding box overlaps.
  • Markerless 3D Landmark Extraction: MediaPipe Holistic provides real-time extraction of face, pose, and hand landmarks for each detected individual, across all cameras.
  • Parallelized Processing: ThreadPoolExecutor-based multithreading ensures optimal frame rates on multi-core systems, scaling efficiently with the number of persons and cameras.
  • Structured Data Output: Per-person, per-camera CSVs for face, pose, and hand coordinates, including frame-level temporal alignment.
  • Comprehensive Statistical Validation: Automated error quantification, advanced visualizations (histograms with KDE, CDFs, boxplots, Bland–Altman, regression, 3D scatter), and CSV-based reporting for calibration and tracking quality.
  • Clinical Scoring Support: Output data is directly compatible with ARAT, UPDRS, Fugl-Meyer, and other clinical movement scoring protocols.
  • Scalable and Extensible: Modular design enables adaptation to additional cameras, scoring protocols, or custom downstream analysis.

System Architecture

  • Input: Multiple synchronized video streams or live camera feeds, with both patient(s) and scorer(s) in frame.
  • Calibration: Intrinsic and extrinsic camera calibration using chessboard images, with automatic error metrics and visual feedback.
  • Detection & Tracking: YOLOv11 for initial person detection, DeepSort for track consistency across frames and cameras.
  • Landmark Extraction: MediaPipe Holistic (face, pose, hands) per detected ROI, executed in parallel threads.
  • Data Aggregation: Global 3D coordinates computed for each domain (face, pose, hand) and each tracked individual, stored per camera.
  • Statistical Analysis: Automated validation routines for calibration and tracking, with detailed plots and CSV summaries.
  • Output: Structured CSVs and visualizations for downstream clinical or research analysis.
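To make the data-aggregation step concrete: once a landmark is detected in two calibrated views, its global 3D position can be recovered by intersecting the two viewing rays. The sketch below uses the ray-midpoint method as an illustration (the repository's actual triangulation routine may differ, e.g. an OpenCV-based DLT); `triangulate_midpoint` and the camera positions are hypothetical names for this example.

```python
import math

def triangulate_midpoint(o1, d1, o2, d2):
    """Closest point between two 3D viewing rays (camera center + direction).

    With noisy detections the rays rarely intersect exactly, so the midpoint
    of the shortest segment connecting them is used as the 3D estimate.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w0 = [a - b for a, b in zip(o1, o2)]          # offset between ray origins
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    p, q = dot(d1, w0), dot(d2, w0)
    denom = b * b - a * c
    if abs(denom) < 1e-12:                        # near-parallel rays
        raise ValueError("rays are (nearly) parallel")
    s = (p * c - b * q) / denom                   # parameter along ray 1
    t = (b * p - a * q) / denom                   # parameter along ray 2
    p1 = [oi + s * di for oi, di in zip(o1, d1)]
    p2 = [oi + t * di for oi, di in zip(o2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]

def _unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

# Synthetic check: both rays pass exactly through the point (1, 2, 3).
target = (1.0, 2.0, 3.0)
cam1, cam2 = (0.0, 0.0, 0.0), (5.0, 0.0, 0.0)
ray1 = _unit([t - c for t, c in zip(target, cam1)])
ray2 = _unit([t - c for t, c in zip(target, cam2)])
point = triangulate_midpoint(cam1, ray1, cam2, ray2)
```

In the full pipeline these rays would come from back-projecting each camera's 2D landmark through its calibrated intrinsic and extrinsic parameters.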

Installation

  1. Clone the repository:

  git clone https://github.com/aryanbhardwaj24/Set-up-calibrated-multi-cam-3D-coordinate-capture-environment.git
  cd Set-up-calibrated-multi-cam-3D-coordinate-capture-environment

  2. Install dependencies:

  pip install -r requirements.txt

Directory Structure

.
├── capstone.py
├── utils.py
├── requirements.txt
├── calibration_input_images_cam1/
├── calibration_input_images_cam2/
├── calibration_input_videos/
├── calibration_compare_stats/
│   ├── face/
│   ├── pose/
│   └── hand/
├── multiperson_multithread_multicam/
│   ├── cam1/
│   └── cam2/
└── ...
  • capstone.py: Main pipeline and entry point for all operations.
  • utils.py: Utility functions (FPS calculation, drawing, etc.).
  • requirements.txt: All Python dependencies.
  • calibration_input_images_camX/: Chessboard images for camera X calibration.
  • calibration_input_videos/: Raw video files for calibration and capture.
  • calibration_compare_stats/: Statistical results and plots for calibration quality.
  • multiperson_multithread_multicam/: Output CSVs for each person/camera.
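As an illustration of the kind of helper utils.py provides, here is a minimal rolling-average FPS counter; the class and parameter names below are hypothetical, not the repository's actual API.

```python
import time
from collections import deque

class FpsCalc:
    """Rolling-average FPS counter: averages the last `buffer_len`
    frame intervals to smooth out per-frame jitter."""

    def __init__(self, buffer_len=30):
        self._start = time.perf_counter()
        self._diffs = deque(maxlen=buffer_len)

    def get(self):
        """Call once per frame; returns the smoothed frames-per-second."""
        now = time.perf_counter()
        self._diffs.append(now - self._start)
        self._start = now
        avg = sum(self._diffs) / len(self._diffs)   # mean frame time
        return round(1.0 / avg, 2) if avg > 0 else 0.0

fps = FpsCalc(buffer_len=10)
for _ in range(5):
    time.sleep(0.01)      # simulate ~10 ms of per-frame work
    rate = fps.get()
```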

Usage

Camera Calibration

  1. Capture Chessboard Images:

    • Acquire 10–20 high-quality chessboard images per camera from diverse angles.
    • Store images in calibration_input_images_cam1/, calibration_input_images_cam2/, etc.
  2. Run Calibration:

    • Calibration is handled automatically at runtime. The system will:
      • Detect chessboard corners.
      • Calculate intrinsic (camera matrix, distortion) and extrinsic parameters.
      • Save undistorted images and error metrics for verification.
  3. Review Calibration Quality:

    • Visual and statistical outputs (mean, std, per-axis and Euclidean errors) are generated in calibration_compare_stats/.
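The per-axis and Euclidean error summaries mentioned in step 3 can be computed along the following lines; this is a self-contained sketch of the statistics (function and key names are illustrative, not the repository's actual API).

```python
import math
import statistics

def error_summary(raw_pts, cal_pts):
    """Per-axis and Euclidean error statistics between raw and calibrated
    coordinate lists (each a list of equal-length (x, y) or (x, y, z) tuples).

    Returns a dict of mean/std per axis plus mean/std Euclidean error,
    mirroring the kind of summary written to calibration_compare_stats/.
    """
    ndim = len(raw_pts[0])
    # Signed per-axis errors, one list per axis.
    axis_err = [[c[i] - r[i] for r, c in zip(raw_pts, cal_pts)]
                for i in range(ndim)]
    # Point-wise Euclidean distances between raw and calibrated points.
    eucl = [math.dist(r, c) for r, c in zip(raw_pts, cal_pts)]
    summary = {}
    for i, errs in enumerate(axis_err):
        summary[f"axis{i}_mean"] = statistics.fmean(errs)
        summary[f"axis{i}_std"] = statistics.pstdev(errs)
    summary["euclidean_mean"] = statistics.fmean(eucl)
    summary["euclidean_std"] = statistics.pstdev(eucl)
    return summary

# Toy data: calibration shifts every point by +0.1 along the first axis.
raw = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
cal = [(0.1, 0.0), (1.1, 1.0), (2.1, 2.0)]
stats = error_summary(raw, cal)
```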

Single/Multi-Person Capture

  • Single Camera (Single/Multi-Person):
  python capstone.py --device 0 --width 960 --height 540
  • The pipeline detects, tracks, and extracts landmarks for all persons, saving results to CSV.

Multi-Camera, Multi-Person Capture

  • Full Multi-Camera, Multi-Person, Multi-Threaded Mode:
  python capstone.py
  • By default, main_multiperson_multithreaded_multicam() is invoked.
  • The system synchronizes two cameras, detects and tracks multiple persons with DeepSort, and extracts landmarks in parallel for each person and camera.
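The per-person parallelism described above can be sketched with ThreadPoolExecutor, which the pipeline uses for multithreading. In this sketch the MediaPipe Holistic call is stubbed out (the real pipeline would run holistic.process() on each cropped ROI); `extract_landmarks` and `process_frame` are illustrative names, not the repository's actual functions.

```python
from concurrent.futures import ThreadPoolExecutor

def extract_landmarks(person_id, roi):
    """Stand-in for the per-person landmark call. The real pipeline runs
    MediaPipe Holistic on the ROI cropped around this DeepSort track."""
    # Echo a fake landmark so the data flow is visible in this sketch.
    return person_id, [(0.5, 0.5, 0.0)]

def process_frame(rois_by_person, max_workers=4):
    """Run landmark extraction for every tracked person in parallel,
    one task per person, and gather results keyed by track ID."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(extract_landmarks, pid, roi)
                   for pid, roi in rois_by_person.items()]
        return dict(f.result() for f in futures)

# Two tracked people (e.g. patient = track 1, scorer = track 2) in one frame.
results = process_frame({1: "roi_patient", 2: "roi_scorer"})
```

Because landmark extraction is largely library-bound, threads (rather than processes) are enough to keep multiple persons and cameras busy on a multi-core machine.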

Command-Line Arguments

Argument                     Description                        Default
--device                     Camera device index or video file  0
--width                      Frame width                        960
--height                     Frame height                       540
--upper_body_only            Only track upper body              False
--min_detection_confidence   Min confidence for detection       0.5
--min_tracking_confidence    Min confidence for tracking        0.5
--use_brect                  Draw bounding rectangles           False
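An argparse parser consistent with the table above would look like this; the actual parser lives in capstone.py, so treat this as a sketch of the interface rather than the definitive implementation.

```python
import argparse

def get_args(argv=None):
    """Parse the capture options listed in the table above."""
    parser = argparse.ArgumentParser(description="Multi-cam 3D capture")
    parser.add_argument("--device", default="0",
                        help="camera device index or video file")
    parser.add_argument("--width", type=int, default=960)
    parser.add_argument("--height", type=int, default=540)
    parser.add_argument("--upper_body_only", action="store_true")
    parser.add_argument("--min_detection_confidence", type=float, default=0.5)
    parser.add_argument("--min_tracking_confidence", type=float, default=0.5)
    parser.add_argument("--use_brect", action="store_true")
    return parser.parse_args(argv)

# Example: second camera at 720p-width, with bounding rectangles drawn.
args = get_args(["--device", "1", "--width", "1280", "--use_brect"])
```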

Data Output

  • CSV Files: For each tracked individual and each camera, CSVs are generated for face, pose, left hand, and right hand. Each row contains: frame, x, y, z.
  • FPS Logs: Per-frame FPS is logged for performance profiling.
  • Calibration Stats: Per-frame and summary statistics for calibration accuracy are output as CSV and plots.
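The frame,x,y,z CSV layout described above can be produced with the standard csv module. This is a minimal sketch (the writer function name is hypothetical); in the real pipeline the file object would be an actual output path such as a per-person file under multiperson_multithread_multicam/cam1/.

```python
import csv
import io

def write_landmark_csv(fileobj, rows):
    """Write one person's landmark stream in the frame,x,y,z layout used
    for the per-person, per-camera output files."""
    writer = csv.writer(fileobj)
    writer.writerow(["frame", "x", "y", "z"])
    for frame_idx, (x, y, z) in rows:
        writer.writerow([frame_idx, x, y, z])

# Two frames of a single landmark stream, written to an in-memory buffer.
buf = io.StringIO()
write_landmark_csv(buf, [(0, (0.12, 0.34, -0.05)),
                         (1, (0.13, 0.35, -0.04))])
output = buf.getvalue()
```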

Visualization and Statistical Analysis

  • Basic Plots:
    • Raw vs. calibrated coordinates (per axis)
    • Error vs. frame with mean ± std deviation bands
    • Euclidean error vs. frame
  • Advanced Plots:
    • Histograms with KDE for error distributions
    • Cumulative Distribution Function (CDF) plots
    • Box-and-whisker plots
    • Bland–Altman plots for agreement analysis
    • Scatter plots with regression and Pearson correlation
    • 3D scatter plots of raw vs. calibrated points
  • Automated Generation: All plots are auto-generated and organized by domain (face, pose, hand) and camera.
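The numbers behind a Bland–Altman plot are simple to compute: the bias (mean difference between the two paired series) and the 95% limits of agreement at bias ± 1.96·SD. The sketch below computes them with the standard library; the function name and sample values are illustrative only.

```python
import statistics

def bland_altman(a, b):
    """Bland–Altman agreement statistics for two paired measurement series:
    returns (bias, lower limit, upper limit), where the limits of
    agreement are bias ± 1.96 times the SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)          # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Toy paired series standing in for raw vs. calibrated measurements.
raw = [10.0, 12.0, 11.0, 13.0, 12.5]
calibrated = [10.2, 11.9, 11.3, 12.8, 12.6]
bias, lo, hi = bland_altman(raw, calibrated)
```

Plotting each pair's mean against its difference, with horizontal lines at `bias`, `lo`, and `hi`, yields the familiar Bland–Altman figure.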

Clinical Application Notes

  • Multi-Person, Multi-Role Support: Designed for real-world clinical environments, robustly handling multiple patients and scorers in the same field of view, even under occlusion and bounding box overlap.
  • Protocol Compatibility: Output data is directly applicable to ARAT, UPDRS, Fugl-Meyer, and other movement scoring protocols, supporting both quantitative and qualitative analysis.
  • Data Integrity and Reliability: Calibration and error quantification routines ensure high confidence in 3D kinematic data, suitable for clinical research and regulatory requirements.

Troubleshooting

  • Calibration Issues: Ensure clear, well-lit chessboard images with sufficient positional diversity. Confirm checkerboard size matches code configuration.
  • Performance Bottlenecks: Adjust the number of threads in ThreadPoolExecutor to match available CPU cores. Reduce video resolution if necessary.
  • Tracking Loss: Maintain adequate lighting and minimize occlusions. DeepSort is robust but may be challenged by severe overlaps or rapid movement.
  • Landmark Extraction Failures: MediaPipe Holistic may struggle with extreme poses or occlusions. Consider integrating additional models for specialized scenarios.

Acknowledgements

This project was made possible by the continuous guidance and mentorship of Dr. Mohan Raghavan, who provided the opportunity and support to pursue this work over a dedicated four-month period. The development also builds upon the collective innovations of the open-source community in computer vision, deep learning, and clinical biomechanics.


License

This project is licensed under the MIT License - see the LICENSE file for details.


Author

Aryan Bhardwaj


For questions, feature requests, or contributions, please open an issue or contact the author directly.


Note:
This repository is intended for advanced users, clinical researchers, and developers seeking a robust, extensible, and high-performance solution for multi-person, multi-camera 3D kinematic capture in clinical scoring environments. For detailed API documentation, refer to the code comments and function docstrings in capstone.py.
