A ROS 2 system for real-time 3D object detection using YOLO-based 2D detection combined with RGB-D camera depth information.
This system integrates YOLO V8 (Ultralytics) or YOLO V3 (Darknet) for 2D object detection with gb_visual_detection_3d (3D bounding box calculation) to provide complete 3D localization of detected objects in real-world coordinates.
The system now uses YOLO V8 (Ultralytics) by default, providing:
- 10-20% higher accuracy than YOLO V3
- 2-3x faster inference on the same hardware
- Model sizes up to 30x smaller than YOLO V3
- Better small object detection
See YOLO_V8_MIGRATION.md for migration details.
- 2D Detection: the YOLO node (ultralytics_ros for YOLO V8, or darknet_ros for YOLO V3) detects objects in RGB images
- 3D Projection: gb_visual_detection_3d maps 2D bounding boxes to 3D space using point cloud data
- Output: 3D bounding boxes with position (x,y,z) and dimensions in meters
RGB Image → YOLO Detection → 2D Bounding Boxes
↓
Point Cloud → Depth Information → 3D Bounding Boxes
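The projection step above can be sketched in Python. This is a simplified illustration under assumptions (an organized point cloud stored as an H×W×3 NumPy array aligned pixel-for-pixel with the RGB image, with depth along z), not the package's actual implementation; the `max_depth_spread` argument mirrors the `maximum_detection_threshold` parameter:

```python
import numpy as np

def bbox_2d_to_3d(cloud, xmin, ymin, xmax, ymax, max_depth_spread=0.8):
    """Map a 2D pixel bounding box to a 3D box using an organized point cloud.

    cloud: H x W x 3 array of (x, y, z) points aligned with the RGB image,
           NaN where depth is missing.
    max_depth_spread: discard points farther than this (meters) behind the
                      closest point in the box, so background is filtered out.
    """
    region = cloud[ymin:ymax, xmin:xmax].reshape(-1, 3)
    region = region[~np.isnan(region).any(axis=1)]  # drop pixels without depth
    if region.size == 0:
        return None
    # Keep only points close (in depth) to the nearest point in the box.
    z_near = region[:, 2].min()
    region = region[region[:, 2] <= z_near + max_depth_spread]
    lo, hi = region.min(axis=0), region.max(axis=0)
    return {"xmin": lo[0], "xmax": hi[0],
            "ymin": lo[1], "ymax": hi[1],
            "zmin": lo[2], "zmax": hi[2]}
```

The min/max over the surviving points gives the axis-aligned 3D bounds in meters, matching the output format described below.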
- RGB-D Camera (one of the following):
- Intel RealSense (D435, D455, etc.)
- Asus Xtion Pro
- Orbbec Astra Pro (tested)
- Any camera publishing RGB image + PointCloud2
- Ubuntu 22.04
- ROS 2 Humble
- CUDA (optional, for GPU acceleration)
- OpenCV
- Eigen3
git clone git@github.com:HUOJIAXI/3D_bounding_box_gb_visual_detection.git gb_ws
cd gb_ws

# Install ROS 2 dependencies
sudo apt update
sudo apt install -y \
ros-humble-cv-bridge \
ros-humble-vision-opencv \
ros-humble-image-transport \
ros-humble-tf2-ros \
ros-humble-tf2-sensor-msgs \
ros-humble-visualization-msgs
# Install system dependencies
sudo apt install -y \
libopencv-dev \
libeigen3-dev

# Install Ultralytics YOLO V8 dependencies
pip3 install -r src/ultralytics_ros/requirements.txt
# Or install manually:
pip3 install "ultralytics>=8.0.0" "opencv-python>=4.5.0" "numpy>=1.20.0" "torch>=2.0.0" "torchvision>=0.15.0"

Note: YOLO V8 models are automatically downloaded on first run. No manual download needed!
If you need to use YOLO V3:
# Download YOLOv3 weights (or use the provided script)
python3 download_weights.py
# Or manually download:
cd src/darknet_ros/darknet_ros/yolo_network_config/weights/
wget https://pjreddie.com/media/files/yolov3.weights

cd gb_ws
source /opt/ros/humble/setup.bash
colcon build --symlink-install
source install/setup.bash

Important: Make sure to build the ultralytics_ros package:
colcon build --packages-select ultralytics_ros
source install/setup.bash

Edit the configuration files to match your camera topics:
File: src/darknet_ros/darknet_ros/config/ros.yaml
darknet_ros:
  ros__parameters:
    subscribers:
      camera_reading:
        topic: /camera/color/image_raw  # Change to your RGB topic

File: src/gb_visual_detection_3d/darknet_ros_3d/config/darknet_3d.yaml
darknet3d_node:
  ros__parameters:
    darknet_ros_topic: /darknet_ros/bounding_boxes
    output_bbx3d_topic: /darknet_ros_3d/bounding_boxes
    point_cloud_topic: /camera/depth/points  # Change to your point cloud topic
    working_frame: camera_link  # Change to your camera frame
    maximum_detection_threshold: 0.8  # Max depth variation (meters)
    minimum_probability: 0.1  # Min detection confidence
    interested_classes: ["person", "book", "bottle"]  # Objects to detect

| Parameter | Description | Default |
|---|---|---|
| point_cloud_topic | PointCloud2 topic from RGB-D camera | /camera/depth/points |
| working_frame | TF frame for 3D coordinates | camera_link |
| maximum_detection_threshold | Max depth difference within object (m) | 0.8 |
| minimum_probability | Minimum detection confidence (0-1) | 0.1 |
| interested_classes | List of object classes to process | ["person", "book", "bottle"] |
# Terminal 1: Start your camera driver (example for RealSense)
ros2 launch realsense2_camera rs_launch.py \
enable_depth:=true \
enable_color:=true \
pointcloud.enable:=true
# Terminal 2: Run the complete 3D detection system
cd ~/gb_ws
source install/setup.bash
./run_3d_detection.sh

Or use the launch file directly (YOLO V8 by default):
source install/setup.bash
ros2 launch darknet_ros_3d darknet_ros_3d_complete.launch.py

Default launch (YOLO V8 Medium):
ros2 launch darknet_ros_3d darknet_ros_3d_complete.launch.py

Specify model variant:
# YOLO V8 Nano (fastest)
ros2 launch darknet_ros_3d darknet_ros_3d_complete.launch.py yolo_model:=yolov8n
# YOLO V8 Large (more accurate)
ros2 launch darknet_ros_3d darknet_ros_3d_complete.launch.py yolo_model:=yolov8l
# YOLO V11 Medium (latest generation)
ros2 launch darknet_ros_3d darknet_ros_3d_complete.launch.py yolo_model:=yolov11m

Direct YOLO V8 launch:
ros2 launch darknet_ros_3d yolov8_3d.launch.py

Use YOLO V3 (legacy):
# Option 1: Use launch argument
ros2 launch darknet_ros_3d darknet_ros_3d_complete.launch.py use_yolov8:=false
# Option 2: Use dedicated YOLO V3 launch file
ros2 launch darknet_ros_3d yolov3_3d.launch.py

Choose YOLO Version: See YOLO_SELECTION_GUIDE.md for a detailed comparison and selection guide.
If you prefer to run components separately:
With YOLO V8:
# Terminal 1: Camera driver
ros2 launch realsense2_camera rs_launch.py
# Terminal 2: YOLO V8 2D object detection
source install/setup.bash
ros2 launch ultralytics_ros yolov8m.launch.py
# Terminal 3: 3D bounding box calculation
source install/setup.bash
ros2 launch darknet_ros_3d darknet_ros_3d.launch.py

With YOLO V3 (legacy):
# Terminal 1: Camera driver
ros2 launch realsense2_camera rs_launch.py
# Terminal 2: YOLO V3 2D object detection
source install/setup.bash
ros2 launch darknet_ros darknet_ros.launch.py
# Terminal 3: 3D bounding box calculation
source install/setup.bash
ros2 launch darknet_ros_3d darknet_ros_3d.launch.py

# Terminal 4: Launch RViz
rviz2

RViz Configuration:
- Set Fixed Frame to camera_link (or your camera frame)
- Add displays:
  - Image → Topic: /darknet_ros/detection_image (2D detections)
  - PointCloud2 → Topic: /camera/depth/points (depth data)
  - MarkerArray → Topic: /darknet_ros_3d/markers (3D bounding boxes)
| Topic | Message Type | Description |
|---|---|---|
| /darknet_ros/bounding_boxes | darknet_ros_msgs/BoundingBoxes | 2D detections |
| /darknet_ros/detection_image | sensor_msgs/Image | Annotated image with 2D boxes |
| /darknet_ros_3d/bounding_boxes | gb_visual_detection_3d_msgs/BoundingBoxes3d | 3D detections |
| /darknet_ros_3d/markers | visualization_msgs/MarkerArray | 3D boxes for RViz |
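A detection's extents can be reduced to a center point and box dimensions for downstream use. The helper below is a minimal plain-Python sketch; the field names follow the BoundingBox3d message documented in this README, and in a real node it would run inside an rclpy subscription callback on /darknet_ros_3d/bounding_boxes (which is assumed, not shown):

```python
def box_center_and_size(xmin, xmax, ymin, ymax, zmin, zmax):
    """Return (center, size) in meters from BoundingBox3d-style extents."""
    center = ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0, (zmin + zmax) / 2.0)
    size = (xmax - xmin, ymax - ymin, zmax - zmin)
    return center, size

# Example with hypothetical detection values:
center, size = box_center_and_size(0.4, 0.8, -0.2, 0.2, 1.0, 1.6)
# center ≈ (0.6, 0.0, 1.3), size ≈ (0.4, 0.4, 0.6)
```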
| Topic | Message Type | Required |
|---|---|---|
| /camera/color/image_raw | sensor_msgs/Image | ✅ RGB image |
| /camera/depth/points | sensor_msgs/PointCloud2 | ✅ Point cloud |
string object_name # Object class (e.g., "person")
float64 probability # Detection confidence [0-1]
float64 xmin, xmax # X-axis bounds (meters)
float64 ymin, ymax # Y-axis bounds (meters)
float64 zmin, zmax     # Z-axis bounds (meters)

# List all topics
ros2 topic list
# Check if camera is publishing
ros2 topic hz /camera/color/image_raw
ros2 topic hz /camera/depth/points
# View 2D detections
ros2 topic echo /darknet_ros/bounding_boxes
# View 3D detections
ros2 topic echo /darknet_ros_3d/bounding_boxes

# Check TF tree
ros2 run tf2_tools view_frames
# Check specific transform
ros2 run tf2_ros tf2_echo base_link camera_link

# CPU and GPU usage
htop
nvidia-smi # If using GPU
# Node computational performance
ros2 run rqt_top rqt_top

Possible Causes:
- Point cloud not publishing
  - Check: ros2 topic hz /camera/depth/points
- Object class not in interested_classes
  - Edit config/darknet_3d.yaml to add the desired classes
- Probability threshold too high
  - Lower minimum_probability in config/darknet_3d.yaml
- TF frame mismatch
  - Verify working_frame matches your camera's frame name
Solutions:
- Check that the weights file exists: ls src/darknet_ros/darknet_ros/yolo_network_config/weights/
- Verify the camera topic is correct in config/ros.yaml
- Check image quality and lighting conditions
Solution:
- Ensure the camera driver publishes TF transforms
- Check the TF tree: ros2 run tf2_tools view_frames
- Verify the working_frame parameter matches the actual frame name
Solutions:
- Use GPU acceleration (requires CUDA)
- Use a smaller YOLO model (yolov8n or yolov3-tiny)
- Reduce camera resolution
- Increase detection threshold to process fewer objects
┌─────────────────┐
│ RGB-D Camera │
└────────┬────────┘
│
┌────┴─────────────────┐
│ │
┌───▼────────┐ ┌────────▼────────┐
│ RGB Image │ │ Point Cloud │
│ (640x480) │ │ (PointCloud2) │
└───┬────────┘ └────────┬────────┘
│ │
│ │
┌───▼─────────────┐ │
│ darknet_ros │ │
│ (YOLO Detection)│ │
└───┬─────────────┘ │
│ │
│ 2D Bounding Boxes │
│ │
└──────┬───────────────┘
│
┌──────▼──────────────────┐
│ darknet_ros_3d │
│ (3D Projection) │
│ - Map 2D→3D │
│ - Calculate bounds │
│ - Filter outliers │
└──────┬──────────────────┘
│
┌──────┴───────────────────┐
│ │
┌───▼──────────────┐ ┌────────▼────────┐
│ 3D Bounding Boxes│ │ RViz Markers │
│ (x,y,z + size) │ │ (Visualization)│
└──────────────────┘ └─────────────────┘
Default YOLO (COCO dataset) supports 80 classes including:
- person, bicycle, car, motorcycle, bus, truck
- bottle, cup, bowl, chair, couch, bed
- laptop, mouse, keyboard, cell phone, book
- See the COCO class list for the full set
Edit the interested_classes parameter to filter for specific objects.
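For example, to process only people and chairs (a hypothetical selection), the filter in config/darknet_3d.yaml would be:

```yaml
darknet3d_node:
  ros__parameters:
    interested_classes: ["person", "chair"]
```

Classes outside this list are still detected by YOLO in 2D but are skipped during 3D projection.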
- Use GPU: Significant speedup with CUDA-enabled darknet
- Optimize camera resolution: 640x480 is usually sufficient
- Limit interested_classes: Process only needed objects
- Adjust thresholds: Balance accuracy vs. false positives
- Use yolov3-tiny: Faster but less accurate alternative
This project uses:
- darknet_ros by Leggedrobotics
- gb_visual_detection_3d by Intelligent Robotics Labs
- darknet_ros: BSD License
- gb_visual_detection_3d: Apache 2.0 License
Feel free to submit issues and pull requests to improve the system.
For issues and questions:
- Check the Troubleshooting section above
- Review ROS 2 logs: ros2 run rqt_console rqt_console
- Open an issue on GitHub
Maintained by: HUOJIAXI
Contact: JasonHuoeclille@gmail.com
Repository: https://github.com/HUOJIAXI/3D_bounding_box_gb_visual_detection