Commit ac76195

Update README.md
1 parent 0e78e2a commit ac76195

1 file changed: +150 -61 lines changed

coffee_ws/src/coffee_vision/README.md

Lines changed: 150 additions & 61 deletions
Original file line number | Diff line number | Diff line change
@@ -1,67 +1,107 @@
11
# Coffee Vision Package
22

3-
A ROS2 package providing comprehensive computer vision capabilities including camera capture, face detection, and GUI visualization for the Coffee Buddy robot system.
3+
A ROS2 package providing comprehensive computer vision capabilities including camera capture, face detection, and coordinate transformation for the Coffee Buddy robot system.
44

55
## Overview
66

7-
The `coffee_vision` package provides an integrated computer vision solution that combines:
7+
The `coffee_vision` package provides a modular computer vision solution that combines:
88
- Camera capture and streaming
99
- Real-time face detection using OpenCV DNN
10-
- Interactive Qt-based GUI for camera control
1110
- Face position tracking and coordinate transformation
1211
- Multi-threaded performance optimization
1312
- Camera diagnostics and quality controls
13+
- Configurable parameters via ROS2 parameter system
1414

15-
This package contains a single, comprehensive `camera_node` that handles all computer vision tasks with an integrated GUI interface.
15+
This package features a modular architecture with separate components for face detection, coordinate transformation, and camera management, all coordinated by the main `camera_node`.
16+
17+
**Modular Benefits:**
18+
- **Testability**: Face detection and coordinate transformation can be tested independently
19+
- **Reusability**: Components can be used by other packages
20+
- **Maintainability**: Clear separation of concerns makes code easier to understand and modify
21+
- **Configurability**: ROS2 parameters allow runtime tuning without code changes
1622

1723
## Architecture
1824

1925
```
2026
┌─────────────────────────────────────────────────────────────────┐
2127
│ Camera Node │
2228
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
23-
│ │ Qt GUI │ │ FrameGrabber │ │ Face Detection │ │
24-
│ │ - Camera select │ │ - Multi-thread │ │ - OpenCV DNN │ │
25-
│ │ - Quality ctrl │ │ - Frame buffer │ │ - Face tracking │ │
26-
│ │ - Diagnostics │ │ - Rate control │ │ - Smoothing │ │
29+
│ │ FrameGrabber │ │ FaceDetector │ │ CoordinateUtils │ │
30+
│ │ - Multi-thread │ │ - OpenCV DNN │ │ - Transform │ │
31+
│ │ - Frame buffer │ │ - Face tracking │ │ - Eye coords │ │
32+
│ │ - Rate control │ │ - Smoothing │ │ - Mapping │ │
33+
│ │ - ROS publish │ │ - Visualization │ │ - Validation │ │
2734
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
35+
│ │ │ │ │
36+
│ └─────────────────────┼─────────────────────┘ │
37+
│ │ │
2838
└─────────────────────────────────────────────────────────────────┘
29-
30-
31-
ROS2 Publishers:
32-
• /camera_frame (sensor_msgs/Image)
33-
• /face_detection_data (std_msgs/String)
34-
• /vision/face_position (geometry_msgs/Point)
35-
• /vision/face_position_v2 (std_msgs/String)
36-
• /face_images (sensor_msgs/Image)
39+
40+
41+
ROS2 Publishers:
42+
• /coffee_bot/camera/image_raw (sensor_msgs/Image)
43+
• /face_detection_data (std_msgs/String)
44+
• /vision/face_position (geometry_msgs/Point)
45+
• /vision/face_position_v2 (std_msgs/String)
46+
• /face_images (sensor_msgs/Image)
47+
48+
ROS2 Parameters:
49+
• face_confidence_threshold (float)
50+
• face_smoothing_factor (float)
51+
• eye_range (float)
52+
• eye_sensitivity (float)
53+
• invert_x/invert_y (bool)
3754
```
3855

3956
## Components
4057

4158
### Camera Node (`camera_node`)
4259

43-
A comprehensive ROS2 node that provides camera capture, face detection, and GUI control in a single integrated package.
60+
The main ROS2 node that coordinates camera capture, face detection, and coordinate transformation using a modular architecture.
4461

4562
**Key Features:**
46-
- **Multi-threaded Architecture**: Separate threads for capture, processing, publishing, and UI
47-
- **Interactive GUI**: Qt-based interface for camera selection and control
48-
- **Built-in Face Detection**: OpenCV DNN-based face detection with temporal smoothing
63+
- **Multi-threaded Architecture**: Separate threads for capture, processing, and publishing
64+
- **Modular Design**: Separate modules for face detection and coordinate transformation
65+
- **Configurable Parameters**: ROS2 parameters for runtime configuration
4966
- **Performance Optimization**: Adaptive frame rates and quality controls
50-
- **Eye Coordinate Transformation**: Converts face positions to robot eye coordinates
67+
- **ROS Interface**: Comprehensive ROS topic interface for external control
5168

5269
**Publishers:**
53-
- `/camera_frame` (sensor_msgs/Image): Raw camera frames
70+
- `/coffee_bot/camera/image_raw` (sensor_msgs/Image): Raw camera frames
5471
- `/face_detection_data` (std_msgs/String): JSON-formatted face detection results
5572
- `/vision/face_position` (geometry_msgs/Point): Eye-coordinate transformed face position
5673
- `/vision/face_position_v2` (std_msgs/String): Extended face position data
5774
- `/face_images` (sensor_msgs/Image): Extracted face image regions
5875

59-
**GUI Controls:**
60-
- Camera device selection and refresh
61-
- Quality toggle (480p/720p)
62-
- Face detection enable/disable
63-
- Camera diagnostics display
64-
- Real-time video preview with face overlays
76+
### Face Detector (`face_detection.py`)
77+
78+
Standalone face detection module using OpenCV DNN with temporal smoothing.
79+
80+
**Features:**
81+
- **OpenCV DNN Backend**: High-performance face detection
82+
- **Automatic Model Download**: Downloads required models on first run
83+
- **Temporal Smoothing**: Reduces detection flickering
84+
- **CUDA Support**: Automatic GPU acceleration when available
85+
- **Debug Visualization**: Face overlay rendering for debugging
86+
87+
**Key Methods:**
88+
- `detect_faces()`: Core face detection functionality
89+
- `smooth_detections()`: Temporal smoothing algorithm
90+
- `draw_debug_overlay()`: Visualization for debugging
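
The temporal smoothing listed above can be pictured as an exponential moving average over bounding boxes. The sketch below is illustrative only: `smooth_box` and the `(x, y, width, height)` box layout are hypothetical and are not taken from the actual `smooth_detections()` implementation, which this README does not show. It follows the documented convention that a higher `face_smoothing_factor` means more smoothing.

```python
from typing import Optional, Tuple

# Assumed box layout for this sketch: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def smooth_box(previous: Optional[Box], current: Box,
               smoothing_factor: float = 0.4) -> Box:
    """Blend the previous box with the current one (hypothetical helper).

    smoothing_factor = 0.0 returns the current detection unchanged;
    values closer to 1.0 keep more of the previous box.
    """
    if not 0.0 <= smoothing_factor <= 1.0:
        raise ValueError("smoothing_factor must be within 0.0-1.0")
    if previous is None:
        # First detection: nothing to blend with yet.
        return current
    return tuple(
        smoothing_factor * old + (1.0 - smoothing_factor) * new
        for old, new in zip(previous, current)
    )


if __name__ == "__main__":
    box = None
    for detection in [(100.0, 80.0, 60.0, 60.0), (110.0, 84.0, 60.0, 60.0)]:
        box = smooth_box(box, detection, smoothing_factor=0.4)
        print(box)
```

A single scalar factor keeps the behaviour easy to tune at runtime, at the cost of some lag when a face moves quickly.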
91+
92+
### Coordinate Utilities (`coordinate_utils.py`)
93+
94+
Pure utility functions for coordinate system transformations.
95+
96+
**Features:**
97+
- **Camera to Eye Coordinates**: Transform pixel coordinates to robot eye movement
98+
- **Configurable Sensitivity**: Adjustable sensitivity and inversion parameters
99+
- **Input Validation**: Robust parameter validation
100+
- **No Dependencies**: Pure math functions with no ROS dependencies
101+
102+
**Key Functions:**
103+
- `transform_camera_to_eye_coords()`: Main coordinate transformation
104+
- Input validation and range clamping
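
As a rough illustration of the transformation described above, the following sketch maps a face centre in pixels to eye coordinates. It is an assumption about how such a mapping could work, not the actual `transform_camera_to_eye_coords()` code: the function name `camera_to_eye_sketch`, its signature, and the choice to scale by `eye_sensitivity * eye_range` before clamping are all hypothetical.

```python
from typing import Tuple


def _clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def camera_to_eye_sketch(face_x: float, face_y: float,
                         frame_width: int, frame_height: int,
                         eye_range: float = 1.0,
                         eye_sensitivity: float = 1.5,
                         invert_x: bool = False,
                         invert_y: bool = False) -> Tuple[float, float]:
    """Map a pixel position to (-eye_range, +eye_range) eye coordinates."""
    if frame_width <= 0 or frame_height <= 0:
        raise ValueError("frame dimensions must be positive")
    if eye_range <= 0:
        raise ValueError("eye_range must be positive")

    # Normalise so the frame centre is 0 and the frame edges are -1 / +1.
    half_w, half_h = frame_width / 2.0, frame_height / 2.0
    norm_x = (face_x - half_w) / half_w
    norm_y = (face_y - half_h) / half_h

    if invert_x:
        norm_x = -norm_x
    if invert_y:
        norm_y = -norm_y

    # Apply sensitivity, then clamp to the allowed eye range.
    eye_x = _clamp(norm_x * eye_sensitivity * eye_range, -eye_range, eye_range)
    eye_y = _clamp(norm_y * eye_sensitivity * eye_range, -eye_range, eye_range)
    return eye_x, eye_y


if __name__ == "__main__":
    print(camera_to_eye_sketch(480, 240, 640, 480))
```

Because the function has no ROS imports, it can be unit-tested without a running ROS2 environment, which matches the "No Dependencies" goal stated above.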
65105

66106
## Installation
67107

@@ -90,36 +130,66 @@ source install/setup.bash
90130

91131
### Running the Camera Node
92132

93-
The camera node provides both command-line and GUI interfaces:
133+
The camera node can be run with default settings or custom parameters:
94134

95135
```bash
96-
# Run with GUI (default)
136+
# Run with default settings
97137
ros2 run coffee_vision camera_node
98138

139+
# Run with custom parameters
140+
ros2 run coffee_vision camera_node --ros-args \
141+
-p face_confidence_threshold:=0.7 \
142+
-p eye_range:=1.5 \
143+
-p eye_sensitivity:=2.0
144+
99145
# The node will automatically:
146+
# - Load configuration parameters
100147
# - Scan for available cameras
101-
# - Launch the Qt GUI
102148
# - Start publishing camera data
103149
# - Begin face detection
104150
```
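
When the node is started from a script rather than by hand, the same `--ros-args -p name:=value` syntax can be assembled programmatically. The helper below is a minimal sketch; `build_camera_node_command` is a hypothetical name and not part of the package. It only builds the argument list and does not start the node.

```python
import shlex
from typing import Dict, List, Union

ParamValue = Union[bool, int, float, str]


def build_camera_node_command(params: Dict[str, ParamValue]) -> List[str]:
    """Return the argv list for `ros2 run coffee_vision camera_node`."""
    command = ["ros2", "run", "coffee_vision", "camera_node"]
    if params:
        command.append("--ros-args")
        for name, value in params.items():
            # Booleans are written in lowercase on the command line.
            if isinstance(value, bool):
                text = "true" if value else "false"
            else:
                text = str(value)
            command += ["-p", f"{name}:={text}"]
    return command


if __name__ == "__main__":
    argv = build_camera_node_command({
        "face_confidence_threshold": 0.7,
        "eye_range": 1.5,
        "invert_x": True,
    })
    # Print a shell-safe version; pass `argv` to subprocess.run to launch.
    print(shlex.join(argv))
```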
105151

106-
### GUI Interface
152+
### Configuration Parameters
107153

108-
The camera node launches with an interactive GUI that provides:
154+
The node supports the following ROS2 parameters:
109155

110-
1. **Camera Selection**: Dropdown to choose between detected cameras
111-
2. **Quality Control**: Toggle between standard (640x480) and high quality (1280x720)
112-
3. **Face Detection**: Enable/disable real-time face detection
113-
4. **Diagnostics**: View system information and camera capabilities
114-
5. **Live Preview**: Real-time camera feed with face detection overlays
156+
```bash
157+
# Face detection parameters
158+
ros2 param set /camera_node face_confidence_threshold 0.7
159+
ros2 param set /camera_node face_smoothing_factor 0.6
160+
161+
# Eye movement parameters
162+
ros2 param set /camera_node eye_range 1.5
163+
ros2 param set /camera_node eye_sensitivity 2.0
164+
ros2 param set /camera_node invert_x true
165+
166+
# View current parameters
167+
ros2 param list /camera_node
168+
ros2 param get /camera_node face_confidence_threshold
169+
```
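
Before pushing values to the node, it can help to check them against the documented ranges. The sketch below is a standalone illustration: `DEFAULTS` copies the default values listed in this README's Configuration table, while `resolve_params` and the rule that `eye_range` must be positive are assumptions made for the example, not behaviour confirmed by the node.

```python
from typing import Any, Dict

# Defaults as documented in the README's parameter table.
DEFAULTS: Dict[str, Any] = {
    "face_confidence_threshold": 0.5,
    "face_smoothing_factor": 0.4,
    "eye_range": 1.0,
    "eye_sensitivity": 1.5,
    "invert_x": False,
    "invert_y": False,
}


def resolve_params(overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Merge overrides into the defaults and validate them (hypothetical)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")

    params = {**DEFAULTS, **overrides}

    # Both of these are documented as 0.0-1.0 values.
    for name in ("face_confidence_threshold", "face_smoothing_factor"):
        if not 0.0 <= params[name] <= 1.0:
            raise ValueError(f"{name} must be within 0.0-1.0")

    # Assumption for this sketch: a non-positive range is not meaningful.
    if params["eye_range"] <= 0:
        raise ValueError("eye_range must be positive")

    for name in ("invert_x", "invert_y"):
        if not isinstance(params[name], bool):
            raise TypeError(f"{name} must be a bool")

    return params


if __name__ == "__main__":
    print(resolve_params({"face_confidence_threshold": 0.7, "eye_range": 1.5}))
```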
170+
171+
### External Control
172+
173+
The camera node provides ROS topic interfaces for external control:
174+
175+
```bash
176+
# Camera control topics
177+
ros2 topic pub /coffee_bot/camera/cmd/select std_msgs/Int32 "data: 1"
178+
ros2 topic pub /coffee_bot/camera/cmd/quality std_msgs/Bool "data: true"
179+
ros2 topic pub /coffee_bot/camera/cmd/face_detection std_msgs/Bool "data: false"
180+
181+
# Status and diagnostics
182+
ros2 topic echo /coffee_bot/camera/status/info
183+
ros2 topic pub /coffee_bot/camera/cmd/diagnostics std_msgs/String "data: 'get'"
184+
```
115185

116186
### ROS2 Topics
117187

118188
Monitor the published data:
119189

120190
```bash
121191
# View camera frames
122-
ros2 topic echo /camera_frame
192+
ros2 topic echo /coffee_bot/camera/image_raw
123193

124194
# Monitor face detection results (JSON format)
125195
ros2 topic echo /face_detection_data
@@ -136,21 +206,34 @@ ros2 topic echo /face_images
136206

137207
## Configuration
138208

209+
### ROS2 Parameters
210+
211+
The node supports comprehensive configuration through ROS2 parameters:
212+
213+
| Parameter | Type | Default | Description |
214+
|-----------|------|---------|-------------|
215+
| `face_confidence_threshold` | float | 0.5 | Minimum confidence for face detection (0.0-1.0) |
216+
| `face_smoothing_factor` | float | 0.4 | Temporal smoothing factor (0.0-1.0, higher = more smoothing) |
217+
| `eye_range` | float | 1.0 | Maximum eye movement range (-eye_range to +eye_range) |
218+
| `eye_sensitivity` | float | 1.5 | Eye movement sensitivity multiplier |
219+
| `invert_x` | bool | false | Invert X axis for eye movement |
220+
| `invert_y` | bool | false | Invert Y axis for eye movement |
221+
139222
### Camera Settings
140223

141224
The node automatically detects and configures cameras with optimal settings:
142225
- **Resolution**: 640x480 (standard) or 1280x720 (high quality)
143226
- **Frame Rate**: 30 FPS (standard) or 24 FPS (high quality)
144227
- **Backend**: Automatically selects V4L2, GStreamer, or OpenCV backends
145228

146-
### Face Detection Parameters
229+
### Face Detection Configuration
147230

148-
Built-in face detection with the following characteristics:
231+
Built-in face detection with configurable parameters:
149232
- **Model**: OpenCV DNN face detector (auto-downloaded)
150-
- **Confidence Threshold**: 0.5 (configurable in code)
233+
- **Confidence Threshold**: Configurable via `face_confidence_threshold` parameter
151234
- **Detection Rate**: Adaptive (3-6 FPS) based on performance
152-
- **Smoothing**: Temporal smoothing to reduce detection jitter
153-
- **Coordinate System**: Transforms to robot eye coordinates (-1.0 to 1.0 range)
235+
- **Smoothing**: Configurable temporal smoothing via `face_smoothing_factor`
236+
- **Coordinate System**: Transforms to robot eye coordinates using configurable sensitivity
154237

155238
### Performance Optimization
156239

@@ -285,26 +368,31 @@ echo $DISPLAY
285368

286369
### Code Structure
287370

288-
- **FrameGrabber**: Core camera capture and processing logic
289-
- **CameraViewer**: Qt GUI interface and controls
290-
- **CameraNode**: ROS2 node wrapper and coordination
291-
- **Face Detection**: OpenCV DNN integration with smoothing
371+
- **CameraNode** (`camera_node.py`): Main ROS2 node with parameter management
372+
- **FrameGrabber** (`camera_node.py`): Core camera capture and processing logic
373+
- **FaceDetector** (`face_detection.py`): Standalone face detection module
374+
- **CoordinateUtils** (`coordinate_utils.py`): Pure coordinate transformation functions
292375

293376
### Extending Functionality
294377

295378
To add new features:
296-
1. **New Publishers**: Add to FrameGrabber class
297-
2. **GUI Controls**: Extend CameraViewer interface
298-
3. **Detection Models**: Update `init_face_detector()` method
299-
4. **Processing Pipeline**: Modify `_process_loop()` thread
379+
1. **New Publishers**: Add to FrameGrabber class in `camera_node.py`
380+
2. **Face Detection**: Extend FaceDetector class in `face_detection.py`
381+
3. **Coordinate Systems**: Add functions to `coordinate_utils.py`
382+
4. **Parameters**: Add new ROS2 parameters in CameraNode
383+
5. **Processing Pipeline**: Modify `_process_loop()` thread in FrameGrabber
300384

301385
### Performance Tuning
302386

303-
Key parameters for optimization:
304-
- `min_detection_interval`: Face detection frequency
305-
- `detection_skip_frames`: Frame skipping for performance
306-
- `smoothing_factor`: Temporal smoothing strength
307-
- `face_confidence_threshold`: Detection sensitivity
387+
Key configurable parameters for optimization:
388+
- `face_confidence_threshold`: Detection sensitivity (lower = more detections)
389+
- `face_smoothing_factor`: Temporal smoothing strength (higher = smoother)
390+
- `eye_sensitivity`: Eye movement responsiveness
391+
- `eye_range`: Maximum eye movement range
392+
393+
Built-in adaptive parameters:
394+
- `min_detection_interval`: Face detection frequency (adaptive)
395+
- `detection_skip_frames`: Frame skipping for performance (adaptive)
308396

309397
## Testing UI Separation
310398

@@ -351,17 +439,18 @@ The UI displays real-time performance metrics to evaluate ROS transport vs integ
351439
## Known Limitations
352440

353441
1. **Single Camera**: Currently supports one camera at a time
354-
2. **Qt Dependency**: Requires GUI environment (not headless friendly)
355-
3. **Memory Usage**: High memory consumption due to image processing
356-
4. **CPU Intensive**: Face detection requires significant computational resources
442+
2. **Memory Usage**: High memory consumption due to image processing
443+
3. **CPU Intensive**: Face detection requires significant computational resources
444+
4. **Synchronous Model Download**: Face detection models are downloaded synchronously at startup, which blocks the node until the download completes
357445

358446
## Future Improvements
359447

360448
Potential enhancements:
361-
- Headless mode without GUI
362449
- Multiple camera support
450+
- Asynchronous model downloading
363451
- Advanced face tracking algorithms
364452
- Custom face detection models
453+
- Camera reconnection and recovery
365454
- WebRTC streaming capabilities
366455

367456
## License
