An Android application for Human-Robot Interaction research that provides comprehensive human behavior analysis through Pose Detection, Action Recognition, and Emotion Analysis using Google ML Kit and custom object detection models.
Real-time human pose estimation with exercise classification and repetition counting.
Supported Poses (7 types):
- `standing` - Standing upright position
- `sit` - Sitting position
- `lie` - Lying down position
- `pushup_up` - Push-up upper position
- `pushup_down` - Push-up lower position
- `squat_up` - Squat upper position
- `squat_down` - Squat lower position
Exercise Recognition:
- Push-ups: Automatic detection and repetition counting
- Squats: Automatic detection and repetition counting
- Real-time form analysis with confidence scoring
- Audio feedback on successful repetitions
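Repetition counting of this kind is typically implemented as a small state machine over the classified pose labels: a rep completes on the transition from the exercise's down pose back to its up pose. A minimal sketch of that logic (the class and method names here are illustrative, not the actual PoseClassifierProcessor API):

```java
// Illustrative sketch: count one push-up on the down -> up transition.
// Names are hypothetical, not the actual PoseClassifierProcessor API.
class RepetitionCounter {
    private final String downPose;   // e.g. "pushup_down"
    private final String upPose;     // e.g. "pushup_up"
    private boolean inDownPhase = false;
    private int reps = 0;

    RepetitionCounter(String downPose, String upPose) {
        this.downPose = downPose;
        this.upPose = upPose;
    }

    // Feed the latest pose classification; returns the updated rep count.
    int onPose(String pose, float confidence) {
        if (confidence < 0.5f) return reps;           // ignore low-confidence frames
        if (pose.equals(downPose)) {
            inDownPhase = true;                       // entered the lower position
        } else if (inDownPhase && pose.equals(upPose)) {
            inDownPhase = false;
            reps++;                                   // completed one full repetition
        }
        return reps;
    }
}
```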
Object detection-based recognition of daily activities with high-accuracy filtering.
Supported Actions (2 types):
- `reading` - Reading books or documents
- `drinking` - Drinking water or beverages
Recognition Features:
- Confidence threshold: 80% for reliable detection
- Sliding window processing (50 frames) for stability
- 5-second cooldown to prevent duplicate detections
- Background state handling with `nothing` classification
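Together, the confidence threshold, sliding window, and cooldown gate when an action event is allowed to fire. A minimal sketch of that gating (names are illustrative; the real values live in KETIDetectorConstants):

```java
// Illustrative sketch: gate action events by confidence and a 5-second
// cooldown. Names are hypothetical; the real logic lives in KETIDetector.
class ActionEventGate {
    private static final float CONFIDENCE_THRESHOLD = 0.8f;
    private static final long COOLDOWN_MS = 5_000;
    private long lastEventTimeMs = 0;

    // Returns true if this detection should be reported as a new action event.
    boolean shouldFire(String action, float confidence, long nowMs) {
        if ("nothing".equals(action)) return false;              // background state
        if (confidence < CONFIDENCE_THRESHOLD) return false;     // unreliable detection
        if (nowMs - lastEventTimeMs < COOLDOWN_MS) return false; // still cooling down
        lastEventTimeMs = nowMs;
        return true;
    }
}
```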
Real-time facial emotion recognition with continuous monitoring and event detection.
Emotion Detection:
- Continuous emotion state tracking
- Event-based emotion change alerts
- Integration with facial landmark detection
- Timestamp logging for emotion events
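Event-based alerts of this kind amount to reporting only transitions between emotion states, with a timestamp per transition. A minimal sketch (names are illustrative, not the actual FaceClassifierProcessor API):

```java
// Illustrative sketch: emit an event only when the emotion state changes,
// logging a timestamp for each transition. Names are hypothetical.
class EmotionEventTracker {
    private String lastEmotion = null;

    void onEmotion(String emotion) {
        if (!emotion.equals(lastEmotion)) {
            String timestamp = new java.text.SimpleDateFormat("HH:mm:ss")
                    .format(new java.util.Date());
            android.util.Log.d("EmotionEvent", emotion + " : " + timestamp);
            lastEmotion = emotion;
        }
    }
}
```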
```
Camera Input → Parallel Processing → HRI Analysis → Real-time Output
     ↓                 ↓                  ↓               ↓
Live Stream  → [Face Detection]   → [Emotion]   → UI Display
             → [Pose Detection]   → [Exercise]  → Event Triggers
             → [Object Detection] → [Action]    → Data Logging
```
```
├── LivePreviewActivity          # Main HRI interface
├── viewmanager/
│   ├── KETIDetector             # Central processing coordinator
│   ├── CameraSource             # Camera stream management
│   ├── GraphicOverlay           # Visual overlay system
│   └── KETIDetectorConstants    # System constants and modes
├── facedetector/
│   ├── FaceDetectorProcessor    # Face detection processing
│   └── FaceClassifierProcessor  # Emotion classification
├── posedetector/
│   ├── PoseDetectorProcessor    # Pose detection & analysis
│   └── PoseClassifierProcessor  # Exercise classification
└── objectdetector/
    └── ObjectDetectorProcessor  # Action recognition
```
- Camera Input: Real-time video stream capture
- Parallel ML Processing: Simultaneous face, pose, and object detection
- HRI Analysis: KETIDetector coordinates and processes all detection results
- Sliding Window Filtering: Stabilizes recognition results over time
- Event Generation: Triggers for emotion changes, exercise reps, and actions
- Output: Real-time UI updates and data logging
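A simplified sketch of how the three processors might fan in to a single coordinator (these interfaces are illustrative stand-ins, not the actual KETIDetector wiring):

```java
// Illustrative stand-ins for the fan-in from the three ML Kit processors
// into one coordinator; the real wiring lives in KETIDetector.
interface DetectionSink {
    void onFaceResult(String emotion, float confidence);
    void onPoseResult(String pose, float confidence);
    void onObjectResult(String action, float confidence);
}

class HriCoordinator implements DetectionSink {
    @Override public void onFaceResult(String emotion, float confidence) {
        // push into the emotion sliding window, then check for state changes
    }

    @Override public void onPoseResult(String pose, float confidence) {
        // push into the pose window, then update exercise repetition counters
    }

    @Override public void onObjectResult(String action, float confidence) {
        // push into the action window, then apply threshold + cooldown gating
    }
}
```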
- Android Studio Arctic Fox or later
- Android device with API 26+ (Android 8.0+)
- Camera permission for real-time detection
- Minimum 4GB RAM for optimal ML processing performance
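On API 23+ the camera permission must also be granted at runtime, not just declared in the manifest. A minimal request from inside an Activity (the request code is arbitrary):

```java
import android.Manifest;
import android.content.pm.PackageManager;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;

// Inside an Activity (e.g. LivePreviewActivity): request the CAMERA
// permission before opening the camera. The request code is arbitrary.
private static final int CAMERA_PERMISSION_REQUEST = 1;

private void ensureCameraPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA)
            != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(
                this,
                new String[] {Manifest.permission.CAMERA},
                CAMERA_PERMISSION_REQUEST);
    }
}
```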
1. Clone the repository

   ```bash
   git clone https://github.com/your-username/keti-hri-android.git
   cd keti-hri-android
   ```

2. Open in Android Studio
- Import the project
- Sync Gradle dependencies
- Build and run on device
```groovy
// ML Kit for core functionality
implementation 'com.google.mlkit:face-detection:16.1.5'
implementation 'com.google.mlkit:pose-detection:18.0.0-beta3'
implementation 'com.google.mlkit:pose-detection-accurate:18.0.0-beta3'
implementation 'com.google.mlkit:object-detection:17.0.0'
implementation 'com.google.mlkit:object-detection-custom:17.0.0'

// Camera framework
implementation "androidx.camera:camera-camera2:1.0.0-SNAPSHOT"
implementation "androidx.camera:camera-lifecycle:1.0.0-SNAPSHOT"
```

- Emotion Regular: Continuous emotion monitoring (1-second intervals)
- Emotion Event: Event-triggered emotion change alerts
- Pose Regular: General pose classification (standing, sitting, lying)
- Exercise Mode: Exercise recognition with repetition counting
- Human Detection: Icon changes color when human is detected
- Real-time Results: Live display of current pose, emotion, and action
- Timestamps: When each detection/event occurs
- Camera Toggle: Switch between front/rear camera
```
// Pose classification results
"standing : 0.95 confidence"
"sit : 0.87 confidence"

// Exercise repetition counting
"pushup_down : 5 reps"
"squat_down : 12 reps"

// Action detection with high confidence
"reading : 0.85 confidence"
"drinking : 0.92 confidence"

// Emotion states with timestamps
"happy : 14:30:25"
"surprised : 14:30:47"
```

```java
// Pose detection configuration
PoseDetectorOptions poseOptions = new PoseDetectorOptions.Builder()
        .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
        .build();

// Face detection for emotion analysis
FaceDetectorOptions faceOptions = new FaceDetectorOptions.Builder()
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST)
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        .build();

// Object detection for actions
ObjectDetectorOptions objectOptions = new ObjectDetectorOptions.Builder()
        .setDetectorMode(ObjectDetectorOptions.STREAM_MODE)
        .enableClassification()
        .build();
```

```java
// Sliding window sizes for stability
FACE_DETECTOR_SLIDING_WINDOW_SIZE = 50;
ACTION_DETECTOR_SLIDING_WINDOW_SIZE = 50;
POSE_DETECTOR_SLIDING_WINDOW_SIZE = 20;
// Confidence thresholds
ACTION_CONFIDENCE_THRESHOLD = 0.8f;
POSE_CONFIDENCE_THRESHOLD = 0.5f;
// Event cooldown timers
ACTION_EVENT_COOLDOWN = 5;  // seconds
```

- Android API 26+ (Android 8.0 and above)
- Target SDK: API 31 (Android 12)
- Architecture: ARM64, ARMv7
- Detection Latency: <100ms for real-time processing
- Frame Rate: 30 FPS camera input
- Memory Usage: Optimized for mobile devices with 4GB+ RAM
- Face Detection: ML Kit lightweight model
- Pose Detection: 33-point BlazePose model
- Action Recognition: Custom trained object detection model
- Training Data:
  - Pose samples: `pose_all.csv` (7 pose types)
  - Exercise samples: `exercise_pose.csv` (push-up/squat variations)
Central processing coordinator that handles:
- Multi-modal detection result integration
- Sliding window filtering for stability
- Event generation and timing control
- Real-time data processing and output
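The sliding-window filtering can be understood as a majority vote over the last N per-frame labels; a minimal sketch (the actual implementation inside KETIDetector may differ):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of sliding-window label smoothing: keep the last N
// per-frame labels and report the most frequent one.
class SlidingWindowFilter {
    private final int windowSize;
    private final Deque<String> window = new ArrayDeque<>();

    SlidingWindowFilter(int windowSize) { this.windowSize = windowSize; }

    // Add the newest per-frame label; returns the stabilized label.
    String add(String label) {
        window.addLast(label);
        if (window.size() > windowSize) window.removeFirst();

        Map<String, Integer> counts = new HashMap<>();
        for (String l : window) counts.merge(l, 1, Integer::sum);

        String best = label;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) { best = e.getKey(); bestCount = e.getValue(); }
        }
        return best;
    }
}
```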
- FaceDetectorProcessor: Emotion analysis from facial features
- PoseDetectorProcessor: Pose classification and exercise recognition
- ObjectDetectorProcessor: Action recognition from object detection
```java
// Pose exercise result structure
class PoseExercise {
    String exercise;   // Exercise type (pushup/squat)
    int repetition;    // Current repetition count
    String pose;       // Current pose classification
    float score;       // Confidence score
}
```

- Adjust confidence thresholds for different accuracy requirements
- Modify sliding window sizes for responsiveness vs. stability
- Add new pose classifications by updating CSV training data
- Extend action recognition with additional object classes
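For example, a more responsive (but noisier) action profile might lower the window size and cooldown. The values below are illustrative only, written in the same style as the KETIDetectorConstants defaults listed above:

```java
// Illustrative tuning only, not recommended defaults.
ACTION_DETECTOR_SLIDING_WINDOW_SIZE = 30;  // smaller window reacts faster
ACTION_CONFIDENCE_THRESHOLD = 0.7f;        // accept slightly weaker detections
ACTION_EVENT_COOLDOWN = 3;                 // shorter gap between action events
```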
```csv
# Format: image_name, pose_class, landmark_coordinates...
image1.jpg,standing,x1,y1,z1,x2,y2,z2,...
image2.jpg,sit,x1,y1,z1,x2,y2,z2,...
```

- Basic Poses: `standing`, `sit`, `lie`
- Exercise Poses: `pushup_up`, `pushup_down`, `squat_up`, `squat_down`
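A minimal sketch of loading rows in this format into labels plus landmark arrays (`PoseSampleReader` is a hypothetical name; the app's actual loader may differ):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (hypothetical class): read rows of the form
//   image_name,pose_class,x1,y1,z1,x2,y2,z2,...
class PoseSampleReader {
    final List<String> labels = new ArrayList<>();
    final List<float[]> landmarks = new ArrayList<>();

    void read(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(",");
                labels.add(fields[1]);                      // e.g. "standing"
                float[] coords = new float[fields.length - 2];
                for (int i = 2; i < fields.length; i++) {
                    coords[i - 2] = Float.parseFloat(fields[i].trim());
                }
                landmarks.add(coords);
            }
        }
    }
}
```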
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-feature`)
- Follow existing code style and architecture patterns
- Test thoroughly on physical devices
- Submit a pull request with detailed description
- Maintain real-time performance requirements
- Follow Android development best practices
- Ensure proper camera resource management
- Add appropriate error handling and logging
- Update training data when adding new recognition classes
This project is released under a custom license inspired by the MIT License. See LICENSE file for details.
Use of this code, whether commercial or non-commercial (including academic research, model training, product integration, and distribution), requires prior written permission from the author. Unauthorized usage will be treated as a license violation.
Taehyeon Kim, Ph.D.
Senior Researcher, Korea Electronics Technology Institute (KETI)
Email: [email protected] | Homepage
This work was supported by a Korea Evaluation Institute of Industrial Technology (KEIT) grant funded by the Korea government (MOTIE) (No. 20009760, Development of human friendly multipurpose service robot and new market creation by applying human robot interaction design).
Pose Detection โข Action Recognition โข Emotion Analysis
Developed by KETI for Advanced Human-Robot Interaction Research

