Transform Apple Watch IMU data into realistic 3D hand animations using deep learning
This project converts IMU sensor data from Apple Watch into realistic 3D hand mesh animations using the MANO hand model. Our deep learning approach captures natural hand movements and finger articulation from wrist-worn inertial sensors.
- 📱 Data Collection: Apple Watch records accelerometer and gyroscope data
- 🧠 Neural Network: Enhanced LSTM processes 7 IMU features (acc_x, acc_y, acc_z, rot_w, rot_x, rot_y, rot_z)
- ✋ Hand Generation: Outputs MANO parameters for a realistic 3D hand mesh
- 🎬 Animation: Creates smooth hand animations with quantile-based spike removal
- Real-time inference from Apple Watch IMU data
- Per-finger accuracy with dedicated joint angle losses
- Quantile-based smoothing for natural hand movements
- Both hands supported (left/right hand detection)
- Professional video output with multiple camera angles
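The quantile-based spike removal mentioned above can be sketched as follows. This is an illustrative reimplementation only: the function name `quantile_spike_smoothing`, its defaults, and the window-median replacement strategy are assumptions, not the code shipped in `inference.py`.

```python
import numpy as np

def quantile_spike_smoothing(poses, spike_percentile=85, window_size=5):
    """Suppress pose spikes whose frame-to-frame jump exceeds a quantile threshold.

    poses: array of shape [frames, dims] (e.g. MANO pose parameters).
    """
    poses = poses.copy()
    # Magnitude of change between consecutive frames
    deltas = np.linalg.norm(np.diff(poses, axis=0), axis=1)
    threshold = np.percentile(deltas, spike_percentile)

    # Replace spiky frames with the median of a local window
    half = window_size // 2
    for i in np.where(deltas > threshold)[0] + 1:
        lo, hi = max(0, i - half), min(len(poses), i + half + 1)
        poses[i] = np.median(poses[lo:hi], axis=0)
    return poses
```

Because the threshold adapts to each sequence's own jump distribution, the same setting works across slow and fast gestures without hand-tuned absolute limits.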
Our model is trained on 2+ hours of hand gesture data collected from 30+ participants wearing Apple Watches, supporting generalization across different users and hand movements.
Download our pre-trained model: 📥 Pre-trained Weights
The model achieves state-of-the-art performance on IMU-to-hand pose estimation with enhanced finger articulation accuracy.
```bash
# Clone repository
git clone https://github.com/yourusername/ArcaneHand.git
cd ArcaneHand

# Install dependencies
pip install torch torchvision numpy pandas scipy matplotlib imageio tqdm
pip install smplx  # For MANO hand model

# Download MANO model (required)
# Get MANO_RIGHT.pkl from: https://mano.is.tue.mpg.de/
# Place in project root directory
```

Download `best_enhanced_finger_model.pth` and place it in the project root directory.
Your IMU data should be in CSV format with the following structure:
```csv
time,acc_x,acc_y,acc_z,rot_w,rot_x,rot_y,rot_z
0.000,-0.123,0.456,9.789,0.999,0.001,-0.002,0.003
0.020,-0.125,0.458,9.791,0.998,0.002,-0.001,0.004
0.040,-0.127,0.460,9.793,0.997,0.003,0.000,0.005
...
```

Required columns:

- `time`: Timestamp (optional, for reference)
- `acc_x`, `acc_y`, `acc_z`: Linear acceleration in m/s²
- `rot_w`, `rot_x`, `rot_y`, `rot_z`: Quaternion components (normalized)
```bash
python inference.py --input data/my_gesture.csv --output results/ --hand right
```

With all options:

```bash
python inference.py \
    --input data/my_gesture.csv \
    --output results/ \
    --hand right \
    --smooth \
    --video \
    --fps 30
```

```python
from inference import IMUToHandPredictor

# Initialize predictor
predictor = IMUToHandPredictor(
    model_path='best_enhanced_finger_model.pth',
    mano_path='MANO_RIGHT.pkl'
)

# Load and process your data
results = predictor.predict_from_csv(
    'data/my_gesture.csv',
    hand_type='right',
    apply_smoothing=True
)

# Generate video
predictor.create_video(
    results['poses'],
    'output/hand_animation.mp4',
    fps=30
)
```

```
ArcaneHand/
├── README.md                        # This file
├── inference.py                     # Main inference script
├── model_utils.py                   # Model utilities
├── requirements.txt                 # Dependencies
├── best_enhanced_finger_model.pth   # Pre-trained model (download)
├── MANO_RIGHT.pkl                   # MANO model (download)
├── data/                            # Your IMU data
│   ├── example_gesture.csv          # Example data format
│   └── my_gesture.csv               # Your data
├── results/                         # Output directory
│   ├── poses/                       # Generated poses
│   ├── meshes/                      # 3D meshes
│   └── videos/                      # Generated videos
└── examples/                        # Example scripts
    ├── basic_usage.py
    └── advanced_usage.py
```
Your CSV file must contain exactly these 7 columns (in any order):

| Column | Description | Units | Range |
|---|---|---|---|
| `acc_x` | X-axis acceleration | m/s² | -50 to +50 |
| `acc_y` | Y-axis acceleration | m/s² | -50 to +50 |
| `acc_z` | Z-axis acceleration | m/s² | -50 to +50 |
| `rot_w` | Quaternion W component | unitless | -1 to +1 |
| `rot_x` | Quaternion X component | unitless | -1 to +1 |
| `rot_y` | Quaternion Y component | unitless | -1 to +1 |
| `rot_z` | Quaternion Z component | unitless | -1 to +1 |
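A quick sanity check before running inference can catch format problems early. This helper is illustrative and not part of the shipped scripts; `validate_imu_csv` and its heuristics are assumptions:

```python
import pandas as pd

REQUIRED = ['acc_x', 'acc_y', 'acc_z', 'rot_w', 'rot_x', 'rot_y', 'rot_z']

def validate_imu_csv(path):
    """Check that a CSV has the 7 required IMU columns and plausible units."""
    df = pd.read_csv(path)
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    # Heuristic: if every acceleration is small, the data may still be in g units
    if df[['acc_x', 'acc_y', 'acc_z']].abs().values.max() < 20:
        print("Warning: accelerations look like g units; convert to m/s² first.")
    return df
```

Running it once per file before a long batch job is cheaper than debugging a garbled animation afterwards.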
```csv
time,acc_x,acc_y,acc_z,rot_w,rot_x,rot_y,rot_z
0.000,-0.123,0.456,9.789,0.999,0.001,-0.002,0.003
0.020,-0.125,0.458,9.791,0.998,0.002,-0.001,0.004
0.040,-0.127,0.460,9.793,0.997,0.003,0.000,0.005
0.060,-0.129,0.462,9.795,0.996,0.004,0.001,0.006
0.080,-0.131,0.464,9.797,0.995,0.005,0.002,0.007
```

If your data comes from Apple Watch or other sources, you may need to:
- Unit Conversion: Convert accelerometer readings from g-force to m/s² (multiply by 9.80665)
- Quaternion Normalization: Ensure quaternions have unit magnitude
- Sampling Rate: Resample to 30-50 Hz for best results
- Filtering: Apply gentle low-pass filter (optional)
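Resampling to a fixed rate can be sketched as below. This is an assumed approach (linear interpolation onto a uniform grid, followed by quaternion re-normalization), not the repo's own preprocessing; `resample_imu` is a hypothetical helper:

```python
import numpy as np
import pandas as pd

def resample_imu(df, target_hz=50):
    """Linearly interpolate IMU channels onto a uniform time grid."""
    t_old = df['time'].to_numpy()
    t_new = np.arange(t_old[0], t_old[-1], 1.0 / target_hz)

    out = {'time': t_new}
    for col in ['acc_x', 'acc_y', 'acc_z', 'rot_w', 'rot_x', 'rot_y', 'rot_z']:
        out[col] = np.interp(t_new, t_old, df[col].to_numpy())
    new_df = pd.DataFrame(out)

    # Interpolation breaks unit norm, so re-normalize the quaternions
    quat = ['rot_w', 'rot_x', 'rot_y', 'rot_z']
    mag = np.sqrt((new_df[quat] ** 2).sum(axis=1))
    new_df[quat] = new_df[quat].div(mag, axis=0)
    return new_df
```

Per-component linear interpolation is a reasonable shortcut at 30-50 Hz; for large rotations between samples, proper quaternion slerp would be more accurate.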
```python
# Example preprocessing
import pandas as pd
import numpy as np

def preprocess_apple_watch_data(csv_file):
    df = pd.read_csv(csv_file)

    # Convert g to m/s² (if needed)
    if df['acc_x'].abs().max() < 20:  # Likely in g units
        df['acc_x'] *= 9.80665
        df['acc_y'] *= 9.80665
        df['acc_z'] *= 9.80665

    # Normalize quaternions
    quat_cols = ['rot_w', 'rot_x', 'rot_y', 'rot_z']
    quat_mag = np.sqrt((df[quat_cols] ** 2).sum(axis=1))
    df[quat_cols] = df[quat_cols].div(quat_mag, axis=0)

    return df
```

The inference script generates multiple outputs:
- `results/poses/gesture_poses.npy`: NumPy array [frames, 48] containing MANO parameters
- `results/meshes/gesture_vertices.npy`: NumPy array [frames, 778, 3] containing 3D hand vertices
- `results/videos/gesture_animation.mp4`: MP4 video showing the 3D hand animation
- `results/processing_report.json`: Processing statistics, smoothing info, quality metrics
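For downstream use, the shapes follow the MANO convention: 48 pose values per frame (3 global orientation + 45 joint angles) and 778 mesh vertices. A minimal sketch of inspecting the arrays, with dummy data standing in for real outputs:

```python
import numpy as np

# Dummy arrays with the documented output shapes (illustrative only).
# After inference you would instead load the saved files, e.g.:
#   poses = np.load('results/poses/gesture_poses.npy')
#   vertices = np.load('results/meshes/gesture_vertices.npy')
poses = np.zeros((120, 48), dtype=np.float32)         # MANO pose parameters per frame
vertices = np.zeros((120, 778, 3), dtype=np.float32)  # MANO mesh vertices per frame

assert poses.shape[1] == 48            # 3 global orientation + 45 joint angles
assert vertices.shape[1:] == (778, 3)  # MANO mesh has 778 vertices

global_orient = poses[:, :3]   # per-frame wrist orientation (axis-angle)
finger_pose = poses[:, 3:]     # per-frame finger articulation
```
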
```
python inference.py [OPTIONS]

Required:
  --input PATH          Input CSV file with IMU data
  --output PATH         Output directory for results

Optional:
  --hand {left,right}   Hand type (default: right)
  --model PATH          Path to model file (default: best_enhanced_finger_model.pth)
  --mano PATH           Path to MANO model (default: MANO_RIGHT.pkl)
  --smooth              Apply quantile-based smoothing
  --video               Generate video output
  --fps INT             Video frame rate (default: 30)
  --format {mp4,gif}    Video format (default: mp4)
  --verbose             Verbose output
```

```python
from pathlib import Path
import glob

from inference import IMUToHandPredictor

predictor = IMUToHandPredictor()

# Process multiple files
csv_files = glob.glob('data/*.csv')
for csv_file in csv_files:
    results = predictor.predict_from_csv(csv_file)
    predictor.save_results(results, f'results/{Path(csv_file).stem}/')
```

```python
# Load with custom settings
predictor = IMUToHandPredictor(
    model_path='custom_model.pth',
    device='cuda',
    smoothing_config={
        'spike_percentile': 85,  # Remove top 15% spikes
        'window_size': 7,
        'min_threshold': 0.005
    }
)
```

- Sampling Rate: 30-50 Hz works best
- Sequence Length: 100-1000 frames optimal
- Data Quality: Clean data = better results
- Smoothing: Enable for noisy sensor data
- GPU: Use CUDA for faster processing
We welcome contributions! Please see our Contributing Guide.
```bibtex
@article{ArcaneHand_2025,
  title={Real-time Hand Pose Estimation from Apple Watch IMU Data},
  author={Faraz Rabbani},
  year={2025}
}
```

MIT License - see LICENSE file.
