Skip to content

safhac/py-feat

 
 

Repository files navigation

Py-Feat: Hybrid Kalman Tracking Edition 🚀

High-Performance Facial Analysis for Video
Forked from cosanlab/py-feat | Maintained by @safhac

This fork introduces a Hybrid Kalman Tracking Engine to py-feat, optimizing it for real-time video processing and low-resource environments (CPU/IoT). By decoupling facial detection (expensive) from tracking (cheap), this version runs 1.2x - 5x faster depending on the workload, with significantly reduced compute costs.

---

⚡ Key Features

  • Hybrid Tracking Architecture: Replaces the naive "detect-every-frame" loop with a smart Kalman Filter tracker.
  • Frame Skipping: Runs heavy face detection only every $N$ frames (configurable). Intermediate frames are predicted mathematically.
  • Adaptive Padding: Automatically adjusts crop regions to prevent "drift," ensuring landmarks remain accurate even when the detector is off.
  • CPU Optimized: Designed to run real-time facial analysis on standard CPUs, removing the hard dependency on expensive GPUs for video pipelines.

📊 Benchmarks

Workload: 50 Frames of Video | Hardware: Standard CPU (Simulated Load)

Strategy Detection Interval Time (s) FPS Speedup
Baseline (Original) 1 (Every Frame) 31.91s 1.57 -
Hybrid (This Fork) 5 (Every 5th Frame) 27.63s 1.81 +13.4%
Hybrid (Aggressive) 10 (Every 10th Frame) Est. ~2.50 +40%+

> Note: Speedup scales dramatically with heavier models (e.g., RetinaFace/img2pose). The heavier the detector, the more time you save.

---

🛠️ Installation

You can install this optimized version directly from GitHub:

Bash

pip install git+https://github.com/safhac/py-feat.git@main

Or clone and install in editable mode (recommended for development):

Bash

git clone https://github.com/safhac/py-feat.git
cd py-feat
pip install -e .

---

🚀 Usage

The new engine works seamlessly with the existing Detector API. Just add the detection_interval argument.

1. Basic Video Processing (The Speedup)

Run the heavy detector only once every 5 frames. The Kalman Filter handles the rest.

Python

from feat.detector import Detector

# Initialize with standard models (CPU friendly configuration)
detector = Detector(
face_model="FaceBoxes",
landmark_model="mobilefacenet",
au_model="rf",
emotion_model="resmasknet",
device="cpu"
)

# Process video with Hybrid Tracking enabled
# detection_interval=5 means we skip the heavy detector 80% of the time!
video_prediction = detector.detect_video(
"input_video.mp4",
detection_interval=5
)

video_prediction.to_csv("output.csv")

2. "Kinetic Context" Configuration (Low Latency)

For real-time applications (e.g., HRI, Digital Signage) where you need "Smoothness" over "Raw Accuracy":

Python

# Process a live stream or file
# Skip 9 frames (run detector at 3 FPS if video is 30 FPS)
# This significantly frees up CPU for Semantic Analysis
detector.detect_video(
"webcam_stream.mp4",
detection_interval=10
)

---

🧠 How It Works

The "Search vs. Track" Problem

Original py-feat treats video as a pile of photos. It runs detect_faces() on Frame 1, Frame 2, Frame 3... regardless of the fact that faces don't teleport. This is a massive waste of compute.

The Solution: Kalman Filters

We implemented a stateful tracker that remembers where the face was and predicts where it will be.

  1. Frame 0 (Search): We run the heavy AI Detector. We find a face at [100, 100]. We initialize a Tracker.
  2. Frame 1-4 (Track): We skip the AI. We ask the Kalman Filter: "Given the velocity, where is the face now?" It predicts [105, 102]. We crop and analyze that region.
  3. Frame 5 (Correct): We run the AI Detector again to correct any drift. The Tracker updates its velocity estimates.
  4. Repeat.

---

🤝 Contributing

This fork is actively maintained for high-performance use cases. If you have ideas for further optimization (e.g., Quantization, BlazeFace integration), please open an Issue!

Original Library: cosanlab/py-feat

About

Facial Expression Analysis Toolbox

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Python 100.0%