Commit 1be392c

add face emotion recognition support
1 parent e9417cb commit 1be392c

11 files changed: +399 -1 lines changed

docs/doc/en/sidebar.yaml

Lines changed: 2 additions & 0 deletions
@@ -65,6 +65,8 @@ items:
       label: Face multi landmarks detection
     - file: vision/face_recognition.md
       label: Face recognition
+    - file: vision/face_emotion.md
+      label: Face emotion
     - file: vision/body_key_points.md
       label: Human critical point detection
     - file: vision/segmentation.md

docs/doc/en/vision/face_emotion.md

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
---
title: MaixCAM MaixPy Facial Expression Recognition, Gender, Mask, Age, and More
update:
  - date: 2025-01-10
    version: v1.0
    author: neucrack
    content: Added source code, documentation, and examples for facial emotion recognition.
---

## Introduction

The previous articles, [Facial Detection and Keypoint Detection](./face_detection.md) and [Facial Multi-Keypoint Detection], covered face detection, facial keypoint detection, and face recognition. This article covers recognizing facial emotions (expressions). It also outlines how to identify other characteristics, such as gender, mask-wearing, and age.

![](../../assets/face_emotion_happy.jpg) ![](../../assets/face_emotion_neutral.jpg)

Demonstration video on MaixCAM:
<video playsinline controls autoplay loop muted preload src="/static/video/maixcam_face_emotion.mp4" type="video/mp4">
Classifier Result video
</video>

> Video source: [oarriaga/face_classification](https://github.com/oarriaga/face_classification)

## Using Facial Emotion Recognition in MaixCAM MaixPy

MaixPy provides a default emotion recognition model with seven categories:
* angry
* disgust
* fear
* happy
* sad
* surprise
* neutral

Emotion recognition involves three steps:
1. Detect the face.
2. Crop the face into a standard format, as shown in the small image in the top-left corner of the screenshots above.
3. Classify the cropped face image with a simple classification model.

In MaixPy, the `yolov8-face` model first detects the positions of the face and eyes, and the cropped face is then classified. The code is shown below and is also available in the `examples` directory of [MaixPy](https://github.com/sipeed/maixpy):

```python
from maix import camera, display, image, nn, app

detect_conf_th = 0.5    # minimum confidence for a face detection
detect_iou_th = 0.45    # IoU threshold for non-maximum suppression
emotion_conf_th = 0.5   # scores at or above this are drawn in green
max_face_num = -1       # -1 means no limit on the number of faces
crop_scale = 1.2        # enlarge the face box by this factor before cropping

# detect face model
detector = nn.YOLOv8(model="/root/models/yolov8n_face.mud", dual_buff=False)
# landmarks detector, used here only for cropping images (no model is loaded)
landmarks_detector = nn.FaceLandmarks(model="")
# emotion classify model
classifier = nn.Classifier(model="/root/models/face_emotion.mud", dual_buff=False)

cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format())
disp = display.Display()

# for drawing result info: width of the longest label, in pixels
max_labels_length = 0
for label in classifier.labels:
    size = image.string_size(label)
    if size.width() > max_labels_length:
        max_labels_length = size.width()

max_score_length = cam.width() / 4

while not app.need_exit():
    img = cam.read()
    results = []
    objs = detector.detect(img, conf_th=detect_conf_th, iou_th=detect_iou_th, sort=1)
    count = 0
    idxes = []
    img_std_first: image.Image = None
    for i, obj in enumerate(objs):
        img_std = landmarks_detector.crop_image(img, obj.x, obj.y, obj.w, obj.h, obj.points,
                                                classifier.input_width(), classifier.input_height(), crop_scale)
        if img_std:
            img_std_gray = img_std.to_format(image.Format.FMT_GRAYSCALE)
            res = classifier.classify(img_std_gray, softmax=True)
            results.append(res)
            idxes.append(i)
            # keep the first successfully cropped face for the detail panel
            if img_std_first is None:
                img_std_first = img_std
            count += 1
            if max_face_num > 0 and count >= max_face_num:
                break
    for i, res in enumerate(results):
        # draw detailed info for the first face
        if i == 0:
            img.draw_image(0, 0, img_std_first)
            for j in range(len(classifier.labels)):
                idx = res[j][0]
                score = res[j][1]
                img.draw_string(0, img_std_first.height() + idx * 16, classifier.labels[idx], image.COLOR_WHITE)
                img.draw_rect(max_labels_length, int(img_std_first.height() + idx * 16), int(score * max_score_length), 8, image.COLOR_GREEN if score >= emotion_conf_th else image.COLOR_RED, -1)
                img.draw_string(int(max_labels_length + score * max_score_length + 2), int(img_std_first.height() + idx * 16), f"{score:.1f}", image.COLOR_RED)
        # draw the box and top label on every face
        color = image.COLOR_GREEN if res[0][1] >= emotion_conf_th else image.COLOR_RED
        obj = objs[idxes[i]]
        img.draw_rect(obj.x, obj.y, obj.w, obj.h, color, 1)
        img.draw_string(obj.x, obj.y, f"{classifier.labels[res[0][0]]}: {res[0][1]:.1f}", color)
    disp.show(img)
```

### Key Code

The core steps in the code are:
```python
objs = detector.detect(img, conf_th=detect_conf_th, iou_th=detect_iou_th, sort=1)
img_std = landmarks_detector.crop_image(...)
img_std_gray = img_std.to_format(image.Format.FMT_GRAYSCALE)
res = classifier.classify(img_std_gray, softmax=True)
```

These correspond to:
1. Detect the face.
2. Crop the face.
3. Classify the face image with the model. The model takes grayscale input, so the crop is converted to grayscale first.

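As a rough illustration of the classification step, the sketch below picks the winning emotion from a classification result. It assumes, based on how the example indexes `res[j][0]` and `res[j][1]`, that `res` is a list of `(class_index, score)` pairs. The `top_emotion` helper, the order of `LABELS`, and the sample scores are hypothetical and are not part of MaixPy.

```python
# Hypothetical label order; on the device, use classifier.labels instead.
LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def top_emotion(res, labels, conf_th=0.5):
    """Return (label, score) for the best class, or None if it is below conf_th.

    res is assumed to be a list of (class_index, score) pairs.
    """
    if not res:
        return None
    # max() avoids relying on the list already being sorted by score
    idx, score = max(res, key=lambda item: item[1])
    if score < conf_th:
        return None
    return labels[idx], score


# Made-up scores for illustration only
sample = [(3, 0.82), (6, 0.10), (4, 0.05)]
print(top_emotion(sample, LABELS))
```

Returning `None` for low-confidence results lets the caller decide whether to show nothing or fall back to a label such as "uncertain", instead of displaying a weak guess.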
## Improving Recognition Accuracy

The default MaixPy model classifies cropped face images and offers basic accuracy. It can be improved in several ways:
* **Using keypoints as model input:** Instead of cropped images, feed facial keypoints to the classifier. This removes background interference and makes the model easier to train, which in principle improves accuracy.
* **Enhancing datasets:** Increase the number and variety of samples.
* **Improving cropping techniques:** Use a more precise transformation to crop the face, such as the affine transformations commonly used in face recognition.

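To make the cropping suggestion more concrete, here is a minimal sketch of a similarity transform (a restricted affine transform) computed from two eye positions. It is not the `crop_image` implementation. The function names, the 64-pixel output size, and the target eye placement are assumptions chosen for illustration.

```python
import math


def eye_alignment_matrix(left_eye, right_eye, out_size=64, eye_y=0.4, eye_dist=0.4):
    """Build a 2x3 matrix that levels the eyes and places them at fixed positions.

    left_eye, right_eye: (x, y) pixel coordinates in the source image.
    out_size: side length of the square output crop (assumed value).
    eye_y, eye_dist: target eye height and eye spacing as fractions of out_size.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        raise ValueError("eye positions must differ")
    angle = math.atan2(dy, dx)              # tilt of the line through the eyes
    scale = (eye_dist * out_size) / dist    # resize so the eye spacing is fixed
    cos_a = math.cos(-angle) * scale        # rotate by -angle to level the eyes
    sin_a = math.sin(-angle) * scale
    # The midpoint between the eyes maps to (out_size / 2, eye_y * out_size)
    cx = (left_eye[0] + right_eye[0]) / 2
    cy = (left_eye[1] + right_eye[1]) / 2
    tx = out_size / 2 - (cos_a * cx - sin_a * cy)
    ty = eye_y * out_size - (sin_a * cx + cos_a * cy)
    return [[cos_a, -sin_a, tx], [sin_a, cos_a, ty]]


def apply_transform(m, point):
    """Map a source (x, y) point into output-crop coordinates."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


m = eye_alignment_matrix((100, 120), (140, 150))
print(apply_transform(m, (100, 120)), apply_transform(m, (140, 150)))
```

With this matrix both eyes land on the same row of the output crop regardless of head tilt, so the classifier sees faces in a consistent pose. Warping the pixels themselves requires an image library with an affine-warp function, for example OpenCV's `cv2.warpAffine` on a PC.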
## Training a Custom Classification Model

### Overview
1. **Define categories:** For example, the 7 emotions above, gender, or mask detection.
2. **Choose a model:** Lightweight classification models such as MobileNetV2 work well.
3. **Select a training platform:**
    * Use [MaixHub](https://maixhub.com) for online training (**recommended**).
    * Alternatively, train locally using PyTorch or TensorFlow.
4. **Collect data:** Modify the example code to save captured images, e.g., `img.save("/root/image0.jpg")`.
5. **Clean data:** Remove incorrect samples and organize the rest into one folder per category.
6. **Train:**
    * On MaixHub, which produces a model that is ready to deploy with MaixPy.
    * Locally, then convert the model to ONNX and then to MUD format for MaixPy.

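The data-cleaning step can be partly automated on a PC. The sketch below sorts image files into one folder per category. It assumes a hypothetical naming convention, `<label>_<anything>.jpg`, which the documentation does not prescribe, and the function name and paths are made up for illustration.

```python
import shutil
from pathlib import Path


def sort_into_label_folders(src_dir, dst_dir, labels):
    """Move '<label>_*.jpg' files from src_dir into dst_dir/<label>/.

    Returns a dict with the number of files moved per label.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    counts = {label: 0 for label in labels}
    for path in sorted(src.glob("*.jpg")):
        label = path.stem.split("_", 1)[0]
        if label not in counts:
            continue  # leave unknown files in place for manual review
        target = dst / label
        target.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target / path.name))
        counts[label] += 1
    return counts


# Example call (paths are placeholders):
# sort_into_label_folders("raw", "dataset", ["happy", "sad", "neutral"])
```

Files whose prefix does not match a known label are left untouched, so mislabeled or stray captures can be checked by hand before training.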
## Recognizing Other Facial Features (Gender, Mask, Age, etc.)

The same approach applies to features such as gender or mask-wearing: train a classification model on different data. For numerical outputs such as age, use a regression model instead. More advanced techniques can be found by searching online.
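
To illustrate how a regression target differs from a class label, the sketch below encodes an age as a single normalized number and scores predictions with mean squared error. This is one common setup, not something MaixPy prescribes. `MAX_AGE` and the function names are assumptions.

```python
MAX_AGE = 100.0  # assumed normalization constant, not taken from MaixPy


def encode_age(age_years):
    """Map an age in years to a training target in [0, 1]."""
    return min(max(age_years, 0.0), MAX_AGE) / MAX_AGE


def decode_age(output):
    """Map a model output in [0, 1] back to years, clamping out-of-range values."""
    return min(max(output, 0.0), 1.0) * MAX_AGE


def mse(preds, targets):
    """Mean squared error, a typical loss for regression models."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


print(encode_age(25), decode_age(0.25))
```

A classifier outputs one score per category, whereas a regression model of this kind outputs a single value, so the drawing code in the example would show a number instead of a label and score bar.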

docs/doc/zh/sidebar.yaml

Lines changed: 2 additions & 0 deletions
@@ -65,6 +65,8 @@ items:
       label: Face multi-keypoint detection
     - file: vision/face_recognition.md
       label: Face recognition
+    - file: vision/face_emotion.md
+      label: Facial expression and emotion recognition
     - file: vision/body_key_points.md
       label: Human body keypoint detection
     - file: vision/segmentation.md

docs/doc/zh/vision/face_emotion.md

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
---
title: MaixCAM MaixPy Facial Expression and Emotion Recognition, Plus Gender, Mask, Age, and More
update:
  - date: 2025-01-10
    version: v1.0
    author: neucrack
    content: Added source code, documentation, and examples for facial emotion recognition.
---

## Introduction

The previous articles, [Face Detection with a Few Keypoints](./face_detection.md) and [Face Multi-Keypoint Detection], described how to detect faces and keypoints and how to recognize faces. This article describes how to recognize facial emotions (expressions).
It also describes how to recognize other features, such as gender, whether a mask is worn, and age.

![](../../assets/face_emotion_happy.jpg) ![](../../assets/face_emotion_neutral.jpg)

Demonstration video on MaixCAM:
<video playsinline controls autoplay loop muted preload src="/static/video/maixcam_face_emotion.mp4" type="video/mp4">
Classifier Result video
</video>

> Video material from [oarriaga/face_classification](https://github.com/oarriaga/face_classification)

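## Using Facial Expression (Emotion) Recognition in MaixCAM MaixPy

The default emotion recognition model in MaixPy has 7 categories:
* angry
* disgust
* fear
* happy
* sad
* surprise
* neutral (relaxed, natural state)

Emotion recognition is split into several steps:
* Detect the face.
* Crop the face into a fairly standard face image, like the small image in the top-left corner of the screenshots above.
* Classify the small image with a simple classification model.

In MaixPy, the `yolov8-face` model first detects the positions of the face and eyes, and classification follows. The code is shown below, and the complete code can also be found in the `examples` directory of [MaixPy](https://github.com/sipeed/maixpy):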
```python
from maix import camera, display, image, nn, app

detect_conf_th = 0.5
detect_iou_th = 0.45
emotion_conf_th = 0.5
max_face_num = -1
crop_scale = 1.2

# detect face model
detector = nn.YOLOv8(model="/root/models/yolov8n_face.mud", dual_buff=False)
# we only use its crop function to cut the face out of the image, so no model is actually loaded
landmarks_detector = nn.FaceLandmarks(model="")
# emotion classify model
classifier = nn.Classifier(model="/root/models/face_emotion.mud", dual_buff=False)

cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format())
disp = display.Display()

# for drawing result info
max_labels_length = 0
for label in classifier.labels:
    size = image.string_size(label)
    if size.width() > max_labels_length:
        max_labels_length = size.width()

max_score_length = cam.width() / 4

while not app.need_exit():
    img = cam.read()
    results = []
    objs = detector.detect(img, conf_th=detect_conf_th, iou_th=detect_iou_th, sort=1)
    count = 0
    idxes = []
    img_std_first: image.Image = None
    for i, obj in enumerate(objs):
        img_std = landmarks_detector.crop_image(img, obj.x, obj.y, obj.w, obj.h, obj.points,
                                                classifier.input_width(), classifier.input_height(), crop_scale)
        if img_std:
            img_std_gray = img_std.to_format(image.Format.FMT_GRAYSCALE)
            res = classifier.classify(img_std_gray, softmax=True)
            results.append(res)
            idxes.append(i)
            # keep the first successfully cropped face for the detail panel
            if img_std_first is None:
                img_std_first = img_std
            count += 1
            if max_face_num > 0 and count >= max_face_num:
                break
    for i, res in enumerate(results):
        # draw detailed info for the first face
        if i == 0:
            img.draw_image(0, 0, img_std_first)
            for j in range(len(classifier.labels)):
                idx = res[j][0]
                score = res[j][1]
                img.draw_string(0, img_std_first.height() + idx * 16, classifier.labels[idx], image.COLOR_WHITE)
                img.draw_rect(max_labels_length, int(img_std_first.height() + idx * 16), int(score * max_score_length), 8, image.COLOR_GREEN if score >= emotion_conf_th else image.COLOR_RED, -1)
                img.draw_string(int(max_labels_length + score * max_score_length + 2), int(img_std_first.height() + idx * 16), f"{score:.1f}", image.COLOR_RED)
        # draw on all faces
        color = image.COLOR_GREEN if res[0][1] >= emotion_conf_th else image.COLOR_RED
        obj = objs[idxes[i]]
        img.draw_rect(obj.x, obj.y, obj.w, obj.h, color, 1)
        img.draw_string(obj.x, obj.y, f"{classifier.labels[res[0][0]]}: {res[0][1]:.1f}", color)
    disp.show(img)
```

As you can see, the core code is:
```python
objs = detector.detect(img, conf_th=detect_conf_th, iou_th=detect_iou_th, sort=1)
img_std = landmarks_detector.crop_image(...)
img_std_gray = img_std.to_format(image.Format.FMT_GRAYSCALE)
res = classifier.classify(img_std_gray, softmax=True)
```
These correspond to the steps described above:
* Find the face.
* Crop the face.
* Predict the category with the classification model. The model takes grayscale input, so the image is converted to grayscale first.

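## Improving Recognition Accuracy

MaixPy ships a default 7-class model that classifies from image input. To get higher accuracy, or a model better suited to your needs, you can optimize the model in the following ways:
* Use keypoints as the classifier input: instead of the small cropped image, you can feed the facial keypoints detected in the earlier articles into the classification model. This removes the influence of the background, makes the model easier to train, and in theory gives higher accuracy.
* Improve the dataset and increase the number of samples.
* Improve the cropping step: the crop here uses a fairly simple transform borrowed from `landmarks_detector.crop_image`, which rotates and crops the image based on the positions of the two eyes. You can also use a more precise transformation to move the face to a fixed position, such as the affine transformation used in face recognition.
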
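## Training a Custom Classification Model

This section only covers image input. For keypoint input, you will need to work out the details yourself.
Detailed steps:
* Define the categories: for example, the 7 categories above, or gender, whether a mask is worn, and so on.
* Choose a model: a very small classification model is usually enough. A model built from a few convolution layers works, as do ready-made models such as MobileNetV2. Choose according to your accuracy and run-time requirements. Trying MobileNet first is recommended. Once it works, experiment with others.
* Choose a training platform:
    * Use [MaixHub](https://maixhub.com) for online training by creating a classification project. This requires no environment setup or code, and generates a model with one click (**recommended**).
    * Alternatively, set up a local PyTorch or TensorFlow environment and search for a MobileNet classification training tutorial.
* Collect data: modify the code above into a collection program. For example, save both the camera frame `img` and the cropped standard face image `img_std` to the file system (using something like `img.save("/root/image0.jpg")`), then transfer them to a computer for later use.
    * Other datasets: you can also find data online. Ideally, run it through MaixPy once so that the standard face images are cropped out and saved.
* Clean the data: check for incorrect samples, tidy up, and put each category in its own folder.
* Train:
    * Upload the data to MaixHub for training: you get an archive containing the model files, already in a format MaixPy supports.
    * Offline training: after training, convert the model to ONNX format, then follow [Converting a Model to a MUD File](../ai_model_converter/maixcam.md) to convert it. Setting up the environment can be tedious.
* Run: replace the classification model in the example.
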
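For the data collection step, the sketch below generates non-colliding file names for the full frame and the cropped face of each sample. The folder layout, the `full_`/`face_` prefixes, and both function names are assumptions for illustration. Only the `img.save(...)` call pattern comes from the text above.

```python
from pathlib import Path


def next_index(root, label):
    """Return the next unused sample number in root/<label>/ (0 if the folder is empty)."""
    folder = Path(root) / label
    if not folder.is_dir():
        return 0
    used = []
    for path in folder.glob("face_*.jpg"):
        number = path.stem.split("_")[1]
        if number.isdigit():
            used.append(int(number))
    return max(used, default=-1) + 1


def sample_paths(root, label, index):
    """Return (full_frame_path, face_crop_path) for one sample."""
    folder = Path(root) / label
    return folder / f"full_{index:05d}.jpg", folder / f"face_{index:05d}.jpg"


full_path, face_path = sample_paths("/root/dataset", "happy", 0)
print(full_path, face_path)
```

On the device, the collection loop would create the folder once and then call something like `img.save(str(full_path))` and `img_std.save(str(face_path))` for each sample. Saving directly into per-category folders also reduces the sorting work in the data-cleaning step.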
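## Recognizing Other Facial Features, Such as Gender, Mask-Wearing, and Age

As described above, the principle is the same as for emotion recognition: use a classification model trained on different data. Recognizing a numerical value such as age requires a regression model. You can search online to learn more.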

docs/pages/index/README.md

Lines changed: 18 additions & 0 deletions
@@ -379,6 +379,15 @@ MaixVision
         <div>
         </div>
     </div>
+    <div class="feature_item">
+        <div class="img_video">
+            <video playsinline controls autoplay loop muted preload src="/static/video/maixcam_face_landmarks.mp4"></video>
+            <p class="feature">AI Face Landmarks</p>
+            <p class="description">Detect face landmarks, recognize facial features and actions, AI face swapping</p>
+        </div>
+        <div>
+        </div>
+    </div>
     <div class="feature_item">
         <div class="img_video">
             <img src="/static/image/body_keypoint.jpg">
@@ -388,6 +397,15 @@ MaixVision
         <div>
         </div>
     </div>
+    <div class="feature_item">
+        <div class="img_video">
+            <video playsinline controls autoplay loop muted preload src="/static/video/hands_landmarks.mp4"></video>
+            <p class="feature">AI Hand Keypoints</p>
+            <p class="description">Detect hand keypoints, recognize gestures</p>
+        </div>
+        <div>
+        </div>
+    </div>
     <div class="feature_item">
         <div class="img_video">
             <img src="/static/image/self_learn_classifier.jpg">

docs/pages/index_en/README.md

Lines changed: 18 additions & 0 deletions
@@ -379,6 +379,15 @@ You can create new features using the rich API provided by MaixPy.
         <div>
         </div>
     </div>
+    <div class="feature_item">
+        <div class="img_video">
+            <video playsinline controls autoplay loop muted preload src="/static/video/maixcam_face_landmarks.mp4"></video>
+            <p class="feature">AI Face Landmarks</p>
+            <p class="description">Detect face landmarks, replace face</p>
+        </div>
+        <div>
+        </div>
+    </div>
     <div class="feature_item">
         <div class="img_video">
             <img src="/static/image/body_keypoint.jpg">
@@ -388,6 +397,15 @@ You can create new features using the rich API provided by MaixPy.
         <div>
         </div>
     </div>
+    <div class="feature_item">
+        <div class="img_video">
+            <video playsinline controls autoplay loop muted preload src="/static/video/hands_landmarks.mp4"></video>
+            <p class="feature">AI Hand keypoints</p>
+            <p class="description">Detect hand keypoints and recognize gesture</p>
+        </div>
+        <div>
+        </div>
+    </div>
     <div class="feature_item">
         <div class="img_video">
             <img src="/static/image/self_learn_classifier.jpg">
1.24 MB
Binary file not shown.
