Commit de91d6d

feat(ai): add esp_dl human activity recognition example
1 parent ce1e2dd commit de91d6d

19 files changed

+1070
-0
lines changed

.gitlab/ci/build.yml

Lines changed: 8 additions & 0 deletions
@@ -59,6 +59,14 @@
       - IMAGE: espressif/idf:release-v5.2
       - IMAGE: espressif/idf:release-v5.3

+build_example_ai_esp_dl_human_activity_recognition:
+  extends:
+    - .build_examples_template
+    - .rules:build:example_ai_esp_dl_human_activity_recognition
+  variables:
+    IMAGE: espressif/idf:release-v5.3
+    EXAMPLE_DIR: examples/ai/esp_dl/human_activity_recognition
+
 build_example_audio_wav_player:
   extends:
     - .build_examples_template

.gitlab/ci/rules.yml

Lines changed: 13 additions & 0 deletions
@@ -388,6 +388,9 @@
   - "components/usb/usb_stream/include/usb_stream.h"
   - "components/zero_detection/include/zero_detection.h"
 # examples folder, in the alphabetic order
+.patterns-example_ai_esp_dl_human_activity_recognition: &patterns-example_ai_esp_dl_human_activity_recognition
+  - "examples/ai/esp_dl/human_activity_recognition/**/*"
+
 .patterns-example_audio_wav_player: &patterns-example_audio_wav_player
   - "examples/audio/**/*"

@@ -648,6 +651,16 @@
   if: '$BOT_LABEL_DEPLOY || $CI_MERGE_REQUEST_LABELS =~ /^(?:[^,\n\r]+,)*deploy(?:,[^,\n\r]+)*$/i'

 # rules for examples
+.rules:build:example_ai_esp_dl_human_activity_recognition:
+  rules:
+    - <<: *if-protected
+    - <<: *if-label-build
+    - <<: *if-trigger-job
+    - <<: *if-dev-push
+      changes: *patterns-build_system
+    - <<: *if-dev-push
+      changes: *patterns-example_ai_esp_dl_human_activity_recognition
+
 .rules:build:example_audio_wav_player:
   rules:
     - <<: *if-protected

examples/.build-rules.yml

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 # * A == 1 or (B == 2 and C in [1,2,3])
 # * (A == 1 and B == 2) or (C not in ["3", "4", 5])

+examples/ai/esp_dl/human_activity_recognition:
+  enable:
+    - if: IDF_TARGET in ["esp32s3","esp32p4"] and (IDF_VERSION_MAJOR == 5 and IDF_VERSION_MINOR == 3)
+
 examples/audio/wav_player:
   enable:
     - if: INCLUDE_DEFAULT == 1
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+# The following five lines of boilerplate have to be in your project's
+# CMakeLists in this exact order for cmake to work correctly
+cmake_minimum_required(VERSION 3.16)
+
+include($ENV{IDF_PATH}/tools/cmake/project.cmake)
+project(human_activity_recognition)
Lines changed: 300 additions & 0 deletions
@@ -0,0 +1,300 @@
# Human Activity Recognition Based on ESP-DL

This example implements a human activity recognition model with the [ESP-DL](https://github.com/espressif/esp-dl) framework, trained on the [Human Activity Recognition with Smartphones Dataset](https://www.kaggle.com/datasets/uciml/human-activity-recognition-with-smartphones). The dataset was collected from 30 volunteers (aged 19-48) performing six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) while carrying a waist-mounted smartphone (Samsung Galaxy S II). Data was recorded at 50 Hz using the device's accelerometer and gyroscope. The dataset was labeled manually from video recordings and is split into training (70%) and test (30%) sets.

Please note that the data in the [Human Activity Recognition with Smartphones Dataset](https://www.kaggle.com/datasets/uciml/human-activity-recognition-with-smartphones) is not raw sensor data, but preprocessed feature vectors of length 561. The class distribution in the training set is as follows:

![](https://dl.espressif.com/AE/esp-iot-solution/HAR_train_distribution.png)

In this example, we demonstrate how to train the model, quantize it, and deploy it to an ESP32-series chip (ESP32-S3 or ESP32-P4).

## How to deploy model

### Build and train model

This example uses [PyTorch](https://pytorch.org/) to build the model. The model architecture is:

```python
import torch.nn as nn


class HARModel(nn.Module):
    def __init__(self):
        super(HARModel, self).__init__()

        # Three fully connected layers: 561 features -> 256 -> 128 -> 6 classes
        self.model = nn.Sequential(
            nn.Linear(561, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, 6)
        )

    def forward(self, x):
        output = self.model(x)
        return output
```
> **Note:** In the model design, 561 represents the number of input features, while 6 denotes the number of output classes, corresponding to the different activity categories in the human activity recognition task.

Next, the data needs to be preprocessed:

```python
from torch.utils.data import DataLoader, TensorDataset

# X_train, y_train, X_test, y_test are assumed to be tensors loaded from the dataset.
# Standardize each feature with the training-set statistics, and reuse the
# same statistics for the test set.
mean = X_train.mean(dim=0, keepdim=True)
std = X_train.std(dim=0, keepdim=True)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std

train_dataset = TensorDataset(X_train, y_train)
test_dataset = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
```
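
The standardization step can also be illustrated without PyTorch. The following is a minimal sketch on a tiny made-up feature matrix; the helper names `fit_standardizer` and `apply_standardizer` are hypothetical and not part of this example's code. It shows that the mean and standard deviation are computed from the training data only and then reused for the test data. The guard against a zero standard deviation is an extra safety measure that the preprocessing snippet does not include.

```python
import statistics


def fit_standardizer(rows):
    """Return per-feature (means, stds) computed from the training rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # statistics.stdev is the sample standard deviation (n - 1),
    # which matches the default behaviour of torch.Tensor.std.
    stds = [statistics.stdev(col) for col in columns]
    return means, stds


def apply_standardizer(rows, means, stds, eps=1e-12):
    """Standardize rows with previously fitted statistics."""
    return [
        [(x - m) / (s if s > eps else 1.0) for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Made-up data: 3 training samples and 1 test sample with 2 features each.
train_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test_rows = [[4.0, 20.0]]

means, stds = fit_standardizer(train_rows)
train_std = apply_standardizer(train_rows, means, stds)
test_std = apply_standardizer(test_rows, means, stds)

print("means:", means)
print("stds:", stds)
print("standardized test rows:", test_std)
```

The same `means` and `stds` would also have to be applied to any input fed to the deployed model, since the network only ever saw standardized features during training.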

After that, we train the model using the Adam optimizer and the cross-entropy loss function:

```python
import torch

# DEVICE is assumed to be a torch device string such as "cpu" or "cuda".
model = HARModel().to(DEVICE)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

num_epochs = 100

for epoch in range(num_epochs):
    # Training pass over the whole training set
    model.train()
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(DEVICE), labels.to(DEVICE)
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    avg_train_loss = running_loss / len(train_loader)
    print(f"Epoch [{epoch + 1}/{num_epochs}], Loss: {avg_train_loss:.4f}")

    # Evaluation pass over the test set
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for test_inputs, test_labels in test_loader:
            test_inputs, test_labels = test_inputs.to(DEVICE), test_labels.to(DEVICE)
            outputs = model(test_inputs)
            _, predicted = torch.max(outputs.data, 1)
            total += test_labels.size(0)
            correct += (predicted == test_labels).sum().item()

    accuracy = 100 * correct / total
    print(f"Accuracy of the model on the test data: {accuracy:.2f}%")
```
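
For reference, the per-sample quantity that `torch.nn.CrossEntropyLoss` averages over a batch is the negative log-softmax of the logit at the true class index. The sketch below reproduces that computation in plain Python for one made-up 6-class logit vector; the function name `cross_entropy` and the numbers are illustrative assumptions, not values from this example.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    # Subtract the maximum logit for numerical stability (log-sum-exp trick).
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


# Made-up logits for the six activity classes.
logits = [0.5, -1.0, 3.0, 0.0, -2.0, 1.0]

confident_loss = cross_entropy(logits, target=2)  # true class has the largest logit
wrong_loss = cross_entropy(logits, target=4)      # true class has the smallest logit

print(f"loss when the largest logit is correct:  {confident_loss:.4f}")
print(f"loss when the smallest logit is correct: {wrong_loss:.4f}")
```

The loss is small when the logit of the true class dominates and grows as that logit falls behind the others, which is what pushes the network towards the correct class during training.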

### Model Quantization and Deployment

To convert the model to the format required by ``esp-dl``, install the [esp-ppq](https://github.com/espressif/esp-ppq) conversion tool:

```shell
pip uninstall ppq
pip install git+https://github.com/espressif/esp-ppq.git
```

Next, use ``esp-ppq`` to quantize and convert the model. The following script can be used as a reference:

```python
import torch
from torch.utils.data import DataLoader, Dataset
# Import path for an esp-ppq version that installs as the ``ppq`` package.
from ppq.api import espdl_quantize_torch

BATCH_SIZE = 1
DEVICE = "cpu"
TARGET = "esp32p4"
NUM_OF_BITS = 8
input_shape = [561]
ESPDL_MODEL_PATH = "./p4/har.espdl"


class HARFeature(Dataset):
    """Calibration dataset that yields feature vectors without labels."""

    def __init__(self, features):
        self.features = features

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        return self.features[idx]


def collate_fn2(batch):
    return batch.to(DEVICE)


if __name__ == '__main__':
    # HARModel and load_and_preprocess_data are assumed to be defined
    # elsewhere (for example in the training script).
    x_test, y_test = load_and_preprocess_data(
        "../dataset/train.csv",
        "../dataset/test.csv")

    # One sample with a batch dimension, exported as test data
    test_input = x_test[0]
    test_input = test_input.unsqueeze(0)

    model = HARModel()
    model.load_state_dict(torch.load("final_model.pth", map_location="cpu"))
    model.eval()

    har_dataset = HARFeature(x_test)
    har_dataloader = DataLoader(har_dataset, batch_size=BATCH_SIZE, shuffle=False)

    quant_ppq_graph = espdl_quantize_torch(
        model=model,
        espdl_export_file=ESPDL_MODEL_PATH,
        calib_dataloader=har_dataloader,
        calib_steps=8,
        input_shape=[1] + input_shape,
        inputs=[test_input],
        target=TARGET,
        num_of_bits=NUM_OF_BITS,
        collate_fn=collate_fn2,
        device=DEVICE,
        error_report=True,
        skip_export=False,
        export_test_values=True,
        verbose=1,
        dispatching_override=None
    )
```

During the conversion process, we applied 8-bit quantization to reduce the model's storage requirements and improve inference speed.
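
As a rough illustration of the storage saving, the parameter count of the three `nn.Linear` layers can be worked out by hand. The sketch below assumes 4 bytes per parameter for float32 and 1 byte per parameter after 8-bit quantization. It ignores the metadata stored in the `.espdl` file and the possibility that biases are stored at a higher precision, so the real file size will differ.

```python
# Layer shapes taken from HARModel: (in_features, out_features).
LAYERS = [(561, 256), (256, 128), (128, 6)]


def count_parameters(layers):
    """Weights (in * out) plus biases (out) for each fully connected layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in layers)


n_params = count_parameters(LAYERS)
float32_bytes = n_params * 4  # assumption: 4 bytes per float32 parameter
int8_bytes = n_params * 1     # assumption: 1 byte per quantized parameter

print(f"parameters: {n_params}")
print(f"float32:    {float32_bytes / 1024:.1f} KiB")
print(f"int8:       {int8_bytes / 1024:.1f} KiB")
```

Under these assumptions the model has 177,542 parameters, so the quantized parameters take about a quarter of the space of the float32 ones.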

The output of the conversion process is as follows:

```shell
[INFO][ESPDL][2025-02-10 20:07:01]: Calibration dataset samples: 2947, len(Calibrate iter): 2947
[20:07:01] PPQ Quantize Simplify Pass Running ... Finished.
[20:07:01] PPQ Quantization Fusion Pass Running ... Finished.
[20:07:01] PPQ Parameter Quantization Pass Running ... Finished.
Calibration Progress(Phase 1): 100%|██████████| 8/8 [00:00<00:00, 615.37it/s]
Calibration Progress(Phase 2): 100%|██████████| 8/8 [00:00<00:00, 799.98it/s]
Finished.
[20:07:01] PPQ Passive Parameter Quantization Running ... Finished.
[20:07:01] PPQ Quantization Alignment Pass Running ... Finished.
[INFO][ESPDL][2025-02-10 20:07:01]: --------- Network Snapshot ---------
Num of Op: [5]
Num of Quantized Op: [5]
Num of Variable: [12]
Num of Quantized Var: [12]
------- Quantization Snapshot ------
Num of Quant Config: [16]
ACTIVATED: [7]
OVERLAPPED: [6]
PASSIVE: [3]

[INFO][ESPDL][2025-02-10 20:07:01]: Network Quantization Finished.
Analysing Graphwise Quantization Error(Phrase 1):: 100%|██████████| 8/8 [00:00<00:00, 235.29it/s]
Analysing Graphwise Quantization Error(Phrase 2):: 100%|██████████| 8/8 [00:00<00:00, 166.67it/s]
Analysing Layerwise quantization error:: 0%| | 0/3 [00:00<?, ?it/s]Layer | NOISE:SIGNAL POWER RATIO
/model/model.0/Gemm: | ████████████████████ | 0.012%
/model/model.2/Gemm: | ███████ | 0.009%
/model/model.4/Gemm: | | 0.007%
Analysing Layerwise quantization error:: 100%|██████████| 3/3 [00:00<00:00, 88.23it/s]
Layer | NOISE:SIGNAL POWER RATIO
/model/model.4/Gemm: | ████████████████████ | 0.003%
/model/model.0/Gemm: | █████████████████ | 0.003%
/model/model.2/Gemm: | | 0.000%

Process finished with exit code 0
```

Please note that esp-ppq supports adding test data when exporting the espdl model. You can place the data in a list and pass it to the ``inputs`` parameter of ``espdl_quantize_torch``. This is useful for troubleshooting model inference issues.

You can check the model's inference results in the exported info file:

```shell
test outputs value:
%11, shape: [1, 6], exponents: [-2],
value: array([-32, -41, 82, -54, -51, -51], dtype=int8)
```
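
To read this output, the int8 values can be converted back to approximate floating-point logits. The sketch below assumes that a value `q` with exponent `e` represents `q * 2**e`, and that the class indices follow the alphabetical order of the label names. Both are assumptions that should be checked against the training script. Under these assumptions the largest logit is at index 2, `STANDING`, which agrees with the ground-truth label this README gives for sample 0 of `x_test`.

```python
# int8 outputs and exponent copied from the exported info file.
quantized = [-32, -41, 82, -54, -51, -51]
exponent = -2

# Assumption: alphabetical label order, as a typical label encoder would assign.
LABELS = [
    "LAYING",
    "SITTING",
    "STANDING",
    "WALKING",
    "WALKING_DOWNSTAIRS",
    "WALKING_UPSTAIRS",
]


def dequantize(values, exp):
    """Assumed mapping: real value = integer value * 2**exp."""
    return [v * 2.0 ** exp for v in values]


logits = dequantize(quantized, exponent)
# Index of the largest logit is the predicted class.
predicted_index = max(range(len(logits)), key=logits.__getitem__)

print(logits)
print(predicted_index, LABELS[predicted_index])
```

The same argmax step is the post-processing needed on the device: the class with the largest output value is the prediction, and the exponent does not change which index is largest.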


### Load model in ESP Platform

You can load models from ``rodata``, ``partition``, or ``sdcard``. This example uses the ``partition`` method; for more details, please refer to [how_to_load_model](https://github.com/espressif/esp-dl/blob/master/docs/en/tutorials/how_to_load_model.rst).

Additionally, to get the expected results on the device, preprocess the input data in the same way as for inference on the PC, and post-process the model's output to obtain the predicted class. You can refer to the data preprocessing and post-processing steps in this example.

## Example output

In this example, we selected the samples at indices 0, 100, and 500 of ``x_test`` as test data for the ESP-DL model. The corresponding ground truth labels are ``STANDING``, ``WALKING``, and ``SITTING``, respectively.

```shell
I (25) boot: ESP-IDF v5.5-dev-1610-g9cabe79385 2nd stage bootloader
I (26) boot: compile time Feb 10 2025 16:00:45
I (26) boot: Multicore bootloader
I (29) boot: chip revision: v0.2
I (30) boot: efuse block revision: v0.1
I (34) qio_mode: Enabling default flash chip QIO
I (38) boot.esp32p4: SPI Speed : 80MHz
I (42) boot.esp32p4: SPI Mode : QIO
I (46) boot.esp32p4: SPI Flash Size : 16MB
I (50) boot: Enabling RNG early entropy source...
I (54) boot: Partition Table:
I (57) boot: ## Label Usage Type ST Offset Length
I (63) boot: 0 factory factory app 00 00 00010000 007d0000
I (70) boot: 1 model Unknown data 01 82 007e0000 007b7000
I (77) boot: End of partition table
I (80) esp_image: segment 0: paddr=00010020 vaddr=480f0020 size=16730h ( 91952) map
I (102) esp_image: segment 1: paddr=00026758 vaddr=30100000 size=00068h ( 104) load
I (104) esp_image: segment 2: paddr=000267c8 vaddr=4ff00000 size=09850h ( 38992) load
I (114) esp_image: segment 3: paddr=00030020 vaddr=48000020 size=e16e4h (923364) map
I (257) esp_image: segment 4: paddr=0011170c vaddr=4ff09850 size=094c4h ( 38084) load
I (266) esp_image: segment 5: paddr=0011abd8 vaddr=4ff12d80 size=0337ch ( 13180) load
I (274) boot: Loaded app from partition at offset 0x10000
I (274) boot: Disabling RNG early entropy source...
I (284) hex_psram: vendor id : 0x0d (AP)
I (284) hex_psram: Latency : 0x01 (Fixed)
I (285) hex_psram: DriveStr. : 0x00 (25 Ohm)
I (285) hex_psram: dev id : 0x03 (generation 4)
I (290) hex_psram: density : 0x07 (256 Mbit)
I (295) hex_psram: good-die : 0x06 (Pass)
I (299) hex_psram: SRF : 0x02 (Slow Refresh)
I (303) hex_psram: BurstType : 0x00 ( Wrap)
I (308) hex_psram: BurstLen : 0x03 (2048 Byte)
I (312) hex_psram: BitMode : 0x01 (X16 Mode)
I (316) hex_psram: Readlatency : 0x04 (14 cycles@Fixed)
I (321) hex_psram: DriveStrength: 0x00 (1/1)
I (326) MSPI DQS: tuning success, best phase id is 2
I (508) MSPI DQS: tuning success, best delayline id is 11
I esp_psram: Found 32MB PSRAM device
I esp_psram: Speed: 200MHz
I (515) mmu_psram: .rodata xip on psram
I (560) mmu_psram: .text xip on psram
I (561) hex_psram: psram CS IO is dedicated
I (561) cpu_start: Multicore app
I (1054) esp_psram: SPI SRAM memory test OK
I (1064) cpu_start: Pro cpu start user code
I (1064) cpu_start: cpu freq: 360000000 Hz
I (1064) app_init: Application information:
I (1064) app_init: Project name: human_activity_recognition
I (1070) app_init: App version: ca0d8a28-dirty
I (1074) app_init: Compile time: Feb 10 2025 20:12:38
I (1079) app_init: ELF file SHA256: 257da75e2...
I (1084) app_init: ESP-IDF: v5.5-dev-1610-g9cabe79385
I (1089) efuse_init: Min chip rev: v0.1
I (1093) efuse_init: Max chip rev: v1.99
I (1097) efuse_init: Chip rev: v0.2
I (1101) heap_init: Initializing. RAM available for dynamic allocation:
I (1108) heap_init: At 4FF192E0 len 00021CE0 (135 KiB): RAM
I (1113) heap_init: At 4FF3AFC0 len 00004BF0 (18 KiB): RAM
I (1118) heap_init: At 4FF40000 len 00040000 (256 KiB): RAM
I (1123) heap_init: At 30100068 len 00001F98 (7 KiB): TCM
I (1129) esp_psram: Adding pool of 31552K of PSRAM memory to heap allocator
I (1135) spi_flash: detected chip: generic
I (1139) spi_flash: flash io: qio
I (1143) main_task: Started on CPU0
I (1179) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (1179) main_task: Calling app_main()
I (1180) FbsLoader: The storage free size is 65536 KB
I (1184) FbsLoader: The partition size is 7900 KB
I (1189) dl::Model: model:main_graph, version:0

I (1193) dl::Model: /model/model.0/Gemm: Gemm
I (1203) dl::Model: /model/model.2/Gemm: Gemm
I (1204) dl::Model: /model/model.4/Gemm: Gemm
I (1206) MemoryManagerGreedy: Maximum memory size: 832

I (343) HAR: Test case 0: Predict result: STANDING
I (346) HAR: Test case 1: Predict result: WALKING
I (349) HAR: Test case 2: Predict result: SITTING
I (354) main_task: Returned from app_main()
```
