Commit b8489fb

Configure TFLite Micro Development Environment
1 parent a04a168 commit b8489fb

2 files changed: +283 additions, −0 deletions
Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
# Configure TFLite Micro Development Environment

Before developing TensorFlow Lite for Microcontrollers (TFLite Micro) applications on the openvela platform, the compilation environment and dependency libraries must be configured correctly. This section guides developers through source code confirmation, library dependency configuration, and memory strategy planning.

## I. Prerequisites

Before starting, please ensure that the following preparations have been completed:

- **Basic Environment**: Refer to the [Official Documentation](../quickstart/openvela_ubuntu_quick_start.md) to complete the deployment of the openvela basic development environment.
- **Source Code Confirmation**: The TFLite Micro source code has been integrated into the openvela code repository at the following path:

  - `apps/mlearning/tflite-micro/`

## II. Component and Dependency Library Support

TFLite Micro relies on specific mathematical and utility libraries to implement model parsing and operator acceleration. The openvela repository has pre-configured the following key components:

| **Component Name** | **Functional Description** | **Source Path** |
| :----------------- | :------------------------- | :-------------- |
| **FlatBuffers** | Library supporting the TFLite model serialization format; provides the necessary headers. | `apps/system/flatbuffers/` |
| **Gemmlowp** | Google's low-precision general matrix multiplication library, used for quantized operations. | `apps/math/gemmlowp/` |
| **Ruy** | TensorFlow's high-performance matrix multiplication backend, mainly optimizing fully connected layer operations. | `apps/math/ruy/` |
| **KissFFT** | Lightweight Fast Fourier Transform library, supporting fixed-point and floating-point operations. | `apps/math/kissfft/` |
| **CMSIS-NN** | Neural network kernel optimization library dedicated to ARM Cortex-M (optional). | `apps/mlearning/cmsis-nn/` |
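
For orientation, the sketch below shows how these dependencies typically surface in application code. The header set is illustrative rather than mandated by openvela; which headers an application actually needs depends on the interpreter and resolver variants it uses.

```C++
// Illustrative include set for a TFLite Micro application (assumed, not exhaustive).
#include "tensorflow/lite/micro/micro_interpreter.h"         // core interpreter
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h" // operator registration
#include "tensorflow/lite/schema/schema_generated.h"         // FlatBuffers-generated model schema
```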

## III. Compilation Configuration (Kconfig)

Enable the necessary library support through the `menuconfig` graphical interface to ensure successful compilation and optimize code size.

Launch the configuration menu:

```Bash
cmake --build cmake_out/goldfish-arm64-v8a-ap -t menuconfig
```

Please complete the configuration of the following four core modules in order:

### 1. Enable C++ Runtime Support

TFLite Micro is written against the C++11/14 standards, so LLVM libc++ support must be enabled.

- **Configuration Path**: `Library Routines` -> `C++ Library`
- **Action**: Select `LLVM libc++ C++ Standard Library`

```Plain
(Top) → Library Routines → C++ Library

( ) Toolchain C++ support
( ) Basic C++ support
(X) LLVM libc++ C++ Standard Library
```

### 2. Enable Math Acceleration Libraries

Enable matrix operation and signal processing libraries based on model requirements.

- **Configuration Path**: `Application Configuration` -> `Math Library Support`
- **Action**: Select `Gemmlowp`, `kissfft`, and `Ruy`

```Plain
(Top) → Application Configuration → Math Library Support

[*] Gemmlowp
[*] kissfft
[ ] LibTomMath MPI Math Library
[*] Ruy
```

### 3. Enable FlatBuffers Support

Enable the system-level FlatBuffers library to support model parsing.

- **Configuration Path**: `Application Configuration` -> `System Libraries and NSH Add-Ons`
- **Action**: Select `flatbuffers`

```Plain
(Top) → Application Configuration → System Libraries and NSH Add-Ons

[*] flatbuffers
```

### 4. Enable TFLite Micro Core

- **Configuration Path**: `Application Configuration` -> `Machine Learning Support`
- **Action**: Select `TFLiteMicro`. If ARM hardware acceleration is required, it is recommended to also select `CMSIS_NN Library`.

```Plain
(Top) → Application Configuration → Machine Learning Support

[ ] CMSIS_NN Library
[*] TFLiteMicro
[ ] Print tflite-micro's debug message
```

## IV. Memory Allocation Strategy

Embedded systems have limited memory resources. TFLite Micro requires a contiguous memory region (the Tensor Arena) to store input/output tensors and intermediate computation results.

### 1. Static Allocation (Recommended)

For production environments, static array allocation is recommended. This method eliminates the risk of memory fragmentation, and the memory footprint is known at compile time.

**Implementation Example**

```C++
#include <stdint.h>

// Define in the global scope of the application code.
// Note: the arena must be 16-byte aligned to meet SIMD instruction requirements.
#define TENSOR_ARENA_SIZE (100 * 1024)
static uint8_t tensor_arena[TENSOR_ARENA_SIZE] __attribute__((aligned(16)));
```
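
As a minimal sketch of how this arena is consumed, the snippet below wires it into a standard `MicroInterpreter`. The model symbol `g_model_data` and the operator choices are placeholders for illustration, not part of the openvela configuration:

```C++
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Placeholder: the .tflite flatbuffer compiled into the firmware image.
extern const unsigned char g_model_data[];

void SetupInterpreter() {
  const tflite::Model* model = tflite::GetModel(g_model_data);

  // Register only the operators the model actually uses to keep code size down.
  static tflite::MicroMutableOpResolver<2> resolver;
  resolver.AddFullyConnected();
  resolver.AddSoftmax();

  // Every tensor and scratch buffer is carved out of the static arena above.
  static tflite::MicroInterpreter interpreter(model, resolver, tensor_arena,
                                              TENSOR_ARENA_SIZE);
  interpreter.AllocateTensors();
}
```

If `AllocateTensors()` fails, the arena is too small for the model and `TENSOR_ARENA_SIZE` must be increased.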

### 2. Determine the Arena Size

To set `TENSOR_ARENA_SIZE` precisely and avoid waste or overflow, you can use `RecordingMicroInterpreter` to capture actual memory usage at runtime.

**Debugging Steps**:

1. Include the recorder header file.
2. Replace the standard `MicroInterpreter` with `RecordingMicroInterpreter`.
3. Run model inference once (`Invoke`).
4. Read the actual usage and add a safety margin (+1 KB is suggested).

```C++
#include "tensorflow/lite/micro/recording_micro_interpreter.h"

// 1. Create the recording allocator
auto* allocator = tflite::RecordingMicroAllocator::Create(tensor_arena, arena_size);

// 2. Instantiate the recording interpreter
tflite::RecordingMicroInterpreter interpreter(model, resolver, allocator);

// 3. Allocate tensors and execute inference
interpreter.AllocateTensors();
interpreter.Invoke();

// 4. Get memory statistics
size_t used = interpreter.arena_used_bytes();        // actual usage
interpreter.GetMicroAllocator().PrintAllocations();  // itemized details
size_t recommended = used + 1024;                    // reserve at least ~1 KB extra
```
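
The figure reported by `arena_used_bytes()` is specific to the model, the set of registered operators, and the TFLite Micro version in use, so it is worth re-running this measurement after any model update before trimming `TENSOR_ARENA_SIZE`.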
Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
# Configure TFLite Micro Development Environment

[ [English](../../en/edge_ai_dev/configure_tflite_micro_dev_env.md) | 简体中文 ]

Before developing TensorFlow Lite for Microcontrollers (TFLite Micro) applications on the openvela platform, the compilation environment and dependency libraries must be configured correctly. This section guides developers through source code confirmation, library dependency configuration, and memory strategy planning.

## I. Prerequisites

Before starting, please ensure that the following preparations have been completed:

- **Basic Environment**: Refer to the [Official Documentation](../quickstart/openvela_ubuntu_quick_start.md) to complete the deployment of the openvela basic development environment.
- **Source Code Confirmation**: The TFLite Micro source code has been integrated into the openvela code repository at the following path:

  - `apps/mlearning/tflite-micro/`

## II. Component and Dependency Library Support

TFLite Micro relies on specific mathematical and utility libraries to implement model parsing and operator acceleration. The openvela repository has pre-configured the following key components:

| **Component Name** | **Functional Description** | **Source Path** |
| :----------------- | :------------------------- | :-------------- |
| **FlatBuffers** | Library supporting the TFLite model serialization format; provides the necessary headers. | `apps/system/flatbuffers/` |
| **Gemmlowp** | Google's low-precision general matrix multiplication library, used for quantized operations. | `apps/math/gemmlowp/` |
| **Ruy** | TensorFlow's high-performance matrix multiplication backend, mainly optimizing fully connected layer operations. | `apps/math/ruy/` |
| **KissFFT** | Lightweight Fast Fourier Transform library, supporting fixed-point and floating-point operations. | `apps/math/kissfft/` |
| **CMSIS-NN** | Neural network kernel optimization library dedicated to ARM Cortex-M (optional). | `apps/mlearning/cmsis-nn/` |

## III. Compilation Configuration (Kconfig)

Enable the necessary library support through the `menuconfig` graphical interface to ensure successful compilation and optimize code size.

Launch the configuration menu:

```Bash
cmake --build cmake_out/goldfish-arm64-v8a-ap -t menuconfig
```

Please complete the configuration of the following four core modules in order:

### 1. Enable C++ Runtime Support

TFLite Micro is written against the C++11/14 standards, so LLVM libc++ support must be enabled.

- **Configuration Path**: `Library Routines` -> `C++ Library`
- **Action**: Select `LLVM libc++ C++ Standard Library`

```Plain
(Top) → Library Routines → C++ Library

( ) Toolchain C++ support
( ) Basic C++ support
(X) LLVM libc++ C++ Standard Library
```

### 2. Enable Math Acceleration Libraries

Enable matrix operation and signal processing libraries based on model requirements.

- **Configuration Path**: `Application Configuration` -> `Math Library Support`
- **Action**: Select `Gemmlowp`, `kissfft`, and `Ruy`

```Plain
(Top) → Application Configuration → Math Library Support

[*] Gemmlowp
[*] kissfft
[ ] LibTomMath MPI Math Library
[*] Ruy
```

### 3. Enable FlatBuffers Support

Enable the system-level FlatBuffers library to support model parsing.

- **Configuration Path**: `Application Configuration` -> `System Libraries and NSH Add-Ons`
- **Action**: Select `flatbuffers`

```Plain
(Top) → Application Configuration → System Libraries and NSH Add-Ons

[*] flatbuffers
```

### 4. Enable TFLite Micro Core

- **Configuration Path**: `Application Configuration` -> `Machine Learning Support`
- **Action**: Select `TFLiteMicro`. If ARM hardware acceleration is required, it is recommended to also select `CMSIS_NN Library`.

```Plain
(Top) → Application Configuration → Machine Learning Support

[ ] CMSIS_NN Library
[*] TFLiteMicro
[ ] Print tflite-micro's debug message
```

## IV. Memory Allocation Strategy

Embedded systems have limited memory resources. TFLite Micro requires a contiguous memory region (the Tensor Arena) to store input/output tensors and intermediate computation results.

### 1. Static Allocation (Recommended)

For production environments, static array allocation is recommended. This method carries no risk of memory fragmentation, and the memory footprint is known at compile time.

**Implementation Example**

```C++
#include <stdint.h>

// Define in the global scope of the application code.
// Note: the arena must be 16-byte aligned to meet SIMD instruction requirements.
#define TENSOR_ARENA_SIZE (100 * 1024)
static uint8_t tensor_arena[TENSOR_ARENA_SIZE] __attribute__((aligned(16)));
```

### 2. Determine the Arena Size

To set `TENSOR_ARENA_SIZE` precisely and avoid waste or overflow, you can use `RecordingMicroInterpreter` to capture actual memory usage at runtime.

**Debugging Steps**:

1. Include the recorder header file.
2. Replace the standard `MicroInterpreter` with `RecordingMicroInterpreter`.
3. Run model inference once (`Invoke`).
4. Read the actual usage and add a safety margin (+1 KB is suggested).

```C++
#include "tensorflow/lite/micro/recording_micro_interpreter.h"

// 1. Create the recording allocator
auto* allocator = tflite::RecordingMicroAllocator::Create(tensor_arena, arena_size);

// 2. Instantiate the recording interpreter
tflite::RecordingMicroInterpreter interpreter(model, resolver, allocator);

// 3. Allocate tensors and execute inference
interpreter.AllocateTensors();
interpreter.Invoke();

// 4. Get memory statistics
size_t used = interpreter.arena_used_bytes();        // actual usage
interpreter.GetMicroAllocator().PrintAllocations();  // itemized details
size_t recommended = used + 1024;                    // reserve at least ~1 KB extra
```
