Commit 92f4c14 (parent 74ce0bd)
Author: Thong Phan

samples: tflite-micro: micro_speech: add README

Micro-speech application on the i.MX8MP HiFi4 DSP with OpenAMP communication.
The documentation includes:

- Audio contract (16 kHz, S16_LE, 20 ms frames)
- Build and usage instructions
- Integration with a Linux userspace application
- Sample output and known limitations

The sample detects speech commands (yes/no/silence/unknown) using a 20 KB
neural network, processing audio data sent via RPMsg from the Cortex-A cores.
Tested on the i.MX8MP EVK with the HiFi4 DSP as remote processor.

Signed-off-by: Thong Phan <[email protected]>

.. zephyr:code-sample:: tflite-micro-speech-openamp
   :name: Micro Speech OpenAMP

   Recognize speech commands from audio input received on the Cortex-A cores and
   processed on the HiFi4 DSP of the i.MX8M Plus EVK board using TensorFlow Lite
   for Microcontrollers with a 20 KB neural network.

Overview
********

This sample requires an application running on the Cortex-A cores of the i.MX8M Plus
to capture audio and send it to the HiFi4 DSP using OpenAMP. The DSP processes
the audio data and performs inference using TensorFlow Lite Micro, detecting
two speech commands ("yes" and "no") as well as "silence" and "unknown".

.. code-block:: text

   +------------------------- Cortex A (main core) -------------+                            +--------------- HiFi4 DSP (remote core) --------------+
   |                                                            |                            |                                                      |
   | [ALSA/arecord] -> [Linux userspace] -> [/dev/ttyRPMSG*]    |----------> [RPMsg] ------->| [ring/msgq] -> [frontend] -> [TFLM] -> [output]      |
   |                                                            |                            |                                                      |
   +------------------------------------------------------------+                            +------------------------------------------------------+

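
The DSP-side ``[ring/msgq]`` stage in the diagram can be pictured as a byte ring
buffer that accumulates incoming RPMsg chunks of arbitrary size until a full
640-byte frame is available for the feature frontend. The following is only an
illustrative sketch under assumed names and a 4-frame capacity, not the sample's
actual implementation:

.. code-block:: c

   /* Illustrative sketch of the DSP-side buffering stage: RPMsg chunks of
    * arbitrary size accumulate until a full 640-byte (20 ms) frame is ready.
    * Names and the 4-frame capacity are assumptions, not the sample's code. */
   #include <stdio.h>

   #define FRAME_BYTES 640
   #define RING_BYTES  (4 * FRAME_BYTES)

   static unsigned char ring[RING_BYTES];
   static size_t head, tail;                /* head: write index, tail: read index */

   static size_t ring_used(void) { return head - tail; }

   /* Called with each chunk received from the RPMsg endpoint. */
   static int ring_push(const unsigned char *data, size_t len)
   {
       if (ring_used() + len > RING_BYTES) {
           return -1;                       /* overflow: drop the chunk */
       }
       for (size_t i = 0; i < len; i++) {
           ring[(head + i) % RING_BYTES] = data[i];
       }
       head += len;
       return 0;
   }

   /* Called from the audio processing thread; returns 1 when a frame is ready. */
   static int ring_pop_frame(unsigned char *frame)
   {
       if (ring_used() < FRAME_BYTES) {
           return 0;
       }
       for (size_t i = 0; i < FRAME_BYTES; i++) {
           frame[i] = ring[(tail + i) % RING_BYTES];
       }
       tail += FRAME_BYTES;
       return 1;
   }

   int main(void)
   {
       unsigned char chunk[100] = {0};
       unsigned char frame[FRAME_BYTES];

       for (int i = 0; i < 7; i++) {        /* 700 bytes arrive in 100-byte chunks */
           ring_push(chunk, sizeof(chunk));
       }
       /* one full frame is now available; 60 bytes remain buffered */
       printf("frame ready: %d, leftover: %zu\n", ring_pop_frame(frame), ring_used());
       return 0;
   }
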
.. note::

   This README and sample have been modified from
   `the TensorFlow Hello World sample`_,
   `the OpenAMP using resource table from Zephyr`_ and
   `the Micro Speech Example from TensorFlow Lite for Microcontrollers`_.

.. _the TensorFlow Hello World sample:
   https://github.com/tensorflow/tflite-micro-arduino-examples/tree/main/examples/hello_world

.. _the OpenAMP using resource table from Zephyr:
   https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/subsys/ipc/openamp_rsc_table

.. _the Micro Speech Example from TensorFlow Lite for Microcontrollers:
   https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech

Audio contract
--------------

- Sample rate: 16 kHz
- Sample format: S16_LE
- Frame size (samples per RPMsg payload): 20 ms (320 samples or 640 bytes)
- Endianness: little-endian

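
The numbers in the contract are mutually consistent: 16 kHz over 20 ms gives
320 samples, and two bytes per S16_LE sample give 640 bytes per RPMsg payload.
A quick sanity check in C (macro and function names are illustrative, not taken
from the sample):

.. code-block:: c

   /* Frame-size arithmetic implied by the audio contract above.
    * Macro and function names are illustrative only. */
   #include <stdio.h>

   #define SAMPLE_RATE_HZ   16000
   #define FRAME_MS         20
   #define BYTES_PER_SAMPLE 2              /* S16_LE: 16-bit little-endian */

   static int frame_samples(void) { return SAMPLE_RATE_HZ * FRAME_MS / 1000; }
   static int frame_bytes(void)   { return frame_samples() * BYTES_PER_SAMPLE; }

   int main(void)
   {
       /* prints: 320 samples, 640 bytes per frame */
       printf("%d samples, %d bytes per frame\n", frame_samples(), frame_bytes());
       return 0;
   }
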
Compatibility
-------------

- Validated platform: i.MX8MP with the HiFi4 DSP core.
- Porting: the sample can work on other boards, but this requires creating a new
  board configuration and updating the DTS overlays to match the target hardware.

Building and Running
********************

West Module Filters
-------------------

This sample requires the tflite-micro module. Add it to your West manifest and
pull it:

.. code-block:: console

   west config manifest.project-filter -- +tflite-micro
   west update

DSP Firmware
------------

The sample can be built for the :zephyr:board:`imx8mp_evk/mimx8ml8/adsp` board as follows:

.. zephyr-app-commands::
   :zephyr-app: samples/modules/tflite-micro/micro_speech
   :host-os: unix
   :board: imx8mp_evk/mimx8ml8/adsp
   :goals: run
   :compact:

Linux Application
-----------------

The Linux application is not part of the Zephyr repository. It can be found in
`this repository`_.

.. _this repository:
   https://github.com/thong-phn/linux-app

Sample Output
*************

Linux Application
-----------------

Simulation with a WAV file as input:

.. code-block:: console

   root@imx8mpevk:~# ./send default16.wav
   [L] Using TTY device: /dev/ttyRPMSG0
   [L] Expect audio frames: 500
   [L] Consumer: Consumer thread started
   [L] Producer: Producer thread started
   [L] Producer: End of file reached
   [L] Producer: Producer stopping
   [L] Consumer: EOF frame received, stopping
   [L] Consumer: EOF marker sent to Zephyr
   [L] Consumer: Consumer thread finished

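
The ``Expect audio frames: 500`` line follows from the audio contract: at
16 kHz, S16_LE, mono, one second of PCM is 32000 bytes, i.e. 50 frames of
640 bytes, so 500 frames correspond to a 10-second clip. A small check (the
helper name is illustrative, not from the sample):

.. code-block:: c

   /* Frame count for an N-second clip under the audio contract
    * (16 kHz, S16_LE, mono, 640-byte frames). Illustrative only. */
   #include <stdio.h>

   static long frames_for_duration(long seconds)
   {
       const long bytes_per_second = 16000L * 2;   /* rate * bytes per sample */
       return seconds * bytes_per_second / 640;    /* 640 bytes per 20 ms frame */
   }

   int main(void)
   {
       /* prints: Expect audio frames: 500 */
       printf("Expect audio frames: %ld\n", frames_for_duration(10));
       return 0;
   }
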
Real-time Recording:

.. code-block:: console

   root@imx8mpevk:~# ./record hw:5,0 /dev/ttyRPMSG0
   [L] Using PCM device: hw:5,0
   [L] Using TTY device: /dev/ttyRPMSG0
   [L] PCM device hw:5,0 configured for 16kHz, S16_LE, Mono
   [L] Consumer: Consumer thread started
   [L] Producer: Producer thread started
   ^C
   [L] Ctrl+C detected. Stopping..
   [L] Producer: Sending EOF to consumer
   [L] Producer: Producer stopping
   [L] Consumer: EOF frame received, stopping
   [L] Consumer: EOF marker sent to Zephyr
   [L] Consumer: Consumer thread finished
   [L] Application finished.

DSP Firmware
------------

.. code-block:: console

   [00:00:00.697,000] <inf> micro_speech_openamp: Starting Micro Speech OpenAMP application
   [00:00:01.231,000] <inf> micro_speech_openamp: Audio processing thread started
   [00:00:02.321,000] <inf> micro_speech_openamp: Audio processing thread started
   [00:00:03.591,000] <inf> model_runner: Initializing static interpreters
   [00:00:03.941,000] <inf> model_runner: Static interpreters initialized successfully
   [00:00:04.981,000] <inf> model_runner: Detected: yes
   [00:00:06.102,000] <inf> model_runner: Detected: no
   [00:00:07.202,000] <inf> model_runner: Detected: silence

Training
********

To train your own model for use in this sample, follow the instructions in
`this link`_.

.. _this link:
   https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech/train

Limitations
***********

The basic model performs inference on 1000 ms windows of audio.
As a result, there are some limitations:

#. If two commands are spoken within 1000 ms, the second command may not be detected.
#. If a command lasts longer than 1000 ms, it may be detected as two separate commands.

Potential solution: retrain the model with a smaller input window size.
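
Both limitations follow from the window arithmetic: one 1000 ms inference
window spans 50 of the 20 ms RPMsg frames, so any two events falling inside the
same 50-frame span are merged into a single decision. An illustrative check
(names are not from the sample):

.. code-block:: c

   /* A 1000 ms inference window spans 50 of the 20 ms RPMsg frames. */
   #include <stdio.h>

   static int frames_per_window(void) { return 1000 / 20; }

   int main(void)
   {
       /* prints: 50 frames per 1000 ms inference window */
       printf("%d frames per 1000 ms inference window\n", frames_per_window());
       return 0;
   }
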
