.. zephyr:code-sample:: tflite-micro-speech-openamp
   :name: Micro Speech OpenAMP

   Recognize speech commands from audio input received on the Cortex-A cores and
   processed on the HiFi4 DSP of the i.MX8M Plus EVK board, using TensorFlow Lite
   for Microcontrollers with a 20 kB neural network.

Overview
********

This sample requires an application running on the Cortex-A cores of the i.MX8M Plus
to capture audio and send it to the HiFi4 DSP over OpenAMP (RPMsg). The DSP
processes the audio data and runs inference with a TensorFlow Lite Micro model that
detects two speech commands ("yes" and "no"), as well as "silence" and "unknown".

.. code-block:: text

   +------------------- Cortex-A (main core) -------------------+                            +------------- HiFi4 DSP (remote core) --------------+
   |                                                            |                            |                                                    |
   | [ALSA/arecord] -> [Linux userspace] -> [/dev/ttyRPMSG*]    |----------> [RPMsg] ------->| [ring/msgq] -> [frontend] -> [TFLM] -> [output]    |
   |                                                            |                            |                                                    |
   +------------------------------------------------------------+                            +----------------------------------------------------+

.. note::

   This README and sample have been adapted from
   `the TensorFlow Hello World sample`_,
   `the OpenAMP using resource table from Zephyr`_ and
   `the Micro Speech Example from TensorFlow Lite for Microcontrollers`_.

.. _the TensorFlow Hello World sample:
   https://github.com/tensorflow/tflite-micro-arduino-examples/tree/main/examples/hello_world

.. _the OpenAMP using resource table from Zephyr:
   https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/subsys/ipc/openamp_rsc_table

.. _the Micro Speech Example from TensorFlow Lite for Microcontrollers:
   https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech

Audio contract
--------------

- Sample rate: 16 kHz
- Sample format: S16_LE (16-bit signed, little-endian)
- Frame size (samples per RPMsg payload): 20 ms (320 samples, 640 bytes)
- Endianness: little-endian

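The contract above can be sanity-checked numerically. The snippet below is an
illustrative sketch (the constants and the ``frames_from_pcm`` helper are not
part of the sample sources) showing how raw S16_LE PCM splits into RPMsg-sized
frames:

```python
# Sketch: derive the RPMsg frame size from the audio contract and split
# raw 16 kHz mono S16_LE PCM into 20 ms frames. Illustrative only.
SAMPLE_RATE_HZ = 16_000
FRAME_MS = 20
BYTES_PER_SAMPLE = 2  # S16_LE

SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000   # 320 samples
BYTES_PER_FRAME = SAMPLES_PER_FRAME * BYTES_PER_SAMPLE  # 640 bytes

def frames_from_pcm(pcm: bytes) -> list[bytes]:
    """Return complete 640-byte frames; a trailing partial frame is dropped."""
    return [pcm[i:i + BYTES_PER_FRAME]
            for i in range(0, len(pcm) - BYTES_PER_FRAME + 1, BYTES_PER_FRAME)]
```

One second of audio therefore yields 50 frames of 640 bytes each.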
Compatibility
-------------

- Validated platform: i.MX8M Plus EVK with the HiFi4 DSP core.
- Porting: other boards can be supported, but this requires creating a new board
  configuration and updating the DTS overlays to match the target hardware.

Building and Running
********************

West Module Filters
-------------------

This sample requires the ``tflite-micro`` module.

DSP Firmware
------------

Enable the ``tflite-micro`` module via a West project filter and pull it:

.. code-block:: console

   west config manifest.project-filter -- +tflite-micro
   west update

The sample can be built for the :zephyr:board:`imx8mp_evk/mimx8ml8/adsp` as follows:

.. zephyr-app-commands::
   :zephyr-app: samples/modules/tflite-micro/micro_speech
   :host-os: unix
   :board: imx8mp_evk/mimx8ml8/adsp
   :goals: run
   :compact:

Linux Application
-----------------

The Linux application is not part of the Zephyr repository. It can be found in
`this repository`_.

.. _this repository:
   https://github.com/thong-phn/linux-app

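For orientation, the core loop of such a sender can be approximated as below.
This is a hypothetical sketch, not the code from the linked repository; the
``send_pcm`` name is illustrative, and the device path and frame size follow
the audio contract above:

```python
# Hypothetical sketch of a Linux-side sender: stream a raw PCM file to the
# RPMsg tty in 640-byte (20 ms) frames. Not the actual linked application.
import sys

FRAME_BYTES = 640  # 20 ms of 16 kHz mono S16_LE audio

def send_pcm(pcm_path: str, tty_path: str = "/dev/ttyRPMSG0") -> int:
    """Write the file to the tty frame by frame; return the frame count."""
    sent = 0
    with open(pcm_path, "rb") as pcm, open(tty_path, "wb") as tty:
        while True:
            frame = pcm.read(FRAME_BYTES)
            if len(frame) < FRAME_BYTES:  # EOF or partial frame: stop
                break
            tty.write(frame)
            tty.flush()
            sent += 1
    return sent

if __name__ == "__main__" and len(sys.argv) > 1:
    n = send_pcm(sys.argv[1],
                 sys.argv[2] if len(sys.argv) > 2 else "/dev/ttyRPMSG0")
    print(f"sent {n} frames")
```

The real application additionally runs producer/consumer threads and signals
end-of-stream with an EOF marker, as the sample output below shows.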
Sample Output
*************

Linux Application
-----------------

Simulation with a WAV file as input:

.. code-block:: console

   root@imx8mpevk:~# ./send default16.wav
   [L] Using TTY device: /dev/ttyRPMSG0
   [L] Expect audio frames: 500
   [L] Consumer: Consumer thread started
   [L] Producer: Producer thread started
   [L] Producer: End of file reached
   [L] Producer: Producer stopping
   [L] Consumer: EOF frame received, stopping
   [L] Consumer: EOF marker sent to Zephyr
   [L] Consumer: Consumer thread finished

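If no suitable test file is at hand, a compatible one can be generated with
Python's standard ``wave`` module. This is a convenience sketch, not part of
the sample; the filename is illustrative:

```python
# Generate a WAV file of silence matching the audio contract:
# 16 kHz, mono, S16_LE. Filename is illustrative.
import wave

def write_test_wav(path: str, seconds: float = 1.0) -> None:
    n_frames = int(16_000 * seconds)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 2 bytes per sample -> S16_LE
        w.setframerate(16_000)   # 16 kHz
        w.writeframes(b"\x00\x00" * n_frames)  # silence

write_test_wav("default16.wav")
```

A real recording at the same rate and format works equally well.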
Real-time Recording:

.. code-block:: console

   root@imx8mpevk:~# ./record hw:5,0 /dev/ttyRPMSG0
   [L] Using PCM device: hw:5,0
   [L] Using TTY device: /dev/ttyRPMSG0
   [L] PCM device hw:5,0 configured for 16kHz, S16_LE, Mono
   [L] Consumer: Consumer thread started
   [L] Producer: Producer thread started
   ^C
   [L] Ctrl+C detected. Stopping..
   [L] Producer: Sending EOF to consumer
   [L] Producer: Producer stopping
   [L] Consumer: EOF frame received, stopping
   [L] Consumer: EOF marker sent to Zephyr
   [L] Consumer: Consumer thread finished
   [L] Application finished.

DSP Firmware
------------

.. code-block:: console

   [00:00:00.697,000] <inf> micro_speech_openamp: Starting Micro Speech OpenAMP application
   [00:00:01.231,000] <inf> micro_speech_openamp: Audio processing thread started
   [00:00:02.321,000] <inf> micro_speech_openamp: Audio processing thread started
   [00:00:03.591,000] <inf> model_runner: Initializing static interpreters
   [00:00:03.941,000] <inf> model_runner: Static interpreters initialized successfully
   [00:00:04.981,000] <inf> model_runner: Detected: yes
   [00:00:06.102,000] <inf> model_runner: Detected: no
   [00:00:07.202,000] <inf> model_runner: Detected: silence

Training
********

To train your own model for use in this sample, follow the instructions in
`this link`_.

.. _this link:
   https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech/train

Limitations
***********

The basic model runs inference over 1000 ms windows of audio.
As a result, there are some limitations:

#. If two commands are spoken within 1000 ms, the second command may not be detected.
#. If a command lasts longer than 1000 ms, it may be detected as two separate commands.

A potential mitigation is to retrain the model with a smaller input window.
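A rough sketch of why both failure modes occur, assuming non-overlapping
1000 ms inference windows over the 20 ms frame stream (the actual sample may
use a sliding window; the helper below is illustrative):

```python
# Illustrative only: with 20 ms frames and a 1000 ms inference window,
# each inference consumes 50 frames, so two commands inside the same
# 50-frame window are scored as a single input, while a command that
# straddles a window boundary is split across two inferences.
FRAME_MS = 20
WINDOW_MS = 1000
FRAMES_PER_WINDOW = WINDOW_MS // FRAME_MS  # 50 frames per inference

def window_index(frame_idx: int) -> int:
    """Which inference window a frame falls into (non-overlapping windows)."""
    return frame_idx // FRAMES_PER_WINDOW

# Two commands ~400 ms apart (frames 10 and 30) share one window:
collide = window_index(10) == window_index(30)
# A command spanning frames 45..55 crosses a window boundary:
split = window_index(45) != window_index(55)
```

Shrinking the window (after retraining) reduces both effects at the cost of
less temporal context per inference.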