
Frequently Asked Questions

How are design files and related scripts organized?

The directory structure of the Efinix TinyML repository is shown below:

├── docs
├── model_zoo
│   ├── deep_autoencoder_anomaly_detection
│   ├── ds_cnn_keyword_spotting
│   ├── mediapipe_face_landmark_detection
│   ├── mobilenetv1_person_detection
│   ├── resnet_image_classification
│   └── yolo_person_detection
├── quick_start
│   └── picam_v3
├── tinyml_hello_world
│   ├── Ti60F225_tinyml_hello_world
│   │    ├── embedded_sw
│   │    ├── ip
│   │    ├── ref_files
│   │    └── source
│   ├── Ti180J484_tinyml_hello_world
│   │    ├── embedded_sw
│   │    ├── ip
│   │    ├── ref_files
│   │    └── source
│   └── Ti375C529_tinyml_hello_world
│       ├── embedded_sw
│       ├── ip
│       ├── ref_files
│       └── source
├── tinyml_vision
│   ├── Ti60F225_mediapipe_face_landmark_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── ref_files
│   │   └── source
│   ├── Ti60F225_mobilenetv1_person_detect_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── ref_files
│   │   └── source
│   ├── Ti60F225_yolo_person_detect_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── ref_files
│   │   └── source
│   ├── Ti180J484_mediapipe_face_landmark_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── ref_files
│   │   └── source
│   ├── Ti180J484_mobilenetv1_person_detect_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── ref_files
│   │   └── source
│   ├── Ti180J484_yolo_person_detect_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── ref_files
│   │   └── source
│   └── Ti375C529_multicore_demo
│       ├── embedded_sw
│       ├── ip
│       ├── ref_files
│       └── source
└── tools
    └── tinyml_generator

For the TinyML Hello World design, the project structure is shown below:

├── tinyml_hello_world
│   ├── <device>_tinyml_hello_world
│   │    ├── embedded_sw
│   │    │   └── SapphireSoc
│   │    │       └── software
│   │    │           └── standalone
│   │    │               ├── common
│   │    │               ├── tinyml_fl
│   │    │               ├── tinyml_imgc
│   │    │               ├── tinyml_kws
│   │    │               ├── tinyml_pdti8
│   │    │               ├── tinyml_ypd
│   │    │               └── tinyml_ad
│   │    ├── ip
│   │    ├── ref_files
│   │    │   ├── bootloader_16MB
│   │    │   └── user_def_accelerator
│   │    └── source
│   │        ├── axi
│   │        ├── common
│   │        ├── hw_accel
│   │        └── tinyml

For the TinyML Vision design, the project structure is shown below:

├── tinyml_vision
│   ├── <device>_<architecture>_<application>_demo
│   │   ├── embedded_sw
│   │   │   └── SapphireSoc
│   │   │       └── software
│   │   │           └── standalone
│   │   │               ├── common
│   │   │               └── evsoc_tinyml_<application_alias>
│   │   ├── ip
│   │   ├── ref_files
│   │   │   └── bootloader_16MB
│   │   └── source
│   │       ├── axi
│   │       ├── cam
│   │       ├── common
│   │       ├── display
│   │       ├── hw_accel
│   │       └── tinyml

Note: Source files for the Efinix soft IPs must be generated using the IP Manager in the Efinity® IDE; the IP settings files are provided in the ip directory of each project folder.


How many resources are consumed by Efinix TinyML designs?

Resource utilization tables compiled for the Efinix Titanium® Ti60F225 device using Efinity® IDE v2025.2 are as follows.

Resource utilization for TinyML Hello World design:

| Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
|---|---|---|---|---|---|---|
| TinyML Hello World (Total) | 60233 | 31621 | 6747 | 38254 | 177 | 52 |
| RISC-V SoC | - | 6841 | 696 | 5535 | 43 | 4 |
| DMA Controller | - | 3563 | 449 | 4915 | 19 | 0 |
| HyperRAM Controller Core | - | 879 | 170 | 853 | 27 | 0 |
| Hardware Accelerator* (Dummy) | - | 98 | 64 | 51 | 0 | 2 |
| Efinix TinyML Accelerator | - | 20071 | 5348 | 26005 | 87 | 46 |

Resource utilization for Edge Vision TinyML Yolo Person Detection Demo design:

| Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
|---|---|---|---|---|---|---|
| Person Detection Demo (Total) | 57100 | 27773 | 5935 | 37658 | 239 | 29 |
| RISC-V SoC | - | 6892 | 703 | 5575 | 43 | 4 |
| DMA Controller | - | 4434 | 517 | 6010 | 34 | 0 |
| HyperRAM Controller Core | - | 879 | 170 | 852 | 27 | 0 |
| CSI-2 RX Controller Core | - | 932 | 161 | 1861 | 15 | 0 |
| DSI TX Controller Core | - | 1954 | 488 | 3428 | 25 | 0 |
| Camera | - | 778 | 903 | 569 | 11 | 0 |
| Display | - | 338 | 173 | 288 | 8 | 0 |
| Display Annotator | - | 1167 | 36 | 3496 | 4 | 0 |
| Hardware Accelerator* | - | 334 | 153 | 150 | 4 | 2 |
| Efinix TinyML Accelerator | - | 9705 | 2590 | 14135 | 63 | 15 |

Resource utilization tables compiled for the Efinix Titanium® Ti180J484 device using Efinity® IDE v2025.2 are as follows.

Resource utilization for TinyML Hello World design:

| Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
|---|---|---|---|---|---|---|
| TinyML Hello World (Total) | 163990 | 78080 | 20137 | 108967 | 555 | 220 |
| RISC-V SoC | - | 11965 | 761 | 7628 | 87 | 4 |
| DMA Controller | - | 7820 | 955 | 14235 | 49 | 0 |
| Hardware Accelerator* (Dummy) | - | 349 | 168 | 141 | 4 | 2 |
| Efinix TinyML Accelerator | - | 57786 | 18233 | 84899 | 414 | 214 |

Resource utilization for Edge Vision TinyML Yolo Person Detection Demo design:

| Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
|---|---|---|---|---|---|---|
| Person Detection Demo (Total) | 156865 | 70812 | 21723 | 103022 | 652 | 181 |
| RISC-V SoC | - | 12072 | 768 | 7755 | 87 | 4 |
| DMA Controller | - | 8687 | 1027 | 15275 | 65 | 0 |
| CSI-2 RX Controller Core | - | 700 | 126 | 1421 | 17 | 0 |
| Camera | - | 743 | 924 | 601 | 11 | 0 |
| Display | - | 673 | 224 | 335 | 45 | 0 |
| Display Annotator | - | 1167 | 36 | 3440 | 4 | 0 |
| Hardware Accelerator* | - | 350 | 169 | 143 | 4 | 2 |
| Efinix TinyML Accelerator | - | 46150 | 18429 | 71699 | 417 | 175 |

Resource utilization tables compiled for the Efinix Titanium® Ti375C529 device using Efinity® IDE v2025.2 are as follows.

Resource utilization for Edge Vision TinyML Multicore Demo design:

| Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
|---|---|---|---|---|---|---|
| Person Detection Demo (Total) | 231241 | 104766 | 45563 | 143375 | 972 | 387 |
| Sapphire HP SoC Slb | - | 1024 | 245 | 1036 | 4 | 0 |
| DMA Controller | - | 8687 | 1129 | 14981 | 65 | 0 |
| CSI-2 RX Controller Core | - | 930 | 157 | 1815 | 15 | 0 |
| Camera | - | 792 | 919 | 602 | 11 | 0 |
| Display | - | 673 | 224 | 338 | 45 | 0 |
| Display Annotator | - | 2507 | 45 | 3575 | 20 | 0 |
| Hardware Accelerator* | - | 523 | 199 | 268 | 16 | 4 |
| Efinix TinyML Accelerator | - | 89371 | 42583 | 120372 | 795 | 382 |

* The hardware accelerator consists of pre-processing blocks for inference. For the MobileNetv1 Person Detection Demo design, the pre-processing blocks are image downscaling, RGB-to-grayscale conversion, and grayscale pixel packing. Refer to defines.v of the respective design for the TinyML accelerator configuration.

Note: Resource values may vary from compile to compile due to place-and-route results and RTL updates. The tables above are provided for reference only.


Why does compiling the provided example designs with the Efinity RISC-V Embedded Software IDE fail?

The Sapphire RISC-V SoC IP must first be generated using the IP Manager in the Efinity® IDE. The software-related contents of the RISC-V SoC IP are generated into the embedded_sw folder.


Why does compiling the Efinix TinyML Accelerator in a freshly created Efinity project give errors?

A freshly created Efinity project defaults its Verilog version to verilog_2k, whereas the syntax used in the Efinix TinyML accelerator is based on SystemVerilog 2009. Update the Efinity project setting accordingly: Project Editor -> Design tab -> Default Version -> Verilog -> SystemVerilog2009.


How to compile the AI inference software app with different optimization levels?

By default, the software app is compiled with the -O3 flag, which is optimized for speed. To change the optimization level, modify the corresponding option in the software app Makefile.

DEBUG    //for the -O0 flag, the default debugging option
DEBUG_OG //for the -Og flag, the optimized debugging option
BENCH    //for the -O3 flag, the speed-optimized option

Where are AI training and quantization scripts located?

AI model training and quantization scripts are located in model_zoo directory. Refer to model_zoo directory for more details regarding AI models, training and quantization.


How to use the outputs generated from the model zoo training and quantization flow for inference?

Two output files are generated from the training and post-training quantization flow: <architecture>_<application>_model_data.h and <architecture>_<application>_model_data.cc. These files contain the data of the quantized model. In the provided example/demo designs, they are placed in the <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/model folder.

The model data header is included in main.cc in the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src directory. The model data is assigned to the TFLite interpreter through the command below:

   model = tflite::GetModel(<architecture>_<application>_model_data);

How to selectively turn on or off particular layer accelerators?

By default, the provided example/demo designs have the Efinix TinyML accelerator enabled; the configuration is set in tinyml_core0_define.v for single-core designs, and in tinyml_core0_define.v, tinyml_core1_define.v, tinyml_core2_define.v, and tinyml_core3_define.v for multi-core designs. The *define.v file(s) can be generated using the Efinix TinyML Generator. A read-back mechanism is implemented in software to automatically match the hardware accelerator configuration set in *define.v.

To selectively turn certain layer accelerators on or off, use the Efinix TinyML Generator to disable parts of the Efinix TinyML accelerator in hardware. Alternatively, in <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/platform/tinyml/accel_settings.cc, set the relevant .*_en variables to 0 to turn off particular layer accelerators and set .override_flag to 1 to overwrite the accelerator settings in software. Below is an example of turning off all layer accelerators on core 0 ([0]) in accel_settings.cc.

   [0] = {.cache_en = 0,
          .conv_depthw_en = 0,
          .add_en = 0,
          .fc_en = 0,
          .mul_en = 0,
          .lr_en = 0,
          .min_max_en = 0,
          .reshape_en = 0,
          .override_flag = 1},  // Core 0

How do accelerators work in a multi-core design?

When running a multi-core design, the accelerator configuration for each of the four available cores must be set through the tinyml_core0_define.v, tinyml_core1_define.v, tinyml_core2_define.v, and tinyml_core3_define.v design files.

To do so, in the Efinix TinyML Generator, select the MULTICORE option for the CPU CORE parameter and select the CPU ID on which to deploy the accelerators, where the CPU ID corresponds to each of the available CPU cores (0, 1, 2, and 3).


How to perform profiling of an AI model running on RISC-V?

To perform profiling, i.e., to determine the execution time of a quantized AI model running on RISC-V, make the following modification in main.cc of the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src directory to enable the profiler.

   //error_reporter, nullptr); //Without profiler
   error_reporter, &prof);     //With profiler

Build and run the software app of interest; the profiling result is printed on the UART terminal.

The profiling convention is as follows:

<layer_number>; <layer_name>; <layer_mode>; <execution_time_in_ms>

A sample of the printed profiling output is shown below:

0; CONV_2D; STANDARD; 30ms
1; MUL; STANDARD; 6ms
2; MAXIMUM; STANDARD; 5ms
3; DEPTHWISE_CONV_2D; STANDARD; 41ms
4; CONV_2D; STANDARD; 9ms
5; ADD; STANDARD; 5ms
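The printed profiling lines can also be post-processed on the host. The sketch below is a hypothetical helper (not part of the repository) that assumes the `<layer_number>; <layer_name>; <layer_mode>; <execution_time_in_ms>` convention shown above and totals the per-layer execution time:

```python
# Host-side helper sketch for UART profiling output following the convention
# "<layer_number>; <layer_name>; <layer_mode>; <execution_time_in_ms>".
def parse_profile(log: str):
    """Return a list of (layer_number, layer_name, layer_mode, time_ms) tuples."""
    layers = []
    for line in log.strip().splitlines():
        num, name, mode, time = [field.strip() for field in line.split(";")]
        layers.append((int(num), name, mode, int(time.rstrip("ms"))))
    return layers

def total_time_ms(log: str) -> int:
    """Sum the execution time of all profiled layers, in milliseconds."""
    return sum(layer[3] for layer in parse_profile(log))

# The sample profiling output shown above.
sample = """\
0; CONV_2D; STANDARD; 30ms
1; MUL; STANDARD; 6ms
2; MAXIMUM; STANDARD; 5ms
3; DEPTHWISE_CONV_2D; STANDARD; 41ms
4; CONV_2D; STANDARD; 9ms
5; ADD; STANDARD; 5ms"""

print(total_time_ms(sample))  # total inference time of the sample: 96 (ms)
```

Such a helper makes it easy to compare total and per-layer times before and after enabling individual layer accelerators.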

How to boot a complete TinyML design from flash?

A complete TinyML design consists of hardware/RTL (FPGA bitstream) and software/firmware (software binary). The FPGA bitstream is generated by Efinity® IDE compilation, whereas the software binary is generated by Efinity RISC-V Embedded Software IDE compilation. By default, a RISC-V bootloader copies a 16MB user binary from flash to the main memory for execution upon boot-up.

Refer to the Copy a User Binary to Flash (Efinity Programmer) section of the EVSoC User Guide for steps to combine the FPGA bitstream and user application binary using the Efinity Programmer, as well as to boot the design from flash.

Note: The RISC-V application binary address for Ti375C529 Efinix Vision TinyML demo designs is 0x0050_0000.


How to modify Efinix Vision TinyML demo designs to use Raspberry Pi Camera v3 instead of Raspberry Pi Camera v2?

  • To use Raspberry Pi Camera v3, add the following line in main.cc:
#define PICAM_VERSION 3

How to run static input inference on a different test image with provided example quantized models?

In the provided TinyML Hello World example designs, the test image input data for static inference is defined in header files placed in the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/model folder. For example, quant_airplane.h and quant_bird.h contain the airplane and bird test images, respectively, for the ResNet image classification model.

The test image data header is included in main.cc in the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src directory. The image data is assigned to the TFLite interpreter input through the command below:

   for (unsigned int i = 0; i < quant_airplane_dat_len; ++i)
      model_input->data.int8[i] = quant_airplane_dat[i];

To use different test input data for inference, create a header file that contains the corresponding input data. For inference with image input, the input data is typically the grayscale or RGB pixel data of the test image. The input colour format, total data size, data type, etc., are determined during the AI model training/quantization stage. It is important to ensure the provided test data fulfils the input requirements of the quantized AI model used for inference.
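For an int8-quantized model, raw 8-bit pixel values are typically mapped into the model's input range using the scale and zero point recorded during quantization. The helper below is an illustrative sketch of this standard TFLite affine mapping; the actual scale and zero-point values must be read from the quantized model's input tensor, and the parameter values shown are hypothetical:

```python
def quantize_pixels(pixels, scale, zero_point):
    """Map raw pixel values to int8 using TFLite-style affine quantization:
    q = round(p / scale) + zero_point, clamped to the int8 range."""
    quantized = []
    for p in pixels:
        q = round(p / scale) + zero_point
        quantized.append(max(-128, min(127, q)))
    return quantized

# Hypothetical parameters that map [0, 255] grayscale pixels onto [-128, 127];
# real values come from the quantized model's input tensor.
print(quantize_pixels([0, 128, 255], scale=1.0, zero_point=-128))  # [-128, 0, 127]
```

The resulting values can then be written out as a header-file array in the same format as quant_airplane.h and quant_bird.h.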


How to add user-defined accelerator?

The RISC-V custom instruction interface includes a 10-bit function ID signal, so up to 1024 custom instructions can be implemented. As coded in the tinyml_top module (<proj_directory>/source/tinyml/tinyml_top.v), function IDs with MSB 0 (up to 512 custom instructions) are reserved for the Efinix TinyML accelerator, whereas the remaining function IDs can be used to implement user-defined accelerators as the application requires.
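The function ID split described above can be sketched as follows; the helper name and constants are illustrative, not taken from the RTL:

```python
FUNCTION_ID_BITS = 10        # width of the custom instruction function ID
USER_DEFINED_MSB = 1 << 9    # MSB of the 10-bit function ID

def is_user_defined(function_id: int) -> bool:
    """True if the function ID falls in the user-defined accelerator space
    (MSB = 1); MSB = 0 is reserved for the Efinix TinyML accelerator."""
    assert 0 <= function_id < (1 << FUNCTION_ID_BITS)
    return bool(function_id & USER_DEFINED_MSB)

# 512 IDs in each space: 0x000-0x1FF reserved, 0x200-0x3FF for user accelerators.
print(is_user_defined(0x1FF), is_user_defined(0x200))  # False True
```

A user-defined accelerator added in tinyml_top should therefore decode only function IDs with the MSB set, leaving the reserved space untouched.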

To demonstrate how to add a user-defined accelerator, a minimum/maximum Lite accelerator example is provided in tinyml_hello_world/<proj_directory>/ref_files/user_def_accelerator. To use it:

  1. Copy the files in the hardware folder to <proj_directory>/source/tinyml.
  2. Copy the files in the software folder to <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/tensorflow/lite/kernels/internal/reference.
  3. Compile the hardware using the Efinity® IDE, build the software using the Efinity RISC-V Embedded Software IDE, and run the application.

How to customize Efinix TinyML accelerator for different resource-performance trade-offs?

A GUI-based Efinix TinyML Generator is provided to facilitate the customization of Efinix TinyML Accelerator.

The Efinix TinyML Accelerator supports two modes, which are customizable per layer type:

  1. Lite mode - a lightweight accelerator that consumes fewer resources.
  2. Standard mode - a high-performance accelerator that consumes more resources.

How to train and quantize a different AI model for running on Efinix TinyML platform?

Refer to the Efinix Model Zoo for examples of how to use the training and quantization scripts based on different training frameworks and datasets. The training and quantization examples are provided as Jupyter Notebooks, which run on Google Colab. To make use of the produced quantized model data for inference purposes, refer to this FAQ.

If the user has their own pre-trained network (floating-point model), the training stage can be skipped. Proceed with model quantization and conversion of the .tflite quantized model to the corresponding .h and .cc files for inference purposes.
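The .tflite-to-C-array conversion mentioned above is conventionally done with a tool such as `xxd -i`; the sketch below is a minimal Python equivalent (the function and array names are illustrative, not from the repository):

```python
def tflite_to_c_array(model_bytes: bytes, name: str) -> str:
    """Render a .tflite flatbuffer as C source defining <name> and <name>_len,
    mirroring the model_data.cc files used by the example designs."""
    body = ",\n  ".join(
        ", ".join(f"0x{b:02x}" for b in model_bytes[i:i + 12])
        for i in range(0, len(model_bytes), 12)
    )
    return (
        f"const unsigned char {name}[] = {{\n  {body}\n}};\n"
        f"const unsigned int {name}_len = {len(model_bytes)};\n"
    )

# Toy payload standing in for the contents of a model.tflite file.
data = bytes(range(4))
print(tflite_to_c_array(data, "example_model_data"))
```

The matching .h file then only needs the corresponding extern declarations of the array and its length.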


How to run inference with a different quantized AI model using Efinix TinyML platform?

Refer to this FAQ for training and quantization of a different AI model. To test the quantized model, it is recommended to first run inference of the targeted model using the TinyML Hello World design, which takes in static input data. In addition, it is recommended to run inference in pure software mode, i.e., with the TinyML accelerator disabled (refer to this FAQ), as this helps isolate potential setting/design issues to either software (TFLite Micro library and inference setup) or hardware (TinyML accelerator).

With the TinyML accelerator disabled (pure software inference), some adjustments may be required to run a different AI model, as the overall model size, layers/operations, input/output format, normalization, etc., may vary between AI models. The following are some tips for making the necessary adjustments:

  • Refer to this FAQ on how to include quantized model for inference purposes.
  • Refer to this FAQ on how to include a different test input data.
  • If the Allocate Tensor Failed error message appears on the UART terminal during inference execution, adjust the tensor arena size in main.cc.
  • If the Insufficient memory region size allocated error message appears when building the project in the Efinity RISC-V Embedded Software IDE, adjust the Application Region Size parameter of the Sapphire SoC IP accordingly using the Efinity® IDE IP Manager. It is important to ensure the adjusted Application Region Size does not exceed the external RAM size.

After inference runs successfully with the targeted AI model (producing the expected inference score/output) in pure software mode, the Efinix TinyML accelerator can be enabled for hardware speed-up. Refer to the Efinix TinyML Generator for enabling and customizing the Efinix TinyML accelerator for the targeted model.


How to implement a TinyML solution using Efinix TinyML platform?

To implement a TinyML solution for a vision application, make use of the Efinix Edge Vision TinyML framework. For more details about the flexible, domain-specific Edge Vision SoC framework, visit the Edge Vision SoC webpage. Furthermore, refer to the provided demo designs based on the Edge Vision TinyML framework for the interfacing and integration of a working vision AI system with camera and display.

  • Refer to this FAQ for training and quantization of an AI model.
  • Refer to this FAQ for running inference with a quantized AI model on Efinix TinyML platform.

My Efinix TinyML design fails to compile due to timing errors. What should I do?

In the Efinity software:

  1. Select File -> Edit Project -> Place and Route.
  2. Adjust the following parameters to get a better timing outcome:
    • optimization_level
    • seed
    • placer_effort_level

Refer to Efinity Timing Closure User Guide for more details on these options and how to optimize timing closure.