ZIP-CNN

ZIP-CNN: Design Space Exploration for CNN Implementation within a MCU

This is material and results for the paper "ZIP-CNN: Design Space Exploration for CNN Implementation within a MCU". The repository contains figures and the Python code used to train the CNN. The files .h5 contains the weights of the corresponding CNN. Several measurements were conducted on 3 microcontrollers units (STM32L496ZG - STM32F446ZE - STM32F746G) during an inference of CNN. Results are available in the measurements folder.

Feel free to send an email if there are questions.

If you use ZIP-CNN data in a scientific publication, we would appreciate using the following citations:

@article{10.1145/3691343,
author = {Garbay, Thomas and Hachicha, Khalil and Dobias, Petr and Pinna, Andrea and Hocine, Karim and Dron, Wilfried and Lusich, Pedro and Khalis, Imane and Granado, Bertrand},
title = {ZIP-CNN: Design Space Exploration for CNN Implementation within a MCU},
year = {2024},
issue_date = {January 2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {24},
number = {1},
issn = {1539-9087},
url = {https://doi.org/10.1145/3691343},
doi = {10.1145/3691343},
abstract = {Embedded systems based on Microcontroller Units (MCUs) often gather significant quantities of data and solve various issues. Convolutional Neural Networks (CNNs) have proven their effectiveness in solving computer vision and natural language processing tasks. However, implementing CNNs within MCUs is challenging due to their high inference costs, which varies widely depending on hardware targets and CNN topologies. Despite state-of-the-art advancements, no efficient design space exploration solutions handle the wide variety of implementation solutions. In this article, we introduce the ZIP-CNN design space exploration methodology, which facilitates CNN implementation within MCUs. We developed a model that quantitatively estimates the latency, energy consumption, and memory space required to run a CNN within an MCU. This model accounts for algorithmic reductions such as knowledge distillation, pruning, or quantization and applies to any CNN topology. To demonstrate the efficiency of our methodology, we investigated LeNet5, ResNet8, and ResNet26 within three different MCUs. We made materials and supplementary results available in a GitHub repository: . The proposed method was empirically verified on three hardware targets running at 14 different operating frequencies. The three CNN topologies investigated were implemented in their default configuration in FP32, and also reduced with INT8 quantization, pruning at five different rates and with knowledge distillation. The estimates of our model are very reliable with an error of 3.29\% to 15.23\% for latency, 3.12\% to 10.34\% for energy consumption, and 1.95\% to 6.31\% for memory space. These results are based on on-device measurements.},
journal = {ACM Trans. Embed. Comput. Syst.},
month = sep,
articleno = {5},
numpages = {26},
keywords = {Design space exploration, neural networks, microcontroller units, tiny machine learning}
}

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
Code_CNN		Code_CNN
Figures_article		Figures_article
Measurements		Measurements
Methodology_Zip-CNN		Methodology_Zip-CNN
Weights_CNN		Weights_CNN
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ZIP-CNN

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ZIP-CNN

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages