| 1 | +--- |
| 2 | +title: "Ethos-U85 NPU Applications with TOSA Model Explorer: Exploring Next-Gen Edge AI Inference" |
| 3 | +description: "Push the limits of Edge AI by deploying the heaviest inference applications possible on Ethos-U85. Students will explore transformer-based and TOSA-optimized workloads that demonstrate the performance levels of the next generation of Ethos NPUs." |
| 4 | +subjects: |
| 5 | + - "ML" |
| 6 | + - "Performance and Architecture" |
| 7 | +requires-team: |
| 8 | + - "No" |
| 9 | +platform: |
| 10 | + - "IoT" |
| 11 | + - "Embedded and Microcontrollers" |
| 12 | + - "AI" |
| 13 | +sw-hw: |
| 14 | + - "Software" |
| 15 | + - "Hardware" |
| 16 | +support-level: |
| 17 | + - "Self-Service" |
| 18 | + - "Arm Ambassador Support" |
| 19 | +publication-date: 2025-11-27 |
| 20 | +license: |
| 21 | +status: |
| 22 | + - "Published" |
| 23 | +donation: |
| 24 | +--- |
| 25 | + |
| 26 | + |
| 27 | + |
| 28 | +## Description |
| 29 | + |
| 30 | +**Why is this important?** |
| 31 | + |
| 32 | +The Arm Ethos-U85 NPU represents a major leap in bringing *heavy inference* to constrained embedded systems. With its full transformer operator support, expanded MAC throughput, and native TOSA compatibility, the Ethos-U85 enables developers to deploy models and workloads that were previously too intensive for MCU-class devices. |
| 33 | + |
| 34 | +This project challenges you to explore the boundaries of what’s possible on Ethos-U85. The goal is to demonstrate inference performance and model complexity that are now achievable thanks to the architectural improvements and transformer acceleration capabilities of the Ethos-U85. |
| 35 | + |
| 36 | +[Ethos-U85 Launch](https://newsroom.arm.com/blog/ethos-u85) |
| 37 | + |
| 38 | +**Project Summary** |
| 39 | + |
| 40 | +Using hardware such as the Alif Ensemble E4/E6/E8 DevKits (all of which include an Ethos-U85), a comparable platform, or the Arm Corstone-320 Fixed Virtual Platform, your task is to design and benchmark an advanced edge inference application that exploits the Ethos-U85’s compute and transformer capabilities. |
| 41 | + |
| 42 | +Your project should include: |
| 43 | + |
| 44 | +1. Model Deployment and Optimization |
| 45 | + Select a computationally intensive model — ideally transformer-based or multi-branch convolutional — and deploy it on the Ethos-U85 using: |
| 46 | + - The TOSA Model Explorer extension to inspect and adapt unsupported or experimental models for TOSA compliance. |
| 47 | + - The Vela compiler for optimization. |
| 48 | + |
| 49 | + These tools can be used to: |
| 50 | + - Convert and visualize model graphs in TOSA format. |
| 51 | + - Identify unsupported operators. |
| 52 | + - Modify or substitute layers for compatibility using the Flatbuffers schema before re-exporting. |
| 53 | + - Run Vela for optimized compilation targeting Ethos-U85. |
| 54 | + |
| 55 | +2. Application Demonstration |
| 56 | + Implement a working example that highlights the Ethos-U85’s strengths in real-world inference. Possible categories include: |
| 57 | + - Transformers on Edge: lightweight BERT, ViT, or audio transformers (e.g. speech or sound event classification). |
| 58 | + - High-resolution Vision: semantic segmentation, object detection on large input sizes, or multi-head perception networks. |
| 59 | + - Multi-modal Fusion: combining audio, image, or sensor streams for contextual understanding. |
| 60 | + |
| 61 | +3. Analysis and Benchmarking |
| 62 | + Report quantitative results on: |
| 63 | + - Inference latency, throughput (FPS or tokens/s), and memory footprint. |
| 64 | + - Power efficiency under load (optional). |
| 65 | + - Comparative performance versus Ethos-U55/U65 (use available benchmarks for reference or utilise the other Ethos-U NPUs provided in the Alif DevKits). |
| 66 | + - The effect of TOSA optimization — demonstrate measurable improvements from graph conversion and operator fusion. |
| 67 | + |
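The latency and throughput figures requested above can be derived from raw per-inference timings with a few lines of Python. The following is a minimal sketch, not a prescribed method: the function names `summarize_latencies` and `speedup` are hypothetical, the timings are assumed to have already been collected on the target (for example from a cycle counter or timer around each inference), and the percentile uses a simple nearest-rank rule.

```python
import statistics


def summarize_latencies(latencies_ms, items_per_inference=1):
    """Summarize per-inference latencies given in milliseconds.

    items_per_inference is the number of frames or tokens produced by
    one inference, so throughput comes out in FPS or tokens/s.
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    mean_ms = statistics.fmean(ordered)
    # Nearest-rank 95th percentile, computed with integer arithmetic
    # (ceil(0.95 * n)) to avoid floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    p95_ms = ordered[rank - 1]
    throughput_per_s = items_per_inference * 1000.0 / mean_ms
    return {
        "mean_ms": mean_ms,
        "p95_ms": p95_ms,
        "throughput_per_s": throughput_per_s,
    }


def speedup(baseline_mean_ms, candidate_mean_ms):
    """Return how many times faster the candidate is than the baseline."""
    if candidate_mean_ms <= 0:
        raise ValueError("candidate latency must be positive")
    return baseline_mean_ms / candidate_mean_ms


if __name__ == "__main__":
    # Placeholder timings for illustration only; these are not measurements.
    u85_samples_ms = [12.0, 11.5, 12.5, 12.0]
    u55_samples_ms = [48.0, 47.0, 49.0, 48.0]
    u85 = summarize_latencies(u85_samples_ms)
    u55 = summarize_latencies(u55_samples_ms)
    print(u85)
    print("speedup vs baseline:", speedup(u55["mean_ms"], u85["mean_ms"]))
```

Reporting a tail percentile alongside the mean helps show whether latency is stable under load, which matters when comparing against Ethos-U55/U65 baselines measured under different conditions.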
| 68 | +--- |
| 69 | + |
| 70 | +## What kind of projects should you target? |
| 71 | + |
| 72 | +To clearly demonstrate the leap from Ethos-U55/U65 to U85, choose projects that meet at least one of the following criteria: |
| 73 | + |
| 74 | +- Transformer-heavy architectures: e.g. attention blocks, transformer encoders, ViTs, or hybrid CNN+transformer models. |
| 75 | + - *Example:* an audio event detection transformer that must process longer sequences or higher-resolution spectrograms. |
| 76 | +- High-resolution or multi-branch networks: models with high input dimensionality or multiple processing paths that saturate NPU throughput. |
| 77 | + - *Example:* 512×512 semantic segmentation or multi-object detection. |
| 78 | +- Dense post-processing or large fully connected layers: cases where U55/U65 memory limits or MAC bandwidth previously restricted performance. |
| 79 | + - *Example:* large MLP heads or transformer token mixers. |
| 80 | +- Multi-modal pipelines: combining multiple sensor inputs (e.g. image + IMU + audio) where the NPU must maintain concurrency or shared intermediate representations. |
| 81 | + |
| 82 | +The Ethos-U85 is ideal for projects where model performance is constrained by attention layers, large activations, or operator types that previously required fallback to the CPU. Use the Ethos-U85 to eliminate those fallbacks and achieve full-NPU execution of advanced topologies. |
| 83 | + |
| 84 | +--- |
| 85 | + |
| 86 | +## What will you use? |
| 87 | +You should be familiar with, or willing to learn about: |
| 88 | +- Programming: Python, C/C++ |
| 89 | +- ExecuTorch or TensorFlow Lite (Micro/LiteRT) |
| 90 | +- Techniques for optimising AI models for the edge (quantization, pruning, etc.) |
| 91 | +- Optimization Tools: |
| 92 | + - TOSA Model Explorer |
| 93 | + - .tflite to .tosa converter (if using TensorFlow rather than ExecuTorch) |
| 94 | + - Vela compiler for Ethos-U |
| 95 | +- Bare-metal or RTOS (e.g., Zephyr) |
| 96 | + |
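As an illustration of where the Vela compiler fits, the sketch below builds a Vela command line from Python and runs it only if `vela` is installed. The helper name `build_vela_command`, the model path `model.tflite`, the output directory `vela_out`, and the choice of the `ethos-u85-256` configuration are all assumptions for this example; check the Vela documentation for the configuration that matches your hardware.

```python
import shutil
import subprocess


def build_vela_command(model_path, accelerator="ethos-u85-256",
                       output_dir="vela_out"):
    """Return the argument list for a basic Vela compilation.

    The accelerator value is an assumption; pick the MAC configuration
    that matches the Ethos-U85 on your board or FVP.
    """
    return [
        "vela",
        model_path,
        "--accelerator-config", accelerator,
        "--output-dir", output_dir,
    ]


if __name__ == "__main__":
    cmd = build_vela_command("model.tflite")  # hypothetical model file
    print(" ".join(cmd))
    # Only invoke the compiler when it is available on PATH.
    if shutil.which("vela") is not None:
        subprocess.run(cmd, check=True)
```

Keeping the command in a small helper makes it easy to sweep accelerator configurations when comparing results across Ethos-U variants.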
| 97 | +--- |
| 98 | + |
| 99 | +## Resources from Arm and our partners |
| 100 | +- Arm Developer: [Edge AI](https://developer.arm.com/edge-ai) |
| 101 | +- Learning Path: [Navigating Machine Learning with Ethos-U processors](https://learn.arm.com/learning-paths/microcontrollers/nav-mlek/) |
| 102 | +- Repository: [AI on Arm course](https://github.com/arm-university/AI-on-Arm) |
| 103 | +- Example Board: [Alif Ensemble DevKit E8](https://www.keil.arm.com/boards/alif-semiconductor-devkit-e8-gen-1-2558a7b/features/) |
| 104 | +- Documentation: [TOSA Specification](https://www.mlplatform.org/tosa/), [TOSA Model Explorer](https://github.com/arm/tosa-adapter-model-explorer), and [TOSA Reference Model](https://gitlab.arm.com/tosa/tosa-reference-model) |
| 105 | +- PyTorch Blog: [ExecuTorch support for Ethos-U85](https://pytorch.org/blog/pt-executorch-ethos-u85/) |
| 106 | +--- |
| 107 | + |
| 108 | +## Support Level |
| 109 | + |
| 110 | +This project is designed to be self-serve but comes with the opportunity for some community support from Arm Ambassadors, who are part of the Arm Developer program. If you are not already part of our program, [click here to join](https://www.arm.com/resources/developer-program?#register). |
| 111 | + |
| 112 | +## Benefits |
| 113 | + |
| 114 | +Standout project contributions will result in digital badges for CV building, recognised by Arm Talent Acquisition. We are currently discussing with national agencies the potential for funding streams for Arm Developer Labs projects, which would flow to you, not us. |
| 115 | + |
| 116 | + |
| 117 | +To receive the benefits, you must show us your project through our [online form](https://forms.office.com/e/VZnJQLeRhD). Please do not include any confidential information in your contribution. Additionally, if you are affiliated with an academic institution, please ensure you have the right to share your material. |