<div align="center">
  <img src="docs/source/_static/img/et-logo.png" alt="ExecuTorch logo mark" width="200">
  <h1>ExecuTorch</h1>
  <p><strong>On-device AI inference powered by PyTorch</strong></p>
</div>

<div align="center">
  <a href="https://pypi.org/project/executorch/"><img src="https://img.shields.io/pypi/v/executorch?style=for-the-badge&color=blue" alt="PyPI - Version"></a>
  <a href="https://github.com/pytorch/executorch/graphs/contributors"><img src="https://img.shields.io/github/contributors/pytorch/executorch?style=for-the-badge&color=blue" alt="GitHub - Contributors"></a>
  <a href="https://github.com/pytorch/executorch/stargazers"><img src="https://img.shields.io/github/stars/pytorch/executorch?style=for-the-badge&color=blue" alt="GitHub - Stars"></a>
  <a href="https://discord.gg/Dh43CKSAdc"><img src="https://img.shields.io/badge/Discord-Join%20Us-blue?logo=discord&logoColor=white&style=for-the-badge" alt="Discord - Chat with Us"></a>
  <a href="https://docs.pytorch.org/executorch/main/index.html"><img src="https://img.shields.io/badge/Documentation-blue?logo=googledocs&logoColor=white&style=for-the-badge" alt="Documentation"></a>
</div>

**ExecuTorch** is PyTorch's unified solution for deploying AI models on-device, from smartphones to microcontrollers, built for privacy, performance, and portability. It powers Meta's on-device AI across **Instagram, WhatsApp, Quest 3, Ray-Ban Meta Smart Glasses**, and [more](https://docs.pytorch.org/executorch/main/success-stories.html).

Deploy **LLMs, vision, speech, and multimodal models** with the same PyTorch APIs you already know, taking models from research to production with seamless export, optimization, and deployment. No manual C++ rewrites. No format conversions. No vendor lock-in.

<details>
  <summary><strong>📘 Table of Contents</strong></summary>

- [Why ExecuTorch?](#why-executorch)
- [How It Works](#how-it-works)
- [Quick Start](#quick-start)
  - [Installation](#installation)
  - [Export and Deploy in 3 Steps](#export-and-deploy-in-3-steps)
  - [Run on Device](#run-on-device)
  - [LLM Example: Llama](#llm-example-llama)
- [Platform & Hardware Support](#platform--hardware-support)
- [Production Deployments](#production-deployments)
- [Examples & Models](#examples--models)
- [Key Features](#key-features)
- [Documentation](#documentation)
- [Community & Contributing](#community--contributing)
- [License](#license)

</details>

## Why ExecuTorch?

- **🔒 Native PyTorch Export** — Direct export from PyTorch. No .onnx, .tflite, or intermediate format conversions. Preserves model semantics.
- **⚡ Production-Proven** — Powers billions of users at [Meta with real-time on-device inference](https://engineering.fb.com/2025/07/28/android/executorch-on-device-ml-meta-family-of-apps/).
- **💾 Tiny Runtime** — 50 KB base footprint. Runs on everything from microcontrollers to high-end smartphones.
- **🚀 [12+ Hardware Backends](https://docs.pytorch.org/executorch/main/backends-overview.html)** — Open-source acceleration for Apple, Qualcomm, Arm, MediaTek, Vulkan, and more.
- **🎯 One Export, Multiple Backends** — Switch hardware targets with a single line change. Deploy the same model everywhere.

## How It Works

ExecuTorch uses **ahead-of-time (AOT) compilation** to prepare PyTorch models for edge deployment:

1. **🧩 Export** — Capture your PyTorch model graph with `torch.export()`
2. **⚙️ Compile** — Quantize, optimize, and partition to hardware backends → `.pte`
3. **🚀 Execute** — Load `.pte` on-device via the lightweight C++ runtime

Models use a standardized [Core ATen operator set](https://docs.pytorch.org/executorch/main/compiler-ir-advanced.html#intermediate-representation). [Partitioners](https://docs.pytorch.org/executorch/main/compiler-delegate-and-partitioner.html) delegate subgraphs to specialized hardware (NPU/GPU) with CPU fallback.

Learn more: [How ExecuTorch Works](https://docs.pytorch.org/executorch/main/intro-how-it-works.html) • [Architecture Guide](https://docs.pytorch.org/executorch/main/getting-started-architecture.html)

## Quick Start

### Installation

```bash
pip install executorch
```

For platform-specific setup (Android, iOS, embedded systems), see the [Quick Start](https://docs.pytorch.org/executorch/main/quick-start-section.html) documentation.

### Export and Deploy in 3 Steps

```python
import torch
from executorch.exir import to_edge_transform_and_lower
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner

# 1. Export your PyTorch model
model = MyModel().eval()
example_inputs = (torch.randn(1, 3, 224, 224),)
exported_program = torch.export.export(model, example_inputs)

# 2. Optimize for target hardware (switch backends with one line)
program = to_edge_transform_and_lower(
    exported_program,
    partitioner=[XnnpackPartitioner()],  # CPU; use CoreMLPartitioner() for iOS or QnnPartitioner() for Qualcomm
).to_executorch()

# 3. Save for deployment
with open("model.pte", "wb") as f:
    f.write(program.buffer)

# Optional: test locally via the ExecuTorch runtime's pybind API
from executorch.runtime import Runtime

runtime = Runtime.get()
method = runtime.load_program("model.pte").load_method("forward")
outputs = method.execute([torch.randn(1, 3, 224, 224)])
```

### Run on Device

**[C++](https://docs.pytorch.org/executorch/main/using-executorch-cpp.html)**
```cpp
#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

Module module("model.pte");
auto tensor = make_tensor_ptr({2, 2}, {1.0f, 2.0f, 3.0f, 4.0f});
auto outputs = module.forward({tensor});
```

**[Swift (iOS)](https://docs.pytorch.org/executorch/main/ios-section.html)**
```swift
let module = Module(filePath: "model.pte")
let input = Tensor<Float>([1.0, 2.0, 3.0, 4.0])
let outputs: [Value] = try module.forward([input])
```

**[Kotlin (Android)](https://docs.pytorch.org/executorch/main/android-section.html)**
```kotlin
val module = Module.load("model.pte")
val inputTensor = Tensor.fromBlob(floatArrayOf(1.0f, 2.0f, 3.0f, 4.0f), longArrayOf(2, 2))
val outputs = module.forward(EValue.from(inputTensor))
```

### LLM Example: Llama

Export Llama models using the [`export_llm`](https://docs.pytorch.org/executorch/main/llm/export-llm.html) script or [Optimum-ExecuTorch](https://github.com/huggingface/optimum-executorch):

```bash
# Using export_llm
python -m executorch.extension.llm.export.export_llm --model llama3_2 --output llama.pte

# Using Optimum-ExecuTorch
optimum-cli export executorch \
  --model meta-llama/Llama-3.2-1B \
  --task text-generation \
  --recipe xnnpack \
  --output_dir llama_model
```

Run on-device with the LLM runner API:

**[C++](https://docs.pytorch.org/executorch/main/llm/run-with-c-plus-plus.html)**
```cpp
#include <executorch/extension/llm/runner/text_llm_runner.h>

auto runner = create_llama_runner("llama.pte", "tiktoken.bin");
executorch::extension::llm::GenerationConfig config{
    .seq_len = 128, .temperature = 0.8f};
runner->generate("Hello, how are you?", config);
```

**[Swift (iOS)](https://docs.pytorch.org/executorch/main/llm/run-on-ios.html)**
```swift
let runner = TextRunner(modelPath: "llama.pte", tokenizerPath: "tiktoken.bin")
try runner.generate("Hello, how are you?", Config {
    $0.sequenceLength = 128
}) { token in
    print(token, terminator: "")
}
```

**Kotlin (Android)** — [API Docs](https://docs.pytorch.org/executorch/main/javadoc/org/pytorch/executorch/extension/llm/package-summary.html) • [Demo App](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo)
```kotlin
val llmModule = LlmModule("llama.pte", "tiktoken.bin", 0.8f)
llmModule.load()
llmModule.generate("Hello, how are you?", 128, object : LlmCallback {
    override fun onResult(result: String) { print(result) }
    override fun onStats(stats: String) { }
})
```

For multimodal models (vision, audio), use the [MultiModal runner API](extension/llm/runner), which extends the LLM runner to handle image and audio inputs alongside text. See the [Llava](examples/models/llava/README.md) and [Voxtral](examples/models/voxtral/README.md) examples.

See [examples/models/llama](examples/models/llama/README.md) for the complete workflow, including quantization, mobile deployment, and advanced options.

**Next Steps:**
- 📖 [Step-by-step tutorial](https://docs.pytorch.org/executorch/main/getting-started.html) — Complete walkthrough for your first model
- ⚡ [Colab notebook](https://colab.research.google.com/drive/1qpxrXC3YdJQzly3mRg-4ayYiOjC6rue3?usp=sharing) — Try ExecuTorch instantly in your browser
- 🤖 [Deploy Llama models](examples/models/llama/README.md) — LLM workflow with quantization and mobile demos

## Platform & Hardware Support

| **Platform**    | **Supported Backends**                              |
|-----------------|-----------------------------------------------------|
| Android         | XNNPACK, Vulkan, Qualcomm, MediaTek, Samsung Exynos |
| iOS             | XNNPACK, MPS, Core ML (Neural Engine)               |
| Linux / Windows | XNNPACK, OpenVINO, CUDA *(experimental)*            |
| macOS           | XNNPACK, MPS, Metal *(experimental)*                |
| Embedded / MCU  | XNNPACK, Arm Ethos-U, NXP, Cadence DSP              |

See the [Backend Documentation](https://docs.pytorch.org/executorch/main/backends-overview.html) for detailed hardware requirements and optimization guides.

## Production Deployments

ExecuTorch powers on-device AI at scale across Meta's family of apps, VR/AR devices, and partner deployments. [View success stories →](https://docs.pytorch.org/executorch/main/success-stories.html)

## Examples & Models

**LLMs:** [Llama 3.2/3.1/3](examples/models/llama/README.md), [Qwen 3](examples/models/qwen3/README.md), [Phi-4-mini](examples/models/phi_4_mini/README.md), [LiquidAI LFM2](examples/models/lfm2/README.md)

**Multimodal:** [Llava](examples/models/llava/README.md) (vision-language), [Voxtral](examples/models/voxtral/README.md) (audio-language)

**Vision/Speech:** [MobileNetV2](https://github.com/meta-pytorch/executorch-examples/tree/main/mv2), [DeepLabV3](https://github.com/meta-pytorch/executorch-examples/tree/main/dl3)

**Resources:** the [`examples/`](examples/) directory • [executorch-examples](https://github.com/meta-pytorch/executorch-examples) mobile demos • [Optimum-ExecuTorch](https://github.com/huggingface/optimum-executorch) for Hugging Face models

## Key Features

ExecuTorch provides advanced capabilities for production deployment:

- **Quantization** — Built-in support via [torchao](https://docs.pytorch.org/ao) for 8-bit, 4-bit, and dynamic quantization
- **Memory Planning** — Optimize memory usage with ahead-of-time allocation strategies
- **Developer Tools** — ETDump profiler, ETRecord inspector, and model debugger
- **Selective Build** — Strip unused operators to minimize binary size
- **Custom Operators** — Extend with domain-specific kernels
- **Dynamic Shapes** — Support variable input sizes with bounded ranges

See [Advanced Topics](https://docs.pytorch.org/executorch/main/advanced-topics-section.html) for quantization techniques, custom backends, and compiler passes.

## Documentation

- [**Documentation Home**](https://docs.pytorch.org/executorch/main/index.html) — Complete guides and tutorials
- [**API Reference**](https://docs.pytorch.org/executorch/main/api-section.html) — Python, C++, Java/Kotlin APIs
- [**Backend Integration**](https://docs.pytorch.org/executorch/main/backend-delegates-integration.html) — Build custom hardware backends
- [**Troubleshooting**](https://docs.pytorch.org/executorch/main/using-executorch-troubleshooting.html) — Common issues and solutions

## Community & Contributing

We welcome contributions from the community!

- 💬 [**GitHub Discussions**](https://github.com/pytorch/executorch/discussions) — Ask questions and share ideas
- 🎮 [**Discord**](https://discord.gg/Dh43CKSAdc) — Chat with the team and community
- 🐛 [**Issues**](https://github.com/pytorch/executorch/issues) — Report bugs or request features
- 🤝 [**Contributing Guide**](CONTRIBUTING.md) — Guidelines and codebase structure

## License

ExecuTorch is BSD licensed, as found in the [LICENSE](LICENSE) file.

<br><br>

---

<div align="center">
  <p><strong>Part of the PyTorch ecosystem</strong></p>
  <p>
    <a href="https://github.com/pytorch/executorch">GitHub</a> •
    <a href="https://docs.pytorch.org/executorch">Documentation</a>
  </p>
</div>