Discover how organizations are leveraging ExecuTorch to deploy AI models at scale.

---

## Featured Success Stories

::::{grid} 1
:gutter: 3

:::{grid-item-card} **Meta's Family of Apps**
:class-header: bg-primary text-white

**Industry:** Social Media & Messaging
**Hardware:** Android & iOS Devices
**Impact:** Billions of users, latency reduction

Powers Instagram, WhatsApp, Facebook, and Messenger with real-time on-device AI for content ranking, recommendations, and privacy-preserving features at scale.

[Read Blog →](https://engineering.fb.com/2025/07/28/android/executorch-on-device-ml-meta-family-of-apps/)
:::

:::{grid-item-card} **Liquid AI: On-Device Intelligence with LFM2**
:class-header: bg-info text-white

Liquid AI builds foundation models that make AI work where the cloud can't. In its LFM2 series, the team uses PyTorch ExecuTorch within the LEAP Edge SDK to deploy high-performance multimodal models efficiently across devices. ExecuTorch provides the flexibility to support custom architectures and processing pipelines while reducing inference latency through graph optimization and caching. Together, they enable faster, more efficient, privacy-preserving AI that runs entirely on the edge.

[Read Blog →](https://www.liquid.ai/blog/how-liquid-ai-uses-executorch-to-power-efficient-flexible-on-device-intelligence)
:::

:::{grid-item-card} **PrivateMind: Complete Privacy with On-Device AI**
:class-header: bg-warning text-white

**Industry:** Privacy & Personal Computing
**Hardware:** iOS & Android Devices
**Impact:** 100% on-device processing

PrivateMind delivers a fully private AI assistant using ExecuTorch's .pte format. Built with React Native ExecuTorch, it supports LLaMA, Qwen, Phi-4, and custom models with offline speech-to-text and PDF chat capabilities.

[Visit →](https://privatemind.swmansion.com)
:::

:::{grid-item-card} **NimbleEdge: On-Device Agentic AI Platform**
:class-header: bg-danger text-white

**Industry:** AI Infrastructure
**Hardware:** iOS & Android Devices
**Impact:** 30% higher TPS on iOS, faster time-to-market with Qwen/Gemma models

NimbleEdge integrated ExecuTorch with its open-source DeliteAI platform to enable agentic workflows orchestrated in Python on mobile devices. The extensible ExecuTorch ecosystem allowed NimbleEdge to implement on-device optimization techniques that leverage contextual sparsity. ExecuTorch also significantly accelerated the release of "NimbleEdge AI" for iOS, enabling models like Qwen 2.5 with tool-calling support and achieving up to 30% higher transactions per second.
:::

:::{grid-item-card} **torchao**
:class-header: bg-success text-white

PyTorch-native quantization and optimization library for preparing efficient models for ExecuTorch deployment.

[Blog →](https://pytorch.org/blog/torchao-quantized-models-and-quantization-recipes-now-available-on-huggingface-hub/) • [Qwen Example →](https://huggingface.co/pytorch/Qwen3-4B-INT8-INT4) • [Phi Example →](https://huggingface.co/pytorch/Phi-4-mini-instruct-INT8-INT4)
:::

:::{grid-item-card} **Unsloth**
:class-header: bg-secondary text-white

Optimize LLM fine-tuning with faster training and reduced VRAM usage, then deploy efficiently with ExecuTorch.

[Example Model →](https://huggingface.co/metascroy/Llama-3.2-1B-Instruct-int8-int4)
:::

::::

---

## Featured Demos

- **Text and Multimodal LLM demo mobile apps** - Text (Llama, Qwen3, Phi-4) and multimodal (Gemma3, Voxtral) mobile demo apps. [Try →](https://github.com/meta-pytorch/executorch-examples/tree/main/llm)
- **Voxtral** - Deploy an audio-text-input LLM on CPU (via XNNPACK) and on CUDA. [Try →](https://github.com/pytorch/executorch/blob/main/examples/models/voxtral/README.md)
- **LoRA adapter** - Export two LoRA adapters that share a single foundation weight file, saving memory and disk space. [Try →](https://github.com/meta-pytorch/executorch-examples/tree/main/program-data-separation/cpp/lora_example)
- **OpenVINO from Intel** - Deploy [Yolo12](https://github.com/pytorch/executorch/tree/main/examples/models/yolo12), [Llama](https://github.com/pytorch/executorch/tree/main/examples/openvino/llama), and [Stable Diffusion](https://github.com/pytorch/executorch/tree/main/examples/openvino/stable_diffusion) with [OpenVINO from Intel](https://www.intel.com/content/www/us/en/developer/articles/community/optimizing-executorch-on-ai-pcs.html).

*Want to showcase your demo? [Submit here →](https://github.com/pytorch/executorch/issues)*