diff --git a/website/index.html b/website/index.html new file mode 100644 index 00000000000..1235f4cf215 --- /dev/null +++ b/website/index.html @@ -0,0 +1,1753 @@ + + +
+ + +Deploy PyTorch models directly to edge devices. Text, vision, and audio AI with privacy-preserving, real-time inference, no cloud required.
+ +The future of AI is at the edge, where privacy meets performance
+ +Data never leaves the device. Process personal content, conversations, and media locally without cloud exposure.
+Instant inference with no network round-trips. Perfect for AR/VR experiences, multimodal AI interactions, and responsive conversational agents.
+Zero network dependency for inference. Works seamlessly in low-bandwidth regions, remote areas, or completely offline.
+No cloud compute bills. No API rate limits. Scale to billions of users without infrastructure costs growing linearly.
+The convergence of efficient architectures and edge hardware creates new opportunities
+ ++ The opportunity is now: Foundation models have crossed the efficiency threshold. + Deploy sophisticated AI directly where data lives. +
+The technical challenges that made edge deployment complex... until now
+ +From battery-powered phones to energy-harvesting sensors, edge devices have strict power budgets. Microcontrollers may run on milliwatts, requiring extreme efficiency.
+Sustained inference generates heat without active cooling. From smartphones to industrial IoT devices, thermal throttling limits continuous AI workloads.
+Edge devices range from high-end phones to tiny microcontrollers. Beyond capacity, limited memory bandwidth creates bottlenecks when moving tensors between compute units.
+From microcontrollers to smartphone NPUs to embedded GPUs. Each architecture demands unique optimizations, making broad deployment across diverse form factors extremely challenging.
+But deploying PyTorch models to edge devices meant losing everything that made PyTorch great
+ +PyTorch's intuitive APIs and eager execution power breakthrough research
+Multiple intermediate formats, custom runtimes, C++ rewrites
+PyTorch operations don't map 1:1 to other formats
+Can't trace errors back to original PyTorch code
+Locked into proprietary formats with limited operator support
+Teams spend months rewriting Python models in C++ for production
+Direct export from PyTorch to edge. Core ATen operators preserved. No intermediate formats, no vendor lock-in.
+Optimize models offline for target device capabilities. Hardware-specific performance tuning before deployment.
+Pick and choose optimization steps. Composable at both compile-time and runtime for maximum flexibility.
+Fully open source with hardware partner contributions. Built on PyTorch's standardized IR and operator set.
+Portable C++ runtime runs on everything from microcontrollers to smartphones.
+Native integration with the PyTorch ecosystem, including torchao for quantization. Stay in familiar tools throughout.
+Export, optimize, and run PyTorch models on edge devices with just a few lines of code
+ +import torch
+from torch.export import export
+
+# Your existing PyTorch model
+model = MyModel().eval()
+example_inputs = (torch.randn(1, 3, 224, 224),)
+
+# Creates semantically equivalent graph representation
+exported_program = export(model, example_inputs)
+ + Switch between backends with a single line change +
+ +from executorch.exir import to_edge_transform_and_lower
+from executorch.backends.xnnpack import XnnpackPartitioner
+
+program = to_edge_transform_and_lower(
+ exported_program,
+ partitioner=[XnnpackPartitioner()]
+).to_executorch()
+ from executorch.exir import to_edge_transform_and_lower
+from executorch.backends.apple import CoreMLPartitioner
+
+program = to_edge_transform_and_lower(
+ exported_program,
+ partitioner=[CoreMLPartitioner()]
+).to_executorch()
+ from executorch.exir import to_edge_transform_and_lower
+from executorch.backends.qualcomm import QnnPartitioner
+
+program = to_edge_transform_and_lower(
+ exported_program,
+ partitioner=[QnnPartitioner()]
+).to_executorch()
+ # Save to .pte file
+with open("model.pte", "wb") as f:
+ f.write(program.buffer)
+ // Load and execute model
+auto module = Module("model.pte");
+auto method = module.load_method("forward");
+auto outputs = method.execute({input_tensor});
+// Access result tensors
+auto result = outputs[0].toTensor();
+ // Initialize ExecuTorch module
+let module = try ETModule(path: "model.pte")
+// Run inference with tensors
+let outputs = try module.forward([inputTensor])
+// Process results
+let result = outputs[0]
+ // Load model from assets
+val module = Module.load(assetFilePath("model.pte"))
+// Execute with tensor input
+val outputs = module.forward(inputTensor)
+// Extract prediction results
+val prediction = outputs[0].dataAsFloatArray
+ // Initialize ExecuTorch module
+ETModule *module = [[ETModule alloc] initWithPath:@"model.pte" error:nil];
+// Run inference with tensors
+NSArray *outputs = [module forwardWithInputs:@[inputTensor] error:nil];
+// Process results
+ETTensor *result = outputs[0];
+ // Load model from ArrayBuffer
+const module = et.Module.load(buffer);
+// Create input tensor from data
+const inputTensor = et.Tensor.fromIter(tensorData, shape);
+// Run inference
+const output = module.forward(inputTensor);
+ + Available on Android, iOS, Linux, Windows, macOS, and embedded microcontrollers (e.g., DSPs and Cortex-M processors) +
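+ + Before shipping to a device, the saved model.pte can be sanity-checked from Python. The sketch below is illustrative and assumes the optional executorch.runtime Python bindings; exact module paths and method names may differ between releases. +
+ import torch
+from executorch.runtime import Runtime
+
+# Assumed Python runtime bindings: load the .pte written in the save step above
+runtime = Runtime.get()
+program = runtime.load_program("model.pte")
+method = program.load_method("forward")
+
+# Run one inference with the same input shape used at export time
+outputs = method.execute([torch.randn(1, 3, 224, 224)])
+print(outputs[0])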
++ Need advanced features? ExecuTorch supports memory planning, quantization, profiling, and custom compiler passes. +
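+ + As one example, quantization composes with the same export flow shown above. The sketch below is a minimal illustration using torchao's int8 weight-only recipe together with the XNNPACK lowering from the earlier snippet; the torchao calls (quantize_, int8_weight_only) and backend support for any given recipe are assumptions that may vary by version. +
+ import torch
+from torch.export import export
+from torchao.quantization import quantize_, int8_weight_only
+from executorch.exir import to_edge_transform_and_lower
+from executorch.backends.xnnpack import XnnpackPartitioner
+
+# Same model and example inputs as the export example above
+model = MyModel().eval()
+example_inputs = (torch.randn(1, 3, 224, 224),)
+
+# Quantize weights in place before export (int8 weight-only, one possible recipe)
+quantize_(model, int8_weight_only())
+
+# Export and lower exactly as before; the quantized weights travel with the graph
+exported_program = export(model, example_inputs)
+program = to_edge_transform_and_lower(
+    exported_program,
+    partitioner=[XnnpackPartitioner()]
+).to_executorch()
+
+with open("model_int8.pte", "wb") as f:
+    f.write(program.buffer)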
+ + Try the Full Tutorial → + +Run complex multimodal LLMs with simplified C++ interfaces
+ ++ Choose your platform to see the multimodal API supporting text, images, and audio: +
+ + High-level APIs abstract away model complexity: just load, prompt, and get results +
+ + Explore LLM APIs → + +Hardware acceleration contributed by industry partners via open source
+ +CPU acceleration across ARM and x86 architectures
+Neural Engine and Apple Silicon optimization
+Hexagon NPU support
+Microcontroller NPU for ultra-low power
+Cross-platform graphics acceleration
+x86 CPU and integrated GPU optimization
+Dimensity chipset acceleration
+Integrated NPU optimization
+Automotive and IoT acceleration
+Metal Performance Shaders for GPU acceleration
+Versatile graphics framework support
+Digital signal processor optimization
+Production deployments and strategic partnerships accelerating edge AI
+ +Join thousands of developers using ExecuTorch in production
+ Get Started Today +