diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 32681cdb08f..a96a5fe00d3 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -5,10 +5,10 @@ it easy to contribute to this project.
 ## Dev Install

 Set up your environment by following the instructions at
-https://pytorch.org/executorch/stable/getting-started-setup.html to clone
+https://pytorch.org/executorch/main/getting-started-setup to clone
 the repo and install the necessary requirements.

-Refer to this [document](https://pytorch.org/executorch/main/using-executorch-building-from-source.html) to build ExecuTorch from source.
+Refer to this [document](docs/source/using-executorch-building-from-source.md) to build ExecuTorch from source.

 ### Dev Setup for Android
 For Android, please refer to the [Android documentation](docs/source/using-executorch-android.md).
@@ -40,8 +40,8 @@ executorch
 ├── devtools - Model profiling, debugging, and inspection. Please refer to the tools documentation for more information.
 │ ├── bundled_program - a tool for validating ExecuTorch model. See doc.
 │ ├── etdump - ETDump - a format for saving profiling and debugging data from runtime. See doc.
-│ ├── etrecord - ETRecord - AOT debug artifact for ExecuTorch. See doc.
-│ ├── inspector - Python API to inspect ETDump and ETRecord. See doc.
+│ ├── etrecord - ETRecord - AOT debug artifact for ExecuTorch. See doc.
+│ ├── inspector - Python API to inspect ETDump and ETRecord. See doc.
 │ └── visualization - Visualization tools for representing model structure and performance metrics.
 ├── docs - Static docs tooling and documentation source files.
 ├── examples - Examples of various user flows, such as model export, delegates, and runtime execution.
@@ -57,8 +57,8 @@ executorch
 │ ├── serde - Graph module serialization/deserialization.
 │ ├── verification - IR verification.
 ├── extension - Extensions built on top of the runtime.
-│ ├── android - ExecuTorch wrappers for Android apps. Please refer to the Android documentation and Javadoc for more information.
-│ ├── apple - ExecuTorch wrappers for iOS apps. Please refer to the iOS documentation and how to integrate into Apple platform for more information.
+│ ├── android - ExecuTorch wrappers for Android apps. Please refer to the Android documentation and Javadoc for more information.
+│ ├── apple - ExecuTorch wrappers for iOS apps. Please refer to the iOS documentation on how to integrate into Apple platforms for more information.
 │ ├── aten_util - Converts to and from PyTorch ATen types.
 │ ├── data_loader - 1st party data loader implementations.
 │ ├── evalue_util - Helpers for working with EValue objects.
@@ -68,10 +68,10 @@ executorch
 │ ├── memory_allocator - 1st party memory allocator implementations.
 │ ├── module - A simplified C++ wrapper for the runtime. An abstraction that deserializes and executes an ExecuTorch artifact (.pte file). Refer to the module documentation for more information.
 │ ├── parallel - C++ threadpool integration.
-│ ├── pybindings - Python API for executorch runtime. This is powering up the runtime Python API for ExecuTorch.
+│ ├── pybindings - Python bindings for the ExecuTorch runtime. These power the runtime Python API for ExecuTorch.
 │ ├── pytree - C++ and Python flattening and unflattening lib for pytrees.
 │ ├── runner_util - Helpers for writing C++ PTE-execution tools.
-│ ├── tensor - Tensor maker and TensorPtr, details in this documentation. For how to use TensorPtr and Module, please refer to the "Using ExecuTorch with C++" doc.
+│ ├── tensor - Tensor maker and TensorPtr, details in this documentation. For how to use TensorPtr and Module, please refer to the "Using ExecuTorch with C++" doc.
 │ ├── testing_util - Helpers for writing C++ tests.
 │ ├── threadpool - Threadpool.
 │ └── training - Experimental libraries for on-device training.
@@ -85,7 +85,7 @@ executorch
 ├── runtime - Core C++ runtime. These components are used to execute the ExecuTorch program. Please refer to the runtime documentation for more information.
 │ ├── backend - Backend delegate runtime APIs.
 │ ├── core - Core structures used across all levels of the runtime. Basic components such as Tensor, EValue, Error and Result etc.
-│ ├── executor - Model loading, initialization, and execution. Runtime components that execute the ExecuTorch program, such as Program, Method. Refer to the runtime API documentation for more information.
+│ ├── executor - Model loading, initialization, and execution. Runtime components that execute the ExecuTorch program, such as Program, Method. Refer to the runtime API documentation for more information.
 │ ├── kernel - Kernel registration and management.
 │ └── platform - Layer between architecture specific code and portable C++.
 ├── schema - ExecuTorch PTE file format flatbuffer schemas.
@@ -102,7 +102,7 @@ executorch
 ## Contributing workflow
 We actively welcome your pull requests (PRs).

-If you're completely new to open-source projects, GitHub, or ExecuTorch, please see our [New Contributor Guide](./docs/source/new-contributor-guide.md) for a step-by-step walkthrough on making your first contribution. Otherwise, read on.
+If you're completely new to open-source projects, GitHub, or ExecuTorch, please see our [New Contributor Guide](docs/source/new-contributor-guide.md) for a step-by-step walkthrough on making your first contribution. Otherwise, read on.

 1. [Claim an issue](#claiming-issues), if present, before starting work. If an
    issue doesn't cover the work you plan to do, consider creating one to provide
@@ -245,7 +245,7 @@ modifications to the Google C++ style guide.

 ### C++ Portability Guidelines

-See also [Portable C++ Programming](/docs/source/portable-cpp-programming.md)
+See also [Portable C++ Programming](docs/source/portable-cpp-programming.md)
 for detailed advice.

 #### C++ language version
@@ -417,9 +417,9 @@ for basics.

 ## For Backend Delegate Authors

-- Use [this](/docs/source/backend-delegates-integration.md) guide when
+- Use [this](docs/source/backend-delegates-integration.md) guide when
   integrating your delegate with ExecuTorch.
-- Refer to [this](/docs/source/backend-delegates-dependencies.md) set of
+- Refer to [this](docs/source/backend-delegates-dependencies.md) set of
   guidelines when including a third-party dependency for your delegate.

diff --git a/Package.swift b/Package.swift
index 1322b918c07..b8a8b7d064b 100644
--- a/Package.swift
+++ b/Package.swift
@@ -15,7 +15,7 @@
 //
 // For details on building frameworks locally or using prebuilt binaries,
 // see the documentation:
-// https://pytorch.org/executorch/main/using-executorch-ios.html
+// https://pytorch.org/executorch/main/using-executorch-ios

 import PackageDescription
diff --git a/README-wheel.md b/README-wheel.md
index 9f074ab5ee3..12752bcabfa 100644
--- a/README-wheel.md
+++ b/README-wheel.md
@@ -10,32 +10,21 @@ The `executorch` pip package is in beta.
 The prebuilt `executorch.runtime` module included in this package provides a
 way to run ExecuTorch `.pte` files, with some restrictions:
-* Only [core ATen
-  operators](https://pytorch.org/executorch/stable/ir-ops-set-definition.html)
-  are linked into the prebuilt module
-* Only the [XNNPACK backend
-  delegate](https://pytorch.org/executorch/main/native-delegates-executorch-xnnpack-delegate.html)
-  is linked into the prebuilt module.
-* \[macOS only] [Core ML](https://pytorch.org/executorch/main/build-run-coreml.html)
-  and [MPS](https://pytorch.org/executorch/main/build-run-mps.html) backend
-  delegates are also linked into the prebuilt module.
+* Only [core ATen operators](docs/source/ir-ops-set-definition.md) are linked into the prebuilt module.
+* Only the [XNNPACK backend delegate](docs/source/backends-xnnpack.md) is linked into the prebuilt module.
+* \[macOS only] [Core ML](docs/source/backends-coreml.md) and [MPS](docs/source/backends-mps.md) backend
+  delegates are also linked into the prebuilt module.

-Please visit the [ExecuTorch website](https://pytorch.org/executorch/) for
+Please visit the [ExecuTorch website](https://pytorch.org/executorch) for
 tutorials and documentation. Here are some starting points:
-* [Getting
-  Started](https://pytorch.org/executorch/stable/getting-started-setup.html)
+* [Getting Started](https://pytorch.org/executorch/main/getting-started-setup)
   * Set up the ExecuTorch environment and run PyTorch models locally.
-* [Working with
-  local LLMs](https://pytorch.org/executorch/stable/llm/getting-started.html)
+* [Working with local LLMs](docs/source/llm/getting-started.md)
   * Learn how to use ExecuTorch to export and accelerate a large-language
     model from scratch.
-* [Exporting to
-  ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial.html)
+* [Exporting to ExecuTorch](https://pytorch.org/executorch/main/tutorials/export-to-executorch-tutorial)
   * Learn the fundamentals of exporting a PyTorch `nn.Module` to ExecuTorch,
     and optimizing its performance using quantization and hardware delegation.
-* Running LLaMA on
-  [iOS](https://pytorch.org/executorch/stable/llm/llama-demo-ios.html) and
-  [Android](https://pytorch.org/executorch/stable/llm/llama-demo-android.html)
-  devices.
+* Running LLaMA on [iOS](docs/source/llm/llama-demo-ios.md) and [Android](docs/source/llm/llama-demo-android.md) devices.
   * Build and run LLaMA in a demo mobile app, and learn how to integrate models
     with your own apps.
diff --git a/README.md b/README.md
index 025a8780739..c0d594e7733 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
- Logo
+ Logo

ExecuTorch: A powerful on-device AI Framework

@@ -8,7 +8,7 @@
 Contributors
 Stargazers
 Join our Discord community
-Check out the documentation
+Check out the documentation
@@ -49,9 +49,9 @@ Key value propositions of ExecuTorch are:

 ## Getting Started
 To get started you can:

-- Visit the [Step by Step Tutorial](https://pytorch.org/executorch/main/index.html) to get things running locally and deploy a model to a device
-- Use this [Colab Notebook](https://pytorch.org/executorch/stable/getting-started-setup.html#quick-setup-colab-jupyter-notebook-prototype) to start playing around right away
-- Jump straight into LLM use cases by following specific instructions for [Llama](./examples/models/llama/README.md) and [Llava](./examples/models/llava/README.md)
+- Visit the [Step by Step Tutorial](https://pytorch.org/executorch/main/index) to get things running locally and deploy a model to a device
+- Use this [Colab Notebook](https://pytorch.org/executorch/main/getting-started-setup#quick-setup-colab-jupyter-notebook-prototype) to start playing around right away
+- Jump straight into LLM use cases by following specific instructions for [Llama](examples/models/llama/README.md) and [Llava](examples/models/llava/README.md)

 ## Feedback and Engagement
diff --git a/backends/apple/mps/setup.md b/backends/apple/mps/setup.md
index 5c14ad673df..f0c4c378e3f 100644
--- a/backends/apple/mps/setup.md
+++ b/backends/apple/mps/setup.md
@@ -40,7 +40,7 @@ In order to be able to successfully build and run a model using the MPS backend

 ## Setting up Developer Environment

-***Step 1.*** Please finish tutorial [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup).
+***Step 1.*** Please finish the tutorial [Setting up ExecuTorch](https://pytorch.org/executorch/main/getting-started-setup).

 ***Step 2.*** Install dependencies needed to lower MPS delegate:
diff --git a/backends/mediatek/README.md b/backends/mediatek/README.md
index ec4c392eb46..0a756a7bf1a 100644
--- a/backends/mediatek/README.md
+++ b/backends/mediatek/README.md
@@ -43,7 +43,7 @@ Download [NeuroPilot Express SDK](https://neuropilot.mediatek.com/resources/publ

 Follow the steps below to setup your build environment:

-1. **Setup ExecuTorch Environment**: Refer to the [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup) guide for detailed instructions on setting up the ExecuTorch environment.
+1. **Setup ExecuTorch Environment**: Refer to the [Setting up ExecuTorch](https://pytorch.org/executorch/main/getting-started-setup) guide for detailed instructions on setting up the ExecuTorch environment.

 2. **Setup MediaTek Backend Environment**
 - Install the dependent libs. Ensure that you are inside backends/mediatek/ directory
diff --git a/backends/openvino/README.md b/backends/openvino/README.md
index 95a5f4c364e..b79689c2b56 100644
--- a/backends/openvino/README.md
+++ b/backends/openvino/README.md
@@ -63,7 +63,7 @@ For more information about OpenVINO build, refer to the [OpenVINO Build Instruct

 Follow the steps below to setup your build environment:

-1. **Setup ExecuTorch Environment**: Refer to the [Environment Setup](https://pytorch.org/executorch/stable/getting-started-setup#environment-setup) guide for detailed instructions on setting up the ExecuTorch environment.
+1. **Setup ExecuTorch Environment**: Refer to the [Environment Setup](https://pytorch.org/executorch/main/getting-started-setup#environment-setup) guide for detailed instructions on setting up the ExecuTorch environment.

 2. **Setup OpenVINO Backend Environment**
 - Install the dependent libs.
 Ensure that you are inside `executorch/backends/openvino/` directory
diff --git a/backends/qualcomm/README.md b/backends/qualcomm/README.md
index 85019add313..5cde568957e 100644
--- a/backends/qualcomm/README.md
+++ b/backends/qualcomm/README.md
@@ -8,7 +8,7 @@ This backend is implemented on the top of
 [Qualcomm AI Engine Direct SDK](https://developer.qualcomm.com/software/qualcomm-ai-engine-direct-sdk).
 Please follow [tutorial](../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md) to setup environment, build, and run executorch models by this backend (Qualcomm AI Engine Direct is also referred to as QNN in the source and documentation).

-A website version of the tutorial is [here](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html).
+A website version of the tutorial is [here](https://pytorch.org/executorch/main/build-run-qualcomm-ai-engine-direct-backend).

 ## Delegate Options
diff --git a/docs/source/_templates/layout.html b/docs/source/_templates/layout.html
index 210153e123c..155ae5eaf06 100644
--- a/docs/source/_templates/layout.html
+++ b/docs/source/_templates/layout.html
@@ -131,14 +131,14 @@
     $(".main-menu a:contains('GitHub')").each(overwrite);

     // Overwrite link to Tutorials and Get Started top navigation. If these sections are moved
     // this overrides need to be updated.
-    $(".main-menu a:contains('Tutorials')").attr("href", "https://pytorch.org/executorch/stable/index.html#tutorials-and-examples");
-    $(".main-menu a:contains('Get Started')").attr("href", "https://pytorch.org/executorch/stable/getting-started-setup.html");
+    $(".main-menu a:contains('Tutorials')").attr("href", "https://pytorch.org/executorch/main/index#tutorials-and-examples");
+    $(".main-menu a:contains('Get Started')").attr("href", "https://pytorch.org/executorch/main/getting-started-setup");

     // Mobile
     $(".mobile-menu a:contains('Github')").each(overwrite);

     // Overwrite link to Tutorials and Get Started top navigation. If these sections are moved
     // this overrides need to be updated.
- $(".mobile-menu a:contains('Tutorials')").attr("href", "https://pytorch.org/executorch/stable/index.html#tutorials-and-examples"); - $(".mobile-menu a:contains('Get Started')").attr("href", "https://pytorch.org/executorch/stable/getting-started-setup.html"); + $(".mobile-menu a:contains('Tutorials')").attr("href", "https://pytorch.org/executorch/main/index#tutorials-and-examples"); + $(".mobile-menu a:contains('Get Started')").attr("href", "https://pytorch.org/executorch/main/getting-started-setup"); }); diff --git a/examples/README.md b/examples/README.md index 17999b15423..b6a4e0a0472 100644 --- a/examples/README.md +++ b/examples/README.md @@ -9,7 +9,7 @@ ExecuTorch's extensive support spans from simple modules like "Add" to comprehen ## Directory structure ``` examples -├── llm_manual # A storage place for the files that [LLM Maunal](https://pytorch.org/executorch/main/llm/getting-started.html) needs +├── llm_manual # A storage place for the files that [LLM Maunal](https://pytorch.org/executorch/main/llm/getting-started) needs ├── models # Contains a set of popular and representative PyTorch models ├── portable # Contains end-to-end demos for ExecuTorch in portable mode ├── selective_build # Contains demos of selective build for optimizing the binary size of the ExecuTorch runtime @@ -75,7 +75,7 @@ The [`Cadence/`](./cadence) directory hosts a demo that showcases the process of ## Dependencies -Various models and workflows listed in this directory have dependencies on some other packages. You need to follow the setup guide in [Setting up ExecuTorch from GitHub](https://pytorch.org/executorch/stable/getting-started-setup) to have appropriate packages installed. +Various models and workflows listed in this directory have dependencies on some other packages. You need to follow the setup guide in [Setting up ExecuTorch from GitHub](https://pytorch.org/executorch/main/getting-started-setup) to have appropriate packages installed. # Disclaimer diff --git a/examples/apple/coreml/README.md b/examples/apple/coreml/README.md index 4dba5031358..a4234e72c2d 100644 --- a/examples/apple/coreml/README.md +++ b/examples/apple/coreml/README.md @@ -15,7 +15,7 @@ coreml We will walk through an example model to generate a Core ML delegated binary file from a python `torch.nn.module` then we will use the `coreml_executor_runner` to run the exported binary file. -1. Following the setup guide in [Setting Up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup) +1. Following the setup guide in [Setting Up ExecuTorch](https://pytorch.org/executorch/main/getting-started-setup) you should be able to get the basic development environment for ExecuTorch working. diff --git a/examples/apple/mps/README.md b/examples/apple/mps/README.md index dc01d585f84..2eafb86f1c6 100644 --- a/examples/apple/mps/README.md +++ b/examples/apple/mps/README.md @@ -8,7 +8,7 @@ This README gives some examples on backend-specific model workflow. ## Prerequisite Please finish the following tutorials: -- [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup). +- [Setting up ExecuTorch](https://pytorch.org/executorch/main/getting-started-setup). - [Setting up MPS backend](../../../backends/apple/mps/setup.md). 
 ## Delegation to MPS backend
diff --git a/examples/arm/README.md b/examples/arm/README.md
index 8762e7ccdd1..25f2fd5a316 100644
--- a/examples/arm/README.md
+++ b/examples/arm/README.md
@@ -44,6 +44,6 @@ jupyter notebook ethos_u_minimal_example.ipynb

 ### Online Tutorial

-We also have a [tutorial](https://pytorch.org/executorch/stable/executorch-arm-delegate-tutorial.html) explaining the steps performed in these
+We also have a [tutorial](https://pytorch.org/executorch/main/executorch-arm-delegate-tutorial) explaining the steps performed in these
 scripts, expected results, possible problems and more. It is a step-by-step guide you can follow to better understand this delegate.
diff --git a/examples/demo-apps/apple_ios/LLaMA/docs/delegates/mps_README.md b/examples/demo-apps/apple_ios/LLaMA/docs/delegates/mps_README.md
index 5c1a7437435..b16f27410af 100644
--- a/examples/demo-apps/apple_ios/LLaMA/docs/delegates/mps_README.md
+++ b/examples/demo-apps/apple_ios/LLaMA/docs/delegates/mps_README.md
@@ -9,7 +9,7 @@ More specifically, it covers:
 ## Prerequisites
 * [Xcode 15](https://developer.apple.com/xcode)
 * [iOS 18 SDK](https://developer.apple.com/ios)
-* Set up your ExecuTorch repo and environment if you haven’t done so by following the [Setting up ExecuTorch](https://pytorch.org/executorch/stable/using-executorch-building-from-source) to set up the repo and dev environment:
+* Set up your ExecuTorch repo and environment if you haven’t done so by following the [Setting up ExecuTorch](https://pytorch.org/executorch/main/using-executorch-building-from-source) guide:

 ## Setup ExecuTorch
 In this section, we will need to set up the ExecuTorch repo first with Conda environment management. Make sure you have Conda available in your system (or follow the instructions to install it [here](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)). The commands below are running on Linux (CentOS).
@@ -85,7 +85,7 @@ Link your binary with the ExecuTorch runtime and any backends or kernels used by

 Note: To access logs, link against the Debug build of the ExecuTorch runtime, i.e., the executorch_debug framework. For optimal performance, always link against the Release version of the deliverables (those without the _debug suffix), which have all logging overhead removed.

-For more details integrating and Running ExecuTorch on Apple Platforms, checkout this [link](https://pytorch.org/executorch/main/using-executorch-ios.html).
+For more details on integrating and running ExecuTorch on Apple platforms, check out this [link](https://pytorch.org/executorch/main/using-executorch-ios).

 iOS LLaMA App
 Swift PM
diff --git a/examples/devtools/README.md b/examples/devtools/README.md
index e4fbadfcca0..0b516ad629e 100644
--- a/examples/devtools/README.md
+++ b/examples/devtools/README.md
@@ -17,7 +17,7 @@ examples/devtools

 We will use an example model (in `torch.nn.Module`) and its representative inputs, both from [`models/`](../models) directory, to generate a [BundledProgram(`.bpte`)](../../docs/source/bundled-io.md) file using the [script](scripts/export_bundled_program.py). Then we will use [devtools/example_runner](example_runner/example_runner.cpp) to execute the `.bpte` model on the ExecuTorch runtime and verify the model on BundledProgram API.

-1. Sets up the basic development environment for ExecuTorch by [Setting up ExecuTorch from GitHub](https://pytorch.org/executorch/stable/getting-started-setup).
+1. Set up the basic development environment for ExecuTorch by following [Setting up ExecuTorch from GitHub](https://pytorch.org/executorch/main/getting-started-setup).

 2. Using the [script](scripts/export_bundled_program.py) to generate a BundledProgram binary file by retrieving a `torch.nn.Module` model and its representative inputs from the list of available models in the [`models/`](../models) dir.
diff --git a/examples/models/llama/UTILS.md b/examples/models/llama/UTILS.md
index dd014240ace..2d4d4dfd788 100644
--- a/examples/models/llama/UTILS.md
+++ b/examples/models/llama/UTILS.md
@@ -25,7 +25,7 @@ From `executorch` root:
 ## Smaller model delegated to other backends

 Currently we supported lowering the stories model to other backends, including, CoreML, MPS and QNN. Please refer to the instruction
-for each backend ([CoreML](https://pytorch.org/executorch/main/build-run-coreml.html), [MPS](https://pytorch.org/executorch/main/build-run-mps.html), [QNN](https://pytorch.org/executorch/main/build-run-qualcomm-ai-engine-direct-backend.html)) before trying to lower them. After the backend library is installed, the script to export a lowered model is
+for each backend ([CoreML](https://pytorch.org/executorch/main/backends-coreml), [MPS](https://pytorch.org/executorch/main/backends-mps), [QNN](https://pytorch.org/executorch/main/backends-qualcomm)) before trying to lower them. After the backend library is installed, the script to export a lowered model is

 - Lower to CoreML: `python -m examples.models.llama.export_llama -kv --disable_dynamic_shape --coreml -c stories110M.pt -p params.json `
 - MPS: `python -m examples.models.llama.export_llama -kv --disable_dynamic_shape --mps -c stories110M.pt -p params.json `
diff --git a/examples/models/llama/non_cpu_backends.md b/examples/models/llama/non_cpu_backends.md
index 1ee594ebd83..f414582a3c1 100644
--- a/examples/models/llama/non_cpu_backends.md
+++ b/examples/models/llama/non_cpu_backends.md
@@ -2,7 +2,7 @@
 # Running Llama 3/3.1 8B on non-CPU backends

 ### QNN
-Please follow [the instructions](https://pytorch.org/executorch/stable/llm/build-run-llama3-qualcomm-ai-engine-direct-backend.html) to deploy Llama 3 8B to an Android smartphone with Qualcomm SoCs.
+Please follow [the instructions](https://pytorch.org/executorch/main/llm/build-run-llama3-qualcomm-ai-engine-direct-backend) to deploy Llama 3 8B to an Android smartphone with Qualcomm SoCs.

 ### MPS
 Export:
@@ -10,7 +10,7 @@
 python -m examples.models.llama2.export_llama --checkpoint llama3.pt --params params.json -kv --disable_dynamic_shape --mps --use_sdpa_with_kv_cache -d fp32 -qmode 8da4w -G 32 --embedding-quantize 4,32
 ```

-After exporting the MPS model .pte file, the [iOS LLAMA](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) app can support running the model. ` --embedding-quantize 4,32` is an optional args for quantizing embedding to reduce the model size.
+After exporting the MPS model .pte file, the [iOS LLAMA](https://pytorch.org/executorch/main/llm/llama-demo-ios) app can run the model. `--embedding-quantize 4,32` is an optional arg for quantizing the embedding to reduce the model size.

 ### CoreML
 Export:
diff --git a/examples/models/phi-3-mini-lora/README.md b/examples/models/phi-3-mini-lora/README.md
index 2b7cc0ba401..5bc48bd48f4 100644
--- a/examples/models/phi-3-mini-lora/README.md
+++ b/examples/models/phi-3-mini-lora/README.md
@@ -16,7 +16,7 @@ To see how you can use the model exported for training in a fully involved finet
 python export_model.py
 ```

-2. Run the inference model using an example runtime. For more detailed steps on this, check out [Build & Run](https://pytorch.org/executorch/stable/getting-started-setup.html#build-run).
+2. Run the inference model using an example runtime. For more detailed steps on this, check out [Build & Run](https://pytorch.org/executorch/main/getting-started-setup#build-run).
 ```
 # Clean and configure the CMake build system. Compiled programs will appear in the executorch/cmake-out directory we create here.
 ./install_executorch.sh --clean
diff --git a/examples/portable/README.md b/examples/portable/README.md
index a6658197da3..6bfc9aa7281 100644
--- a/examples/portable/README.md
+++ b/examples/portable/README.md
@@ -20,7 +20,7 @@ We will walk through an example model to generate a `.pte` file in [portable mod
 from the [`models/`](../models) directory using scripts in the `portable/scripts` directory. Then we will run on the `.pte` model on the ExecuTorch runtime. For that we will use `executor_runner`.

-1. Following the setup guide in [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup)
+1. Following the setup guide in [Setting up ExecuTorch](https://pytorch.org/executorch/main/getting-started-setup)
 you should be able to get the basic development environment for ExecuTorch working.

 2. Using the script `portable/scripts/export.py` generate a model binary file by selecting a
diff --git a/examples/portable/custom_ops/README.md b/examples/portable/custom_ops/README.md
index db517e84a0c..bf17d6a6753 100644
--- a/examples/portable/custom_ops/README.md
+++ b/examples/portable/custom_ops/README.md
@@ -3,7 +3,7 @@ This folder contains examples to register custom operators into PyTorch as well

 ## How to run

-Prerequisite: finish the [setting up wiki](https://pytorch.org/executorch/stable/getting-started-setup).
+Prerequisite: finish the [setting up wiki](https://pytorch.org/executorch/main/getting-started-setup).

 Run:
diff --git a/examples/qualcomm/README.md b/examples/qualcomm/README.md
index bdac58d2bfc..1f7e2d1e476 100644
--- a/examples/qualcomm/README.md
+++ b/examples/qualcomm/README.md
@@ -22,7 +22,7 @@ Here are some general information and limitations.

 ## Prerequisite

-Please finish tutorial [Setting up executorch](https://pytorch.org/executorch/stable/getting-started-setup).
+Please finish the tutorial [Setting up executorch](https://pytorch.org/executorch/main/getting-started-setup).

 Please finish [setup QNN backend](../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md).
diff --git a/examples/qualcomm/oss_scripts/llama/README.md b/examples/qualcomm/oss_scripts/llama/README.md
index 9b6ec9574eb..27abd5689a0 100644
--- a/examples/qualcomm/oss_scripts/llama/README.md
+++ b/examples/qualcomm/oss_scripts/llama/README.md
@@ -28,7 +28,7 @@ Hybrid Mode: Hybrid mode leverages the strengths of both AR-N model and KV cache

 ### Step 1: Setup
 1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch.
-2. Follow the [tutorial](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html) to build Qualcomm AI Engine Direct Backend.
+2. Follow the [tutorial](https://pytorch.org/executorch/main/build-run-qualcomm-ai-engine-direct-backend) to build Qualcomm AI Engine Direct Backend.

 ### Step 2: Prepare Model
diff --git a/examples/qualcomm/qaihub_scripts/llama/README.md b/examples/qualcomm/qaihub_scripts/llama/README.md
index 0fec6ea867f..4d010b5d474 100644
--- a/examples/qualcomm/qaihub_scripts/llama/README.md
+++ b/examples/qualcomm/qaihub_scripts/llama/README.md
@@ -12,7 +12,7 @@ Note that the pre-compiled context binaries could not be futher fine-tuned for o
 ### Instructions
 #### Step 1: Setup
 1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch.
-2. Follow the [tutorial](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html) to build Qualcomm AI Engine Direct Backend.
+2. Follow the [tutorial](https://pytorch.org/executorch/main/build-run-qualcomm-ai-engine-direct-backend) to build Qualcomm AI Engine Direct Backend.

 #### Step2: Prepare Model
 1. Create account for https://aihub.qualcomm.com/
@@ -40,7 +40,7 @@ Note that the pre-compiled context binaries could not be futher fine-tuned for o
 ### Instructions
 #### Step 1: Setup
 1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch.
-2. Follow the [tutorial](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html) to build Qualcomm AI Engine Direct Backend.
+2. Follow the [tutorial](https://pytorch.org/executorch/main/build-run-qualcomm-ai-engine-direct-backend) to build Qualcomm AI Engine Direct Backend.

 #### Step2: Prepare Model
 1. Create account for https://aihub.qualcomm.com/
diff --git a/examples/qualcomm/qaihub_scripts/stable_diffusion/README.md b/examples/qualcomm/qaihub_scripts/stable_diffusion/README.md
index b008d3135d4..998c97d78e3 100644
--- a/examples/qualcomm/qaihub_scripts/stable_diffusion/README.md
+++ b/examples/qualcomm/qaihub_scripts/stable_diffusion/README.md
@@ -11,7 +11,7 @@ The model architecture, scheduler, and time embedding are from the [stabilityai/
 ### Instructions
 #### Step 1: Setup
 1. Follow the [tutorial](https://pytorch.org/executorch/main/getting-started-setup) to set up ExecuTorch.
-2. Follow the [tutorial](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html) to build Qualcomm AI Engine Direct Backend.
+2. Follow the [tutorial](https://pytorch.org/executorch/main/build-run-qualcomm-ai-engine-direct-backend) to build Qualcomm AI Engine Direct Backend.

 #### Step2: Prepare Model
 1. Download the context binaries for TextEncoder, UNet, and VAEDecoder under https://huggingface.co/qualcomm/Stable-Diffusion-v2.1/tree/main
diff --git a/examples/selective_build/README.md b/examples/selective_build/README.md
index 6c655e18a3d..97706d70c48 100644
--- a/examples/selective_build/README.md
+++ b/examples/selective_build/README.md
@@ -3,7 +3,7 @@ To optimize binary size of ExecuTorch runtime, selective build can be used. This

 ## How to run

-Prerequisite: finish the [setting up wiki](https://pytorch.org/executorch/stable/getting-started-setup).
+Prerequisite: finish the [setting up wiki](https://pytorch.org/executorch/main/getting-started-setup).

 Run:
diff --git a/examples/xnnpack/README.md b/examples/xnnpack/README.md
index 179e47004a1..56deff928af 100644
--- a/examples/xnnpack/README.md
+++ b/examples/xnnpack/README.md
@@ -1,8 +1,8 @@
 # XNNPACK Backend
 [XNNPACK](https://github.com/google/XNNPACK) is a library of optimized neural network operators for ARM and x86 CPU platforms. Our delegate lowers models to run using these highly optimized CPU operators. You can try out lowering and running some example models in the demo. Please refer to the following docs for information on the XNNPACK Delegate
-- [XNNPACK Backend Delegate Overview](https://pytorch.org/executorch/stable/native-delegates-executorch-xnnpack-delegate.html)
-- [XNNPACK Delegate Export Tutorial](https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html)
+- [XNNPACK Backend Delegate Overview](https://pytorch.org/executorch/main/native-delegates-executorch-xnnpack-delegate)
+- [XNNPACK Delegate Export Tutorial](https://pytorch.org/executorch/main/tutorial-xnnpack-delegate-lowering)

 ## Directory structure
@@ -60,7 +60,7 @@ Now finally you should be able to run this model with the following command
 ```

 ## Quantization
-First, learn more about the generic PyTorch 2 Export Quantization workflow in the [Quantization Flow Docs](https://pytorch.org/executorch/stable/quantization-overview.html), if you are not familiar already.
+First, learn more about the generic PyTorch 2 Export Quantization workflow in the [Quantization Flow Docs](https://pytorch.org/executorch/main/quantization-overview), if you are not familiar already.

 Here we will discuss quantizing a model suitable for XNNPACK delegation using XNNPACKQuantizer.
diff --git a/exir/program/_program.py b/exir/program/_program.py
index ef857ffd011..da5ca06c927 100644
--- a/exir/program/_program.py
+++ b/exir/program/_program.py
@@ -1325,7 +1325,7 @@ def to_edge(
 class EdgeProgramManager:
     """
     Package of one or more `ExportedPrograms` in Edge dialect. Designed to simplify
-    lowering to ExecuTorch. See: https://pytorch.org/executorch/stable/ir-exir.html
+    lowering to ExecuTorch. See: https://pytorch.org/executorch/main/ir-exir

     Allows easy applications of transforms across a collection of exported programs
     including the delegation of subgraphs.
@@ -1565,7 +1565,7 @@ def to_executorch(
 class ExecutorchProgramManager:
     """
     Package of one or more `ExportedPrograms` in Execution dialect. Designed to simplify
-    lowering to ExecuTorch. See: https://pytorch.org/executorch/stable/ir-exir.html
+    lowering to ExecuTorch. See: https://pytorch.org/executorch/main/ir-exir

     When the ExecutorchProgramManager is constructed the ExportedPrograms in execution dialect
     are used to form the executorch binary (in a process called emission) and then serialized
diff --git a/extension/android/README.md b/extension/android/README.md
index 8972e615173..1274a18e447 100644
--- a/extension/android/README.md
+++ b/extension/android/README.md
@@ -28,7 +28,7 @@ export ANDROID_NDK=/path/to/ndk
 sh scripts/build_android_library.sh
 ```

-Please see [Android building from source](https://pytorch.org/executorch/main/using-executorch-android.html#building-from-source) for details
+Please see [Android building from source](https://pytorch.org/executorch/main/using-executorch-android#building-from-source) for details.

 ## Test
diff --git a/extension/pybindings/pybindings.pyi b/extension/pybindings/pybindings.pyi
index 64ea14f08ff..7aede1c29a9 100644
--- a/extension/pybindings/pybindings.pyi
+++ b/extension/pybindings/pybindings.pyi
@@ -161,7 +161,7 @@ def _load_for_executorch(
     Args:
         path: File path to the ExecuTorch program as a string.
         enable_etdump: If true, enables an ETDump which can store profiling information.
-            See documentation at https://pytorch.org/executorch/stable/etdump.html
+            See documentation at https://pytorch.org/executorch/main/etdump
             for how to use it.
         debug_buffer_size: If non-zero, enables a debug buffer which can store
             intermediate results of each instruction in the ExecuTorch program.
@@ -192,7 +192,7 @@ def _load_for_executorch_from_bundled_program(
 ) -> ExecuTorchModule:
     """Same as _load_for_executorch, but takes a bundled program instead of a file path.

-    See https://pytorch.org/executorch/stable/bundled-io.html for documentation.
+    See https://pytorch.org/executorch/main/bundled-io for documentation.

     .. warning::
diff --git a/runtime/COMPATIBILITY.md b/runtime/COMPATIBILITY.md
index 7d9fd47c590..583dab172cc 100644
--- a/runtime/COMPATIBILITY.md
+++ b/runtime/COMPATIBILITY.md
@@ -1,7 +1,7 @@
 # Runtime Compatibility Policy

 This document describes the compatibility guarantees between the [PTE file
-format](https://pytorch.org/executorch/stable/pte-file-format.html) and the
+format](https://pytorch.org/executorch/main/pte-file-format) and the
 ExecuTorch runtime.

 > [!IMPORTANT]