
Commit 7127ea9

Olivia-liu authored and facebook-github-bot committed
New doc for the memory planning inspection util function (#5430)
Summary:
Pull Request resolved: #5430

## Why
The memory planning inspection tool `generate_memory_trace()` has proved useful to users, and we will be presenting it in the executorch debugging and profiling lightning talk at PTC, so it's important to have documentation for it that users can refer to.

## What
A new page in the official docs that explains how to use the memory planning inspection tool `generate_memory_trace()`.

Reviewed By: dbort

Differential Revision: D62796663

fbshipit-source-id: 6c868590d805e9f21bccc7023131eb814b96e5b1
1 parent d4afbe7 commit 7127ea9

File tree

5 files changed (+36, -0 lines)

612 KB binary image (memory_planning_inspection.png), not rendered

docs/source/compiler-memory-planning.md

Lines changed: 4 additions & 0 deletions
@@ -83,3 +83,7 @@ program = edge_program.to_executorch(
```

Users attempting to write a custom memory planning algorithm should start by looking at [the greedy algorithm's implementation](https://github.com/pytorch/executorch/blob/d62c41ca86435e5316e7ed292b6d68aff27a2fb7/exir/memory_planning.py#L459C1-L459C12).
+
+## Debugging Tool
+
+Please refer to [Memory Planning Inspection](./memory-planning-inspection.md) for a tool to inspect the result of memory planning.

docs/source/devtools-overview.md

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ The ExecuTorch Developer Tools support the following features:
- **Delegate Integration** - Surfacing performance details from delegate backends
- Link back delegate operator execution to the nodes they represent in the edge dialect graph (and subsequently linking back to source code and module hierarchy)
- **Debugging** - Intermediate outputs and output quality analysis
+- **Memory Allocation Insights** - Visualize how memory is planned, where all the live tensors are at any point in time
- **Visualization** - Coming soon

## Fundamental components of the Developer Tools

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -207,6 +207,7 @@ Topics in this section will help you get started with ExecuTorch.
sdk-profiling
sdk-debugging
sdk-inspector
+memory-planning-inspection
sdk-delegate-integration
devtools-tutorial

docs/source/memory-planning-inspection.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+# Memory Planning Inspection in ExecuTorch
+
+After the [Memory Planning](https://pytorch.org/executorch/main/concepts.html#memory-planning) pass of ExecuTorch, memory allocation information is stored on the nodes of the [`ExportedProgram`](https://pytorch.org/executorch/main/concepts.html#exportedprogram). Here, we present a tool designed to inspect memory allocation and visualize all active tensor objects.
+
+## Usage
+
+Users should add this code after they call [to_executorch()](https://pytorch.org/executorch/main/export-to-executorch-api-reference.html#executorch.exir.EdgeProgramManager.to_executorch); it will write the memory allocation information stored on the nodes to the file path "memory_profile.json". The file is compatible with the Chrome trace viewer; see below for more information about interpreting the results.
+```python
+from executorch.util.activation_memory_profiler import generate_memory_trace
+generate_memory_trace(
+    executorch_program_manager=prog,
+    chrome_trace_filename="memory_profile.json",
+    enable_memory_offsets=True,
+)
+```
+* `prog` is an instance of [`ExecutorchProgramManager`](https://pytorch.org/executorch/main/export-to-executorch-api-reference.html#executorch.exir.ExecutorchProgramManager), returned by [to_executorch()](https://pytorch.org/executorch/main/export-to-executorch-api-reference.html#executorch.exir.EdgeProgramManager.to_executorch).
+* Set `enable_memory_offsets` to `True` to show the location of each tensor in the memory space.
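For orientation, here is a minimal end-to-end sketch of where this call can sit in an export flow. The toy module, example inputs, and the `torch.export`/`to_edge` steps are illustrative assumptions, not taken from the patch above:

```python
import torch
from executorch.exir import to_edge
from executorch.util.activation_memory_profiler import generate_memory_trace

# Toy model used purely for illustration.
class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y

example_inputs = (torch.randn(4), torch.randn(4))

# Export -> edge dialect -> ExecuTorch program.
exported = torch.export.export(Add(), example_inputs)
edge_program = to_edge(exported)
prog = edge_program.to_executorch()  # ExecutorchProgramManager

# Memory planning has run by this point, so the trace can be generated.
generate_memory_trace(
    executorch_program_manager=prog,
    chrome_trace_filename="memory_profile.json",
    enable_memory_offsets=True,
)
```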
+
+## Chrome Trace
+Open a Chrome browser tab and navigate to <chrome://tracing/>. Upload the generated `.json` to view.
+Example of a [MobileNet V2](https://pytorch.org/vision/main/models/mobilenetv2.html) model:
+
+![Memory planning Chrome trace visualization](/_static/img/memory_planning_inspection.png)
+
+Note that, since we are repurposing the Chrome trace tool, the axes in this context have different meanings than in other Chrome trace graphs you may have encountered:
+* The horizontal axis, despite being labeled in seconds (s), actually represents megabytes (MB).
+* The vertical axis has a two-level hierarchy. The first level, "pid", represents a memory space. For CPU, everything is allocated in a single "space"; other backends may have multiple. At the second level, each row represents one time step. Since nodes are executed sequentially, each node is one time step, so there are as many rows as there are nodes.
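To skim the trace without opening Chrome, a small script like the sketch below can dump the raw events. `summarize_memory_trace` is a hypothetical helper, and it assumes the file follows the standard Chrome trace event format (either a bare list of events or an object with a `traceEvents` list):

```python
import json

def summarize_memory_trace(path: str = "memory_profile.json") -> None:
    """Print one line per trace event in the generated file."""
    with open(path) as f:
        data = json.load(f)

    # Chrome traces are either a list of events or a dict with "traceEvents".
    events = data["traceEvents"] if isinstance(data, dict) else data

    for event in events:
        # Per the notes above, "pid" identifies a memory space and each row is
        # one time step (one executed node). "ts" and "dur" are nominally time
        # fields in the Chrome format, but here the horizontal axis encodes
        # memory, so read them as an offset and an extent rather than as times.
        print(
            f"space={event.get('pid')} row={event.get('tid')} "
            f"name={event.get('name')} start={event.get('ts')} extent={event.get('dur')}"
        )

summarize_memory_trace()
```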
+
+## Further Reading
+* [Memory Planning](https://pytorch.org/executorch/main/compiler-memory-planning.html)

0 commit comments
