This repository was archived by the owner on Aug 21, 2025. It is now read-only.

Commit 0f5cd36

made some minor updates to the notebook
1 parent 4b9e9b2 commit 0f5cd36

1 file changed: +13 -1 lines changed

notebooks/aot_autograd_optimizations.ipynb

Lines changed: 13 additions & 1 deletion
@@ -374,6 +374,18 @@
  "source": [
  "We observe that both forward and backward latency improve over the default partitioner (and a lot better than eager). Fewer outputs in the forward pass and fewer inputs in the backward pass, along with fusion, allows better memory bandwidth utilization leading to further speedups."
  ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Actual Usage\n",
+ "For actual usage on CUDA devices, we've wrapped AOTAutograd in a convenient wrapper - `memory_efficient_fusion`. Use this for fusion on GPU!\n",
+ "\n",
+ "```\n",
+ "from functorch.compile import memory_efficient_fusion\n",
+ "```\n"
+ ]
  }
  ],
  "metadata": {
@@ -395,7 +407,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.8.12"
+ "version": "3.8.13"
  },
  "orig_nbformat": 4
  },
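
The new "Actual Usage" cell only shows the import. As a rough illustration of how the wrapper might be applied, here is a minimal sketch that is not part of this commit. The names `fn` and `maybe_fuse` are hypothetical, and the fallback to the unwrapped function when `functorch` or CUDA is unavailable is an assumption added so the snippet runs on any machine.

```python
def fn(x):
    # Hypothetical example function: a chain of pointwise ops on a tensor,
    # the kind of workload that fusion is meant to speed up.
    return x.sin().cos()


def maybe_fuse(f):
    """Return f wrapped in memory_efficient_fusion when possible, else f unchanged."""
    try:
        import torch
        from functorch.compile import memory_efficient_fusion
    except ImportError:
        # Assumption: fall back to the plain function if torch/functorch is missing.
        return f
    if not torch.cuda.is_available():
        # The notebook recommends the wrapper for CUDA devices, so skip it on CPU.
        return f
    return memory_efficient_fusion(f)


fused_fn = maybe_fuse(fn)
```

On a CUDA machine, `fused_fn` would then be called like the original function, for example on a tensor created with `device="cuda"`; compilation is expected to happen lazily on the first call rather than at wrap time.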
