
Commit 7c9edb7

Addressed review comments and added a section on why AOTI Python
1 parent 3fa9b20 commit 7c9edb7

File tree

1 file changed: +13 -12 lines changed

intermediate_source/torch_export_aoti_python.py

Lines changed: 13 additions & 12 deletions
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-
 
 """
-(Beta) ``torch.export`` AOT Inductor Tutorial for Python runtime
+(Beta) ``torch.export`` AOTInductor Tutorial for Python runtime
 ===================================================
 **Author:** Ankith Gunapal
 """
@@ -31,7 +31,7 @@
 # Prerequisites
 # -------------
 # * PyTorch 2.4 or later
-# * Basic understanding of ``torch._export`` and AOT Inductor
+# * Basic understanding of ``torch._export`` and AOTInductor
 # * Complete the `AOTInductor: Ahead-Of-Time Compilation for Torch.Export-ed Models <https://pytorch.org/docs/stable/torch.compiler_aot_inductor.html#>`_ tutorial
 
 ######################################################################
@@ -40,6 +40,7 @@
 # * How to use AOTInductor for python runtime.
 # * How to use :func:`torch._export.aot_compile` to generate a shared library
 # * How to run a shared library in Python runtime using :func:`torch._export.aot_load`.
+# * When do you use AOTInductor for python runtime
 
 ######################################################################
 # Model Compilation
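For reference, the two APIs this commit keeps pointing at fit together roughly as follows. This is a minimal sketch, assuming PyTorch 2.4 and a CUDA device as in the tutorial; the toy linear model and output path are hypothetical stand-ins for the tutorial's actual model, while the ``torch._export.aot_compile`` / ``torch._export.aot_load`` calls and the ``aot_inductor.output_path`` option are the ones this tutorial documents.

import os
import torch

device = "cuda"  # the tutorial targets GPU; adjust for CPU-only runs

# Toy stand-in for the tutorial's model (hypothetical).
model = torch.nn.Linear(10, 10).eval().to(device)
example_inputs = torch.randn(8, 10, device=device)

# Offline step: ahead-of-time compile the model into a shared library (.so).
with torch.inference_mode():
    so_path = torch._export.aot_compile(
        model,
        (example_inputs,),  # positional args are passed as a tuple
        # Write the shared library to an explicit path.
        options={"aot_inductor.output_path": os.path.join(os.getcwd(), "model.so")},
    )

# Deployment step: load the shared library and run inference, with no JIT warmup.
loaded_model = torch._export.aot_load(so_path, device)
output = loaded_model(example_inputs)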
@@ -124,24 +125,24 @@
 output = model(example_inputs)
 
 ######################################################################
-# When to use AOT Inductor Python Runtime
+# When to use AOTInductor for Python Runtime
 # ---------------------------------------
 #
-# One of the requirements for using AOT Inductor is that the model shouldn't have any graph breaks.
-# Once this requirement is met, the primary use case for using AOT Inductor Python Runtime is for
+# One of the requirements for using AOTInductor is that the model shouldn't have any graph breaks.
+# Once this requirement is met, the primary use case for using AOTInductor Python Runtime is for
 # model deployment using Python.
-# There are mainly two reasons why you would use AOT Inductor Python Runtime:
+# There are mainly two reasons why you would use AOTInductor Python Runtime:
 #
 # - ``torch._export.aot_compile`` generates a shared library. This is useful for model
 #   versioning for deployments and tracking model performance over time.
 # - With :func:`torch.compile` being a JIT compiler, there is a warmup
 #   cost associated with the first compilation. Your deployment needs to account for the
-#   compilation time taken for the first inference. With AOT Inductor, the compilation is
+#   compilation time taken for the first inference. With AOTInductor, the compilation is
 #   done offline using ``torch._export.aot_compile``. The deployment would only load the
 #   shared library using ``torch._export.aot_load`` and run inference.
 #
 #
-# The section below shows the speedup achieved with AOT Inductor for first inference
+# The section below shows the speedup achieved with AOTInductor for first inference
 #
 # We define a utility function ``timed`` to measure the time taken for inference
 #
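The body of ``timed`` falls outside the changed hunks; here is a plausible sketch, assuming the CUDA-event timing pattern PyTorch tutorials commonly use (the file's actual implementation may differ in details or units).

import torch

def timed(fn):
    # Time a callable on the GPU using CUDA events, so the measurement
    # is not skewed by unrelated host-side work.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    result = fn()
    end.record()
    torch.cuda.synchronize()  # make sure both recorded events have completed
    return result, start.elapsed_time(end)  # elapsed_time is in milliseconds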
@@ -175,7 +176,7 @@ def timed(fn):
 
 
 ######################################################################
-# Lets measure the time for first inference using AOT Inductor
+# Lets measure the time for first inference using AOTInductor
 
 torch._dynamo.reset()
 
@@ -184,7 +185,7 @@ def timed(fn):
 
 with torch.inference_mode():
     _, time_taken = timed(lambda: model(example_inputs))
-    print(f"Time taken for first inference for AOT Inductor is {time_taken:.2f} ms")
+    print(f"Time taken for first inference for AOTInductor is {time_taken:.2f} ms")
 
 
 ######################################################################
@@ -203,7 +204,7 @@ def timed(fn):
     print(f"Time taken for first inference for torch.compile is {time_taken:.2f} ms")
 
 ######################################################################
-# We see that there is a drastic speedup in first inference time using AOT Inductor compared
+# We see that there is a drastic speedup in first inference time using AOTInductor compared
 # to ``torch.compile``
 
 ######################################################################
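The ``torch.compile`` side of this comparison also sits outside the changed lines; a minimal sketch of what that measurement plausibly looks like, reusing the hypothetical ``timed`` helper and model sketched above (any compile options the file passes are not visible in this diff and are omitted here):

torch._dynamo.reset()  # clear cached compilation state before timing

# JIT path: the first call triggers compilation, so the measured time
# includes the warmup cost discussed in the section above.
compiled_model = torch.compile(model)

with torch.inference_mode():
    _, time_taken = timed(lambda: compiled_model(example_inputs))
    print(f"Time taken for first inference for torch.compile is {time_taken:.2f} ms")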
@@ -215,4 +216,4 @@ def timed(fn):
 # and ``torch._export.aot_load`` APIs. This process demonstrates the practical application of
 # generating a shared library and running it within a Python environment, even with dynamic shape
 # considerations and device-specific optimizations. We also looked at the advantage of using
-# AOT Inductor in model deployments, with regards to speed up in first inference time.
+# AOTInductor in model deployments, with regards to speed up in first inference time.
