# -*- coding: utf-8 -*-

"""
(Beta) ``torch.export`` AOTInductor Tutorial for Python runtime
===============================================================
**Author:** Ankith Gunapal
"""
# Prerequisites
# -------------
# * PyTorch 2.4 or later
# * Basic understanding of ``torch._export`` and AOTInductor
# * Complete the `AOTInductor: Ahead-Of-Time Compilation for Torch.Export-ed Models <https://pytorch.org/docs/stable/torch.compiler_aot_inductor.html#>`_ tutorial

######################################################################
# * How to use AOTInductor for Python runtime.
# * How to use :func:`torch._export.aot_compile` to generate a shared library.
# * How to run a shared library in Python runtime using :func:`torch._export.aot_load` (a minimal sketch of both APIs follows this list).
# * When to use AOTInductor for Python runtime.

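######################################################################
# Below is a minimal, self-contained sketch of the two APIs listed above,
# using a toy ``torch.nn.Linear`` model. The model, tensor shapes, and CPU
# device here are purely illustrative; the full example is developed in the
# sections that follow.

import torch


class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 4)

    def forward(self, x):
        return self.fc(x)


toy_model = ToyModel().eval()
toy_inputs = (torch.randn(2, 8),)

# Ahead-of-time compilation: returns the path of the generated shared library.
toy_so_path = torch._export.aot_compile(toy_model, toy_inputs)

# Load the shared library back into the Python runtime and run inference.
toy_loaded = torch._export.aot_load(toy_so_path, device="cpu")
with torch.inference_mode():
    print(toy_loaded(toy_inputs[0]).shape)
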
######################################################################
# Model Compilation
# -----------------

output = model(example_inputs)

######################################################################
# When to use AOTInductor for Python Runtime
# -------------------------------------------
#
# One of the requirements for using AOTInductor is that the model shouldn't have any graph breaks.
# Once this requirement is met, the primary use case for using AOTInductor Python Runtime is for
# model deployment using Python.
# There are two main reasons why you would use AOTInductor Python Runtime:
#
# - ``torch._export.aot_compile`` generates a shared library. This is useful for model
#   versioning for deployments and tracking model performance over time.
# - With :func:`torch.compile` being a JIT compiler, there is a warmup
#   cost associated with the first compilation. Your deployment needs to account for the
#   compilation time taken for the first inference. With AOTInductor, the compilation is
#   done offline using ``torch._export.aot_compile``. The deployment would only load the
#   shared library using ``torch._export.aot_load`` and run inference.
#
#
# The section below shows the speedup achieved with AOTInductor for the first inference.
#
# We define a utility function ``timed`` to measure the time taken for inference.
#
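# A minimal sketch of such a helper is shown below; it returns the result of
# the call together with the elapsed time in milliseconds, using CUDA events
# for accurate timing on GPU and ``time.perf_counter`` on CPU. This is one
# possible implementation, not necessarily the exact helper used elsewhere.

import time

import torch


def timed(fn):
    # Run ``fn`` once and return ``(result, elapsed_time_in_ms)``.
    if torch.cuda.is_available():
        # CUDA kernels launch asynchronously, so use CUDA events plus an
        # explicit synchronize for an accurate GPU-side measurement.
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        result = fn()
        end.record()
        torch.cuda.synchronize()
        return result, start.elapsed_time(end)
    # CPU fallback: wall-clock timing with a monotonic, high-resolution clock.
    start = time.perf_counter()
    result = fn()
    return result, (time.perf_counter() - start) * 1000
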
######################################################################
# Let's measure the time for first inference using AOTInductor.

torch._dynamo.reset()

with torch.inference_mode():
    _, time_taken = timed(lambda: model(example_inputs))
    print(f"Time taken for first inference for AOTInductor is {time_taken:.2f} ms")

######################################################################
print(f"Time taken for first inference for torch.compile is {time_taken:.2f} ms")

######################################################################
# We see that there is a drastic speedup in first inference time using AOTInductor compared
# to ``torch.compile``.

######################################################################
# and ``torch._export.aot_load`` APIs. This process demonstrates the practical application of
# generating a shared library and running it within a Python environment, even with dynamic shape
# considerations and device-specific optimizations. We also looked at the advantage of using
# AOTInductor in model deployments, with regard to the speedup in first inference time.