Commit 6681225

Redirecting (Beta) Convert MobileNetV2 to NNAPI to Executorch.
1 parent ac7667c commit 6681225

1 file changed: 4 additions & 212 deletions

@@ -1,218 +1,10 @@
 (Beta) Convert MobileNetV2 to NNAPI
 ========================================
 
-Introduction
-------------
+PyTorch Mobile is no longer actively supported. Please check out Executorch.
 
-This tutorial shows how to prepare a computer vision model to use
-`Android's Neural Networks API (NNAPI) <https://developer.android.com/ndk/guides/neuralnetworks>`_.
-NNAPI provides access to powerful and efficient computational cores
-on many modern Android devices.
+Redirecting in 3 seconds...
 
-PyTorch's NNAPI is currently in the "prototype" phase and only supports
-a limited range of operators, but we expect to solidify the integration
-and expand our operator support over time.
+.. raw:: html
 
-
-Environment
------------
-
-Install PyTorch and torchvision.
-
-``pip install torch==1.10.0 torchvision==0.11.1``
-
-
-Model Preparation
------------------
-
-First, we must prepare our model to execute with NNAPI.
-This step runs on your training server or laptop.
-The key conversion function to call is
-``torch.backends._nnapi.prepare.convert_model_to_nnapi``,
-but some extra steps are required to ensure that
-the model is properly structured.
-Most notably, quantizing the model is required
-in order to run the model on certain accelerators.
-
-You can copy/paste this entire Python script and run it,
-or make your own modifications.
-By default, it will save the models to ``~/mobilenetv2-nnapi/``.
-Please create that directory first.
-
-.. code:: python
-
-    #!/usr/bin/env python
-    import sys
-    import os
-    import torch
-    import torch.utils.bundled_inputs
-    import torch.utils.mobile_optimizer
-    import torch.backends._nnapi.prepare
-    import torchvision.models.quantization.mobilenet
-    from pathlib import Path
-
-
-    # This script supports 3 modes of quantization:
-    # - "none": Fully floating-point model.
-    # - "core": Quantize the core of the model, but wrap it in a
-    #   quantizer/dequantizer pair, so the interface uses floating point.
-    # - "full": Quantize the model, and use quantized tensors
-    #   for input and output.
-    #
-    # "none" maintains maximum accuracy.
-    # "core" sacrifices some accuracy for performance,
-    # but maintains the same interface.
-    # "full" maximizes performance (with the same accuracy as "core"),
-    # but requires the application to use quantized tensors.
-    #
-    # There is a fourth option, not supported by this script,
-    # where we include the quant/dequant steps as NNAPI operators.
-    def make_mobilenetv2_nnapi(output_dir_path, quantize_mode):
-        quantize_core, quantize_iface = {
-            "none": (False, False),
-            "core": (True, False),
-            "full": (True, True),
-        }[quantize_mode]
-
-        model = torchvision.models.quantization.mobilenet.mobilenet_v2(pretrained=True, quantize=quantize_core)
-        model.eval()
-
-        # Fuse BatchNorm operators in the floating point model.
-        # (Quantized models already have this done.)
-        # Remove dropout for this inference-only use case.
-        if not quantize_core:
-            model.fuse_model()
-        assert type(model.classifier[0]) == torch.nn.Dropout
-        model.classifier[0] = torch.nn.Identity()
-
-        input_float = torch.zeros(1, 3, 224, 224)
-        input_tensor = input_float
-
-        # If we're doing a quantized model, we need to trace only the quantized core.
-        # So capture the quantizer and dequantizer, use them to prepare the input,
-        # and replace them with identity modules so we can trace without them.
-        if quantize_core:
-            quantizer = model.quant
-            dequantizer = model.dequant
-            model.quant = torch.nn.Identity()
-            model.dequant = torch.nn.Identity()
-            input_tensor = quantizer(input_float)
-
-        # Many NNAPI backends prefer NHWC tensors, so convert our input to channels_last,
-        # and set the "nnapi_nhwc" attribute for the converter.
-        input_tensor = input_tensor.contiguous(memory_format=torch.channels_last)
-        input_tensor.nnapi_nhwc = True
-
-        # Trace the model. NNAPI conversion only works with TorchScript models,
-        # and traced models are more likely to convert successfully than scripted.
-        with torch.no_grad():
-            traced = torch.jit.trace(model, input_tensor)
-        nnapi_model = torch.backends._nnapi.prepare.convert_model_to_nnapi(traced, input_tensor)
-
-        # If we're not using a quantized interface, wrap a quant/dequant around the core.
-        if quantize_core and not quantize_iface:
-            nnapi_model = torch.nn.Sequential(quantizer, nnapi_model, dequantizer)
-            model.quant = quantizer
-            model.dequant = dequantizer
-            # Switch back to float input for benchmarking.
-            input_tensor = input_float.contiguous(memory_format=torch.channels_last)
-
-        # Optimize the CPU model to make CPU-vs-NNAPI benchmarks fair.
-        model = torch.utils.mobile_optimizer.optimize_for_mobile(torch.jit.script(model))
-
-        # Bundle sample inputs with the models for easier benchmarking.
-        # This step is optional.
-        class BundleWrapper(torch.nn.Module):
-            def __init__(self, mod):
-                super().__init__()
-                self.mod = mod
-            def forward(self, arg):
-                return self.mod(arg)
-        nnapi_model = torch.jit.script(BundleWrapper(nnapi_model))
-        torch.utils.bundled_inputs.augment_model_with_bundled_inputs(
-            model, [(torch.utils.bundled_inputs.bundle_large_tensor(input_tensor),)])
-        torch.utils.bundled_inputs.augment_model_with_bundled_inputs(
-            nnapi_model, [(torch.utils.bundled_inputs.bundle_large_tensor(input_tensor),)])
-
-        # Save both models.
-        model._save_for_lite_interpreter(str(output_dir_path / ("mobilenetv2-quant_{}-cpu.pt".format(quantize_mode))))
-        nnapi_model._save_for_lite_interpreter(str(output_dir_path / ("mobilenetv2-quant_{}-nnapi.pt".format(quantize_mode))))
-
-
-    if __name__ == "__main__":
-        for quantize_mode in ["none", "core", "full"]:
-            make_mobilenetv2_nnapi(Path(os.environ["HOME"]) / "mobilenetv2-nnapi", quantize_mode)
-
-
-Running Benchmarks
-------------------
-
-Now that the models are ready, we can benchmark them on our Android devices.
-See `our performance recipe <https://pytorch.org/tutorials/recipes/mobile_perf.html#android-benchmarking-setup>`_ for details.
-The best-performing models are likely to be the "fully-quantized" models:
-``mobilenetv2-quant_full-cpu.pt`` and ``mobilenetv2-quant_full-nnapi.pt``.
-
-Because these models have bundled inputs, we can run the benchmark as follows:
-
-.. code:: shell
-
-    ./speed_benchmark_torch --pthreadpool_size=1 --model=mobilenetv2-quant_full-nnapi.pt --use_bundled_input=0 --warmup=5 --iter=200
-
-Adjusting the thread pool size can reduce latency,
-at the cost of increased CPU usage.
-Omitting that argument will use one thread per big core.
-The CPU models can get improved performance (at the cost of memory usage)
-by passing ``--use_caching_allocator=true``.
-
-
-Running model on host
----------------------
-
-We can now run the models on your Linux machine using the reference implementation
-of NNAPI. You need to build the NNAPI library from the Android source code:
-
-* Make sure you have at least 200GB of disk space
-* Follow `these instructions <https://source.android.com/setup/develop#installing-repo>`_ to install ``repo``
-
-.. code:: shell
-
-    mkdir ~/android-nnapi && cd ~/android-nnapi
-    repo init -u https://android.googlesource.com/platform/manifest -b master
-    repo sync --network-only -j 16
-    repo sync -l
-    . build/envsetup.sh
-    lunch aosp_x86_64-eng
-    mm -j16 out/host/linux-x86/lib64/libneuralnetworks.so
-
-
-With the host build of ``libneuralnetworks.so`` you can run PyTorch NNAPI models on
-your Linux machine:
-
-.. code:: python
-
-    #!/usr/bin/env python
-    import ctypes
-    import torch
-    from pathlib import Path
-
-    ctypes.cdll.LoadLibrary(Path.home() / "android-nnapi/out/host/linux-x86/lib64/libneuralnetworks.so")
-    model = torch.jit.load(Path.home() / "mobilenetv2-nnapi/mobilenetv2-quant_full-nnapi.pt")
-    print(model(*model.get_all_bundled_inputs()[0]))
-
-
-Integration
------------
-
-The converted models are ordinary TorchScript models.
-You can use them in your app just like any other PyTorch model.
-See `https://pytorch.org/mobile/android/ <https://pytorch.org/mobile/android/>`_
-for an introduction to using PyTorch on Android.
-
-
-Learn More
-----------
-
-- Learn more about optimization in our
-  `Mobile Performance Recipe <https://pytorch.org/tutorials/recipes/mobile_perf.html>`_
-- `MobileNetV2 <https://pytorch.org/hub/pytorch_vision_mobilenet_v2/>`_ from torchvision
-- Information about `NNAPI <https://developer.android.com/ndk/guides/neuralnetworks>`_
+   <meta http-equiv="Refresh" content="3; url='https://pytorch.org/executorch/stable/index.html'" />
