Commit 7b367bd
Qualcomm AI Engine Direct - Add QNN support for to_edge_transform_and_lower
summary:
- Support `to_edge_transform_and_lower`
  - Replace capture_program with the new API `to_edge_transform_and_lower_to_qnn`
    - Replace capture_program with to_edge_transform_and_lower_to_qnn for unit_test
    - Replace capture_program with to_edge_transform_and_lower_to_qnn for examples
    - Replace capture_program with to_edge_transform_and_lower_to_qnn for llama
  - Add QnnPassManager to manage all passes in the different stages
    - Deprecate _transform in export_llama_lib in favor of qnn_pass_manager
  - Add transform_for_export_pipeline for LiftConstantScalarOperands to avoid creating temporary tensors in the operation builder.
    However, this pass creates a get_attr node, which must be converted into a lifted tensor constant by lift_constant_tensor_pass.
    If the pass were placed in to_edge_transform_passes, it would run after lift_constant_tensor_pass,
    and the operation builder's get_parameter would fail to retrieve the parameter for the get_attr node.
- Refactor the passes
  - Fix the output dtype mismatch at runtime after build quant io
  - Combine constant_i64_to_i32 and tensor_i64_to_i32 into i64_to_i32
  - Replace convert_to_linear pass with fixed_linear_keep_dim pass
    - Since QNN has no keep dims for the linear op, squeeze and unsqueeze nodes are added around the linear node
  - Add TagQuantIO pass to tag io nodes to avoid inserting q/dq in qnn_preprocess
  - Add prelu, leaky_relu, linear, rms_norm to decompose_table
  - Remove recompose_prelu.py
  - Remove unused variables in insert_requantize.py and replace_index_put_input.py
- Support aten.split_with_sizes_copy.default
- Support leaky_relu with inplace=True

1 parent 7159650 commit 7b367bd
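
The fixed_linear_keep_dim item in the summary says that squeeze and unsqueeze are added around the linear node because QNN has no keep dims for the linear op. A minimal pure-Python sketch of that idea follows. It assumes the wrapping amounts to collapsing the leading dimensions of a rank-3 input into a 2D tensor before the linear op and restoring them afterwards. The helper names `linear_2d` and `linear_keep_dim` are hypothetical and are not taken from the commit; the real pass rewrites graph nodes rather than Python lists.

```python
from typing import List

Matrix = List[List[float]]


def linear_2d(x: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    """Compute y = x @ weight.T + bias for a 2D input (rows, in_features)."""
    return [
        [sum(xi * wi for xi, wi in zip(row, w_row)) + b
         for w_row, b in zip(weight, bias)]
        for row in x
    ]


def linear_keep_dim(x: List[Matrix], weight: Matrix, bias: List[float]) -> List[Matrix]:
    """Apply a 2D-only linear op to a rank-3 input (batch, seq, in_features).

    Hypothetical illustration: the leading dims are collapsed before the
    linear op and restored on its output, so callers still see rank 3.
    """
    batch, seq = len(x), len(x[0])
    # "Squeeze" step: collapse (batch, seq, in) into (batch * seq, in).
    flat = [row for mat in x for row in mat]
    out = linear_2d(flat, weight, bias)
    # "Unsqueeze" step: restore (batch, seq, out) from (batch * seq, out).
    return [out[b * seq:(b + 1) * seq] for b in range(batch)]


x = [[[1.0, 2.0], [3.0, 4.0]]]                 # shape (1, 2, 2)
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # shape (3, 2)
bias = [0.0, 0.0, 0.5]
y = linear_keep_dim(x, weight, bias)           # shape (1, 2, 3)
```

With a batch size of 1, as in the usage lines, the collapse and restore steps reduce to removing and re-adding a single leading dimension, which is the squeeze/unsqueeze case the summary describes.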
File tree
33 files changed, +967 -1179 lines changed
- backends/qualcomm
  - _passes
  - builders
  - partition
  - quantizer
  - tests
  - utils
- examples
  - models/llama
  - qualcomm
    - executor_runner
    - oss_scripts
      - llama
    - scripts
- extension/llm/custom_ops
This file was deleted.
This file was deleted.