This repository was archived by the owner on Dec 15, 2025. It is now read-only.

PLDI work #24

Draft
gussmith23 wants to merge 2228 commits into main from 3la-pldi-push-main

Conversation

@gussmith23
Collaborator

DO NOT MERGE

PR for tracking all of the changes we made for PLDI. Useful for seeing the diff.

@gussmith23 gussmith23 force-pushed the 3la-pldi-push-main branch 2 times, most recently from f4f8205 to 4b1858b on January 1, 2022 21:25
leeexyz and others added 27 commits January 13, 2022 21:19
… ceil_mode and count_include_pad are True. (#9835)

* Added the offset[i] for getting the correct boundary
* Added corresponding test case
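The boundary offset from the commit above can be illustrated with a minimal 1-D average pool in NumPy. This is a toy sketch, not TVM's implementation: with `ceil_mode=True` the last window can extend past the padded input, and with `count_include_pad=True` the divisor must be clipped to the padded boundary (the analogue of the `offset[i]` term):

```python
import math
import numpy as np

def avg_pool1d(x, kernel, stride, pad, ceil_mode=False, count_include_pad=False):
    """Toy 1-D average pool showing the boundary offset needed when
    ceil_mode and count_include_pad are both True (illustrative only)."""
    n = len(x) + 2 * pad
    if ceil_mode:
        out_len = math.ceil((n - kernel) / stride) + 1
        # drop a window that would start entirely inside the right padding
        if (out_len - 1) * stride >= len(x) + pad:
            out_len -= 1
    else:
        out_len = (n - kernel) // stride + 1
    out = []
    for i in range(out_len):
        start = i * stride - pad
        end = start + kernel
        lo, hi = max(start, 0), min(end, len(x))  # clip to real data
        total = x[lo:hi].sum()
        if count_include_pad:
            # the offset: count padding, but never positions past the
            # padded boundary that a ceil_mode window can reach
            divisor = min(end, len(x) + pad) - start
        else:
            divisor = hi - lo
        out.append(total / divisor)
    return np.array(out)

out = avg_pool1d(np.array([1.0, 2.0, 3.0, 4.0]),
                 kernel=3, stride=2, pad=1,
                 ceil_mode=True, count_include_pad=True)
# [1.0, 3.0, 2.0]; without the divisor clip the last window would
# wrongly divide by the full kernel size
```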
This was inadvertently removed by #9554

Co-authored-by: driazati <driazati@users.noreply.github.com>
This is necessary to make the Rust bindings work.
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

* Update Dockerfile.ci_arm

* Update Dockerfile.ci_cpu

* Update Dockerfile.ci_gpu

* Update Dockerfile.ci_i386

* Update Dockerfile.ci_lint

* Update Dockerfile.ci_qemu

* Update Dockerfile.ci_wasm
* [USMP] Hill Climb allocator

This PR adds the HillClimb allocator ("tir.usmp.algo.hill_climb")
to the set of memory allocation algorithms.

Change-Id: Ib7485df93757eb512da040528ec86c920db8d03b

* requested changes

Change-Id: I6700a24c1608d92f87be7dde33cc24f5de1f7063

* Conda-related linter small fixes

Change-Id: I0dac5c6d75ade8f813b077c8708aad59d2722933

* Moved implementation from greedy.h to greedy.cc

Change-Id: If8ed159eceef32d3f22b51e0252161d09222eb1e

* Integrated into test_tir_usmp_algo.py unit test

Added "hill_climb" into test_tir_usmp_algo.py
Amended sorting to be consistent with "greedy" family

Change-Id: I8e9f5282f15baaab71d6d129aeb9643376b14763
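The general shape of a hill-climbing memory planner can be sketched in a few lines of Python. This is a simplified illustration, not the `tir.usmp.algo.hill_climb` implementation: buffers are greedily placed in some order, and the order is perturbed by random swaps, keeping a swap only when it does not worsen the memory high-water mark:

```python
import random

def place(order, sizes, conflicts):
    """Greedy first-fit placement: each buffer gets the lowest offset that
    does not overlap any already-placed buffer it conflicts (is live) with."""
    offsets = {}
    for buf in order:
        off = 0
        changed = True
        while changed:
            changed = False
            for other, o_off in offsets.items():
                if (buf, other) in conflicts:
                    # overlapping intervals: push this buffer above the other
                    if off < o_off + sizes[other] and off + sizes[buf] > o_off:
                        off = o_off + sizes[other]
                        changed = True
        offsets[buf] = off
    return offsets, max(offsets[b] + sizes[b] for b in offsets)

def hill_climb(sizes, conflicts, iters=200, seed=0):
    """Swap two buffers in the placement order; keep the swap when it
    does not increase the peak memory required."""
    rng = random.Random(seed)
    order = sorted(sizes, key=sizes.get, reverse=True)  # biggest first
    best_offsets, best_mem = place(order, sizes, conflicts)
    for _ in range(iters):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        offsets, mem = place(order, sizes, conflicts)
        if mem <= best_mem:
            best_offsets, best_mem = offsets, mem
        else:
            order[i], order[j] = order[j], order[i]  # revert the swap
    return best_offsets, best_mem
```

Here `conflicts` is a set of buffer pairs whose live ranges overlap; non-conflicting buffers may share the same offsets, which is what makes the ordering worth searching over.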
… keep_dim option in fully connected (#9840)

* Support -> Change the output shape calculation based on keep_dim option

* Change the output shape calculation based on keep_dim option in fully connected

* TODO : Need to construct a fc op with (keep_num_dims == True)
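The effect of `keep_num_dims` on the output shape can be shown with a minimal NumPy sketch (the `fully_connected` helper and its `(units, in_features)` weight layout are illustrative, not TVM's API):

```python
import numpy as np

def fully_connected(data, weight, keep_num_dims=False):
    """Toy fully-connected op. weight has shape (units, in_features)."""
    units, in_features = weight.shape
    out = data.reshape(-1, in_features) @ weight.T
    if keep_num_dims:
        # keep the leading dims: (..., in_features) -> (..., units)
        return out.reshape(*data.shape[:-1], units)
    # default: flatten all leading dims into a single batch dim
    return out

x = np.ones((2, 3, 4))
w = np.ones((5, 4))
fully_connected(x, w).shape                      # (6, 5)
fully_connected(x, w, keep_num_dims=True).shape  # (2, 3, 5)
```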
…(#9880)

* encode conditional access info into block read/write regions

* compare ir after simplify
…t work) (#9898)

* fixed int8 dense offload for cublas

* support OHWI kernel layout in qnn.conv2d

* fixed reduction axis

* add cublas int8 qnn test

* lint
* fix mix up of channels with conv2d-transpose

* add grouped convtranspose tests

* turn off groups for non-llvm test
* [Fix] relay onnx frontend bug when multiplying [A, B, M, N] * [1, B, N, K]

* fix line

Co-authored-by: tomoyazhang <tomoyazhang@tencent.com>
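The shape pattern this fix targets is standard batch-matmul broadcasting, which NumPy's `matmul` demonstrates directly: leading (batch) dimensions broadcast, so the frontend must not require equal batch dims on both operands:

```python
import numpy as np

# np.matmul broadcasts the leading (batch) dims, so a [A, B, M, N]
# operand times a [1, B, N, K] operand yields [A, B, M, K].
a = np.ones((2, 3, 4, 5))   # [A, B, M, N]
b = np.ones((1, 3, 5, 6))   # [1, B, N, K]
out = np.matmul(a, b)
out.shape  # (2, 3, 4, 6)
```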
* [Caffe Frontend] supporting group > 1 cases for Deconv op

- Handling group > 1 cases, assuming group == output channels
- Decomposed into Relay split, conv2d_transpose, and multi-level concatenate ops
- Added some test cases

Signed-off-by: zotanika <zotanika@gmail.com>

* [Caffe Frontend] amending a test case for Deconv op

Signed-off-by: zotanika <zotanika@gmail.com>

* explicitly import tvm.testing

* changing split axis to 0, according to PR #9336
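The split/per-group/concatenate decomposition the commit describes can be sketched generically. This is an illustrative NumPy skeleton with a pluggable per-group op standing in for the real `conv2d_transpose`:

```python
import numpy as np

def grouped(x, weights, groups, single_group_op):
    """Decompose a grouped op: split the input channels into `groups`
    slices, run the single-group op per slice, concatenate the results."""
    xs = np.split(x, groups, axis=1)        # NCHW: split the channel axis
    ws = np.split(weights, groups, axis=0)  # one weight slice per group
    return np.concatenate(
        [single_group_op(xi, wi) for xi, wi in zip(xs, ws)], axis=1)

# toy single-group op: a 1x1 "convolution" as a channel matmul,
# standing in for a real transposed convolution
one_by_one = lambda xi, wi: np.einsum("nchw,oc->nohw", xi, wi)

out = grouped(np.ones((1, 4, 2, 2)), np.ones((2, 2)), 2, one_by_one)
out.shape  # (1, 2, 2, 2)
```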
* [Caffe Frontend] adding Reduction op

* reformatting Reduction op test script

* reformatting Reduction test script

* [Caffe frontend] Reduction op
- adding more test cases; handling '0 < axis < num_axes - 1' case to give the result equivalent to Caffe framework
- skipping Relay multiplication if coeff is 1

Signed-off-by: zotanika <zotanika@gmail.com>

* linting test script

* linting
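Caffe's Reduction semantics (reduce over every axis from `axis` to the last, then optionally scale by `coeff`) can be modelled in a few NumPy lines; this toy helper is illustrative, not the frontend's code, and shows the "skip the multiply when coeff is 1" shortcut:

```python
import numpy as np

def caffe_reduction(x, op="SUM", axis=0, coeff=1.0):
    """Toy model of Caffe's Reduction layer: reduce over all axes
    from `axis` to the end, then scale by `coeff`."""
    axes = tuple(range(axis, x.ndim))
    if op == "SUM":
        out = x.sum(axis=axes)
    elif op == "MEAN":
        out = x.mean(axis=axes)
    elif op == "ASUM":
        out = np.abs(x).sum(axis=axes)
    elif op == "SUMSQ":
        out = (x * x).sum(axis=axes)
    else:
        raise ValueError(f"unsupported Reduction op: {op}")
    # skip the elementwise multiply entirely when coeff == 1
    return out if coeff == 1.0 else out * coeff

x = np.arange(24.0).reshape(2, 3, 4)
caffe_reduction(x, "SUM", axis=1)  # shape (2,): reduces axes 1 and 2
```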

* [Caffe Frontend] Supporting multiple grouped(channel-wise) Deconv op

* Handling group > 1 cases, assuming group == output channels
* Decomposed into Relay split, transposed conv, and multi-level concatenation.
* Added some test cases.

Signed-off-by: zotanika <zotanika@gmail.com>

* [Caffe Frontend] supporting variable number of inputs for Eltwise

* extra handling of the remaining inputs for PROD, SUM, MAX operations
* extra testcases

Signed-off-by: zotanika <zotanika@gmail.com>
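Handling a variable number of Eltwise inputs amounts to folding the remaining operands into the first with a chain of binary ops. A hedged sketch (the `eltwise` helper is illustrative, not the frontend's API):

```python
from functools import reduce
import operator
import numpy as np

def eltwise(inputs, op="SUM", coeffs=None):
    """Fold a variadic Eltwise layer into a chain of binary ops,
    as a frontend would when only binary Relay ops are available."""
    if op == "SUM":
        # Caffe allows per-input coefficients for SUM only
        terms = inputs if coeffs is None else [c * x for c, x in zip(coeffs, inputs)]
        return reduce(operator.add, terms)
    if op == "PROD":
        return reduce(operator.mul, inputs)
    if op == "MAX":
        return reduce(np.maximum, inputs)
    raise ValueError(f"unsupported Eltwise op: {op}")
```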

* formatting fix

* [Caffe Frontend] reverting Reduction-related code, to split it into a separate PR

* Revert "[Caffe Frontend] Supporting multiple grouped(channel-wise) Deconv op"

This reverts commit 43e25e552b790ce9a38fdbcfb3ddf2075c253e20.

* instant fix against docker format error

* instant fix against docker format error

* instant fix against docker format error
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>

* [microNPU] Remove remaining UnsupportedLayout checks

In #9508 the decision was made to remove the UnsupportedLayout exception
and the checks that throw it; this PR cleans up some that remained.

Change-Id: I83bfe233381b83af886343c9569db753e33f9059

* fix lint

Change-Id: I67c1a5371f0b2e51b6cd39435ef4073d8d17af51
* [microNPU][2c] Initial Performance Model

* Added the pre-computed performance modelling per block.
* Added the aggregation of cycles given a stripe config.
* Implemented the op-specific performance code for conv2d.
* Created a DeviceConfig class to hold constant, performance-related data
that depends on the accelerator configuration.
* Added generation of all valid block configs. This is pre-computed and
given as an argument when constructing EthosuParts.
* Implemented selection of the block config that gives the least amount
of data read given a StripeConfig.

* Add test guards

* Extended block config testing
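Selecting the block config with the least data read can be sketched with a toy cost model. This is not the EthosuPart performance model, just an illustration of the trade-off: a stripe is processed in whole blocks, so block shapes that divide the stripe poorly waste reads on the rounded-up remainder:

```python
import math

def pick_block_config(block_configs, stripe_shape):
    """Toy least-data-read selection: cost a block shape by the number
    of whole blocks needed to cover the stripe times the block volume."""
    def data_read(block):
        blocks = 1
        for s, b in zip(stripe_shape, block):
            blocks *= math.ceil(s / b)  # blocks needed along this axis
        volume = 1
        for b in block:
            volume *= b
        return blocks * volume          # elements read for the stripe
    return min(block_configs, key=data_read)

pick_block_config([(3, 3), (4, 4), (5, 5)], stripe_shape=(8, 8))  # (4, 4)
```

The real model also accounts for per-operator compute cycles and memory bandwidth, but the "minimise data read given a StripeConfig" selection follows this shape.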
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
