This repository was archived by the owner on Dec 15, 2025. It is now read-only.

PLDI work #24

Draft
gussmith23 wants to merge 2228 commits into main from 3la-pldi-push-main

Conversation

@gussmith23
Collaborator

DO NOT MERGE

PR for tracking all of the changes we made for PLDI. Useful for seeing the diff.

@gussmith23 gussmith23 force-pushed the 3la-pldi-push-main branch 2 times, most recently from f4f8205 to 4b1858b on January 1, 2022 21:25
leeexyz and others added 27 commits January 13, 2022 21:19
… ceil_mode and count_include_pad are True. (#9835)

* Added the offset[i] for getting the correct boundary
* Added corresponding test case
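The boundary offset from the commit above can be illustrated with a minimal 1-D average pool in NumPy. This is a toy sketch, not TVM's implementation: with `ceil_mode=True` the last window can extend past the padded input, and with `count_include_pad=True` the divisor must be clipped to the padded boundary (the analogue of the `offset[i]` term):

```python
import math
import numpy as np

def avg_pool1d(x, kernel, stride, pad, ceil_mode=False, count_include_pad=False):
    """Toy 1-D average pool showing the boundary offset needed when
    ceil_mode and count_include_pad are both True (illustrative only)."""
    n = len(x) + 2 * pad
    if ceil_mode:
        out_len = math.ceil((n - kernel) / stride) + 1
        # drop a window that would start entirely inside the right padding
        if (out_len - 1) * stride >= len(x) + pad:
            out_len -= 1
    else:
        out_len = (n - kernel) // stride + 1
    out = []
    for i in range(out_len):
        start = i * stride - pad
        end = start + kernel
        lo, hi = max(start, 0), min(end, len(x))  # clip to real data
        total = x[lo:hi].sum()
        if count_include_pad:
            # the offset: count padding, but never positions past the
            # padded boundary that a ceil_mode window can reach
            divisor = min(end, len(x) + pad) - start
        else:
            divisor = hi - lo
        out.append(total / divisor)
    return np.array(out)

out = avg_pool1d(np.array([1.0, 2.0, 3.0, 4.0]),
                 kernel=3, stride=2, pad=1,
                 ceil_mode=True, count_include_pad=True)
# [1.0, 3.0, 2.0]; without the divisor clip the last window would
# wrongly divide by the full kernel size
```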
This was inadvertently removed by #9554

Co-authored-by: driazati <driazati@users.noreply.github.com>
This is necessary to make the Rust bindings work.
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

* Update Dockerfile.ci_arm

* Update Dockerfile.ci_cpu

* Update Dockerfile.ci_gpu

* Update Dockerfile.ci_i386

* Update Dockerfile.ci_lint

* Update Dockerfile.ci_qemu

* Update Dockerfile.ci_wasm
* [USMP] Hill Climb allocator

This PR adds the HillClimb allocator ("tir.usmp.algo.hill_climb")
to the set of memory allocation algorithms.

Change-Id: Ib7485df93757eb512da040528ec86c920db8d03b

* requested changes

Change-Id: I6700a24c1608d92f87be7dde33cc24f5de1f7063

* Conda-related linter small fixes

Change-Id: I0dac5c6d75ade8f813b077c8708aad59d2722933

* Moved implementation from greedy.h to greedy.cc

Change-Id: If8ed159eceef32d3f22b51e0252161d09222eb1e

* Integrated into test_tir_usmp_algo.py unit test

Added "hill_climb" into test_tir_usmp_algo.py
Amended sorting to be consistent with "greedy" family

Change-Id: I8e9f5282f15baaab71d6d129aeb9643376b14763
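The general shape of a hill-climbing memory planner can be sketched in a few lines of Python. This is a simplified illustration, not the `tir.usmp.algo.hill_climb` implementation: buffers are greedily placed in some order, and the order is perturbed by random swaps, keeping a swap only when it does not worsen the memory high-water mark:

```python
import random

def place(order, sizes, conflicts):
    """Greedy first-fit placement: each buffer gets the lowest offset that
    does not overlap any already-placed buffer it conflicts (is live) with."""
    offsets = {}
    for buf in order:
        off = 0
        changed = True
        while changed:
            changed = False
            for other, o_off in offsets.items():
                if (buf, other) in conflicts:
                    # overlapping intervals: push this buffer above the other
                    if off < o_off + sizes[other] and off + sizes[buf] > o_off:
                        off = o_off + sizes[other]
                        changed = True
        offsets[buf] = off
    return offsets, max(offsets[b] + sizes[b] for b in offsets)

def hill_climb(sizes, conflicts, iters=200, seed=0):
    """Swap two buffers in the placement order; keep the swap when it
    does not increase the peak memory required."""
    rng = random.Random(seed)
    order = sorted(sizes, key=sizes.get, reverse=True)  # biggest first
    best_offsets, best_mem = place(order, sizes, conflicts)
    for _ in range(iters):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        offsets, mem = place(order, sizes, conflicts)
        if mem <= best_mem:
            best_offsets, best_mem = offsets, mem
        else:
            order[i], order[j] = order[j], order[i]  # revert the swap
    return best_offsets, best_mem
```

Here `conflicts` is a set of buffer pairs whose live ranges overlap; non-conflicting buffers may share the same offsets, which is what makes the ordering worth searching over.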
… keep_dim option in fully connected (#9840)

* Support -> Change the output shape calculation based on keep_dim option

* Change the output shape calculation based on keep_dim option in fully connected

* TODO : Need to construct a fc op with (keep_num_dims == True)
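The effect of `keep_num_dims` on the output shape can be shown with a minimal NumPy sketch (the `fully_connected` helper and its `(units, in_features)` weight layout are illustrative, not TVM's API):

```python
import numpy as np

def fully_connected(data, weight, keep_num_dims=False):
    """Toy fully-connected op. weight has shape (units, in_features)."""
    units, in_features = weight.shape
    out = data.reshape(-1, in_features) @ weight.T
    if keep_num_dims:
        # keep the leading dims: (..., in_features) -> (..., units)
        return out.reshape(*data.shape[:-1], units)
    # default: flatten all leading dims into a single batch dim
    return out

x = np.ones((2, 3, 4))
w = np.ones((5, 4))
fully_connected(x, w).shape                      # (6, 5)
fully_connected(x, w, keep_num_dims=True).shape  # (2, 3, 5)
```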
…(#9880)

* encode conditional access info into block read/write regions

* compare ir after simplify
…t work) (#9898)

* fixed int8 dense offload for cublas

* support OHWI kernel layout in qnn.conv2d

* fixed reduction axis

* add cublas int8 qnn test

* lint
* fix mix up of channels with conv2d-transpose

* add grouped convtranspose tests

* turn off groups for non-llvm test
* [Fix] relay onnx frontend bug when multiplying [A, B, M, N] * [1, B, N, K]

* fix line

Co-authored-by: tomoyazhang <tomoyazhang@tencent.com>
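The shape pattern this fix targets is standard batch-matmul broadcasting, which NumPy's `matmul` demonstrates directly: leading (batch) dimensions broadcast, so the frontend must not require equal batch dims on both operands:

```python
import numpy as np

# np.matmul broadcasts the leading (batch) dims, so a [A, B, M, N]
# operand times a [1, B, N, K] operand yields [A, B, M, K].
a = np.ones((2, 3, 4, 5))   # [A, B, M, N]
b = np.ones((1, 3, 5, 6))   # [1, B, N, K]
out = np.matmul(a, b)
out.shape  # (2, 3, 4, 6)
```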
* [Caffe Frontend] supporting group > 1 cases for Deconv op

- Handling group > 1 cases, assuming group == output channels
- Decomposed into Relay split, conv2d_transpose, and multi-level concatenate ops
- Added some test cases

Signed-off-by: zotanika <zotanika@gmail.com>

* [Caffe Frontend] amending a test case for Deconv op

Signed-off-by: zotanika <zotanika@gmail.com>

* explicitly import tvm.testing

* changing split axis to 0, according to PR #9336
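The split/per-group/concatenate decomposition the commit describes can be sketched generically. This is an illustrative NumPy skeleton with a pluggable per-group op standing in for the real `conv2d_transpose`:

```python
import numpy as np

def grouped(x, weights, groups, single_group_op):
    """Decompose a grouped op: split the input channels into `groups`
    slices, run the single-group op per slice, concatenate the results."""
    xs = np.split(x, groups, axis=1)        # NCHW: split the channel axis
    ws = np.split(weights, groups, axis=0)  # one weight slice per group
    return np.concatenate(
        [single_group_op(xi, wi) for xi, wi in zip(xs, ws)], axis=1)

# toy single-group op: a 1x1 "convolution" as a channel matmul,
# standing in for a real transposed convolution
one_by_one = lambda xi, wi: np.einsum("nchw,oc->nohw", xi, wi)

out = grouped(np.ones((1, 4, 2, 2)), np.ones((2, 2)), 2, one_by_one)
out.shape  # (1, 2, 2, 2)
```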
* [Caffe Frontend] adding Reduction op

* reformatting Reduction op test script

* reformatting Reduction test script

* [Caffe frontend] Reduction op
- adding more test cases; handling '0 < axis < num_axes - 1' case to give the result equivalent to Caffe framework
- skipping Relay multiplication if coeff is 1

Signed-off-by: zotanika <zotanika@gmail.com>

* linting test script

* linting
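Caffe's Reduction semantics (reduce over every axis from `axis` to the last, then optionally scale by `coeff`) can be modelled in a few NumPy lines; this toy helper is illustrative, not the frontend's code, and shows the "skip the multiply when coeff is 1" shortcut:

```python
import numpy as np

def caffe_reduction(x, op="SUM", axis=0, coeff=1.0):
    """Toy model of Caffe's Reduction layer: reduce over all axes
    from `axis` to the end, then scale by `coeff`."""
    axes = tuple(range(axis, x.ndim))
    if op == "SUM":
        out = x.sum(axis=axes)
    elif op == "MEAN":
        out = x.mean(axis=axes)
    elif op == "ASUM":
        out = np.abs(x).sum(axis=axes)
    elif op == "SUMSQ":
        out = (x * x).sum(axis=axes)
    else:
        raise ValueError(f"unsupported Reduction op: {op}")
    # skip the elementwise multiply entirely when coeff == 1
    return out if coeff == 1.0 else out * coeff

x = np.arange(24.0).reshape(2, 3, 4)
caffe_reduction(x, "SUM", axis=1)  # shape (2,): reduces axes 1 and 2
```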

* [Caffe Frontend] Supporting multiple grouped(channel-wise) Deconv op

* Handling group > 1 cases, assuming group == output channels
* Decomposed into Relay split, transposed conv, and multi-level concatenation.
* Added some test cases.

Signed-off-by: zotanika <zotanika@gmail.com>

* [Caffe Frontend] supporting variable number of inputs for Eltwise

* extra handling of the remaining inputs for PROD, SUM, MAX operations
* extra testcases

Signed-off-by: zotanika <zotanika@gmail.com>
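Handling a variable number of Eltwise inputs amounts to folding the remaining operands into the first with a chain of binary ops. A hedged sketch (the `eltwise` helper is illustrative, not the frontend's API):

```python
from functools import reduce
import operator
import numpy as np

def eltwise(inputs, op="SUM", coeffs=None):
    """Fold a variadic Eltwise layer into a chain of binary ops,
    as a frontend would when only binary Relay ops are available."""
    if op == "SUM":
        # Caffe allows per-input coefficients for SUM only
        terms = inputs if coeffs is None else [c * x for c, x in zip(coeffs, inputs)]
        return reduce(operator.add, terms)
    if op == "PROD":
        return reduce(operator.mul, inputs)
    if op == "MAX":
        return reduce(np.maximum, inputs)
    raise ValueError(f"unsupported Eltwise op: {op}")
```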

* formatting fix

* [Caffe Frontend] reverting Reduction-related code, to split it into a separate PR

* Revert "[Caffe Frontend] Supporting multiple grouped(channel-wise) Deconv op"

This reverts commit 43e25e552b790ce9a38fdbcfb3ddf2075c253e20.

* instant fix against docker format error

* instant fix against docker format error

* instant fix against docker format error
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>

* [microNPU] Remove remaining UnsupportedLayout checks

In #9508 the decision was made to remove the UnsupportedLayout exception
and the checks that throw it; this PR cleans up some that remained.

Change-Id: I83bfe233381b83af886343c9569db753e33f9059

* fix lint

Change-Id: I67c1a5371f0b2e51b6cd39435ef4073d8d17af51
* [microNPU][2c] Initial Performance Model

* Added the pre-computed performance modelling per block.
* Added the aggregation of cycles given a stripe config.
* Implemented the op-specific performance code for conv2d.
* Created a DeviceConfig class to hold constant, performance-related data
that depends on the accelerator configuration.
* Added generation of all valid block configs. This is pre-computed and
given as an argument when constructing EthosuParts.
* Implemented selection of the block config that gives the least amount
of data read given a StripeConfig.

* Add test guards

* Extended block config testing
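Selecting the block config with the least data read can be sketched with a toy cost model. This is not the EthosuPart performance model, just an illustration of the trade-off: a stripe is processed in whole blocks, so block shapes that divide the stripe poorly waste reads on the rounded-up remainder:

```python
import math

def pick_block_config(block_configs, stripe_shape):
    """Toy least-data-read selection: cost a block shape by the number
    of whole blocks needed to cover the stripe times the block volume."""
    def data_read(block):
        blocks = 1
        for s, b in zip(stripe_shape, block):
            blocks *= math.ceil(s / b)  # blocks needed along this axis
        volume = 1
        for b in block:
            volume *= b
        return blocks * volume          # elements read for the stripe
    return min(block_configs, key=data_read)

pick_block_config([(3, 3), (4, 4), (5, 5)], stripe_shape=(8, 8))  # (4, 4)
```

The real model also accounts for per-operator compute cycles and memory bandwidth, but the "minimise data read given a StripeConfig" selection follows this shape.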
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
