Commit 6db943f

FP32 ReduceMean operator improvement (#137)
This PR fixes the tiling implementation for the PULPOpen FP32 ReduceMean operator ([Issue #134](#134)) and adds parallelization support. It also extends ReduceMean to support ONNX opset >= 18, reorganizes the FP32 TinyViT kernel test suite, and adds extensive ReduceMean test coverage.

## Added
- Support for an unknown number of data dimensions in the tiler
- Parallelization support for the FP32 ReduceMean operator on PULPOpen
- Extensive testing for the ReduceMean operator
- Pass to remove ReduceMean operators that don't change data content, but only its shape

## Changed
- Structure of Tests subdir for improved ordering
- Structure of `.gitignore` file for improved ordering

## Fixed
- Fixed ReduceMean parallelization and tiling issues described in Issue [#134](#134).
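The "shape-only" ReduceMean case mentioned under Added can be illustrated with a small sketch. It is not the pass from this PR: the helper names are hypothetical, and the criterion (every reduced axis has extent 1, so each mean is taken over a single element) is an assumption about what "doesn't change data content" means. `axes=None` is treated here as "reduce over all axes".

```python
from typing import List, Optional, Sequence


def _resolve_axes(axes: Optional[Sequence[int]], rank: int) -> List[int]:
    """Return sorted, non-negative, de-duplicated axes; None means all axes."""
    if axes is None:
        return list(range(rank))
    resolved = set()
    for axis in axes:
        if not -rank <= axis < rank:
            raise ValueError(f"axis {axis} out of range for rank {rank}")
        resolved.add(axis % rank)  # maps negative axes to their positive index
    return sorted(resolved)


def reduced_shape(shape: Sequence[int], axes: Optional[Sequence[int]],
                  keepdims: bool) -> List[int]:
    """Output shape of a mean reduction over `axes`."""
    reduce_over = _resolve_axes(axes, len(shape))
    if keepdims:
        return [1 if i in reduce_over else dim for i, dim in enumerate(shape)]
    return [dim for i, dim in enumerate(shape) if i not in reduce_over]


def is_shape_only_reduce_mean(shape: Sequence[int],
                              axes: Optional[Sequence[int]]) -> bool:
    """True when every reduced axis has extent 1, so values are unchanged."""
    return all(shape[axis] == 1 for axis in _resolve_axes(axes, len(shape)))


# Hypothetical example: reducing the two unit axes of a [1, 8, 1, 4] tensor.
print(is_shape_only_reduce_mean([1, 8, 1, 4], [0, 2]))   # candidate for removal
print(reduced_shape([1, 8, 1, 4], [0, 2], keepdims=False))
```

With `keepdims` set, such a node leaves both values and shape untouched; without it, the node behaves like a squeeze of the reduced axes, which is why a graph pass can plausibly replace it with a cheaper shape operation.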
1 parent ecae48a commit 6db943f

533 files changed, +948 -485 lines changed
.github/workflows/ci-deeploy-testing.yml

Lines changed: 3 additions & 3 deletions
@@ -40,19 +40,19 @@ jobs:
 include:
 - name: fail-input0
 platform: Generic
-test: testTypeInferenceDifferentTypes
+test: Others/TypeInference
 type_map: "A=int8_t B=int8_t C=int8_t"
 offset_map: "A=0 B=0 C=0"
 shouldFail: true
 - name: fail-input2
 platform: Generic
-test: testTypeInferenceDifferentTypes
+test: Others/TypeInference
 type_map: "A=int16_t B=int8_t C=int16_t"
 offset_map: "A=0 B=0 C=0"
 shouldFail: true
 - name: pass
 platform: Generic
-test: testTypeInferenceDifferentTypes
+test: Others/TypeInference
 type_map: "A=int16_t B=int8_t C=int32_t"
 offset_map: "A=0 B=0 C=0"
 shouldFail: false

.github/workflows/ci-deeploy.yml

Lines changed: 29 additions & 29 deletions
@@ -60,10 +60,10 @@ jobs:
 shell: bash
 run: |
 cd DeeployTest
-python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=75000 --memAllocStrategy=MiniMalloc
-python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=60000 --memAllocStrategy=MiniMalloc --shouldFail
-python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=90000 --memAllocStrategy=TetrisRandom
-python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=75000 --memAllocStrategy=TetrisRandom --shouldFail
+python testMVP.py -t Tests/Models/CCT/FP32/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=75000 --memAllocStrategy=MiniMalloc
+python testMVP.py -t Tests/Models/CCT/FP32/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=60000 --memAllocStrategy=MiniMalloc --shouldFail
+python testMVP.py -t Tests/Models/CCT/FP32/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=90000 --memAllocStrategy=TetrisRandom
+python testMVP.py -t Tests/Models/CCT/FP32/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=75000 --memAllocStrategy=TetrisRandom --shouldFail
 
 deeploy-state-serialization:
 needs: select-env
@@ -82,10 +82,10 @@ jobs:
 shell: bash
 run: |
 cd DeeployTest
-python deeployStateEqualityTest.py -t ./Tests/simpleRegression -p QEMU-ARM
-python deeployStateEqualityTest.py -t ./Tests/simpleRegression -p Siracusa
-python deeployStateEqualityTest.py -t ./Tests/simpleRegression -p MemPool
-python deeployStateEqualityTest.py -t ./Tests/simpleRegression -p Generic
+python deeployStateEqualityTest.py -t ./Tests/Models/CNN_Linear2 -p QEMU-ARM
+python deeployStateEqualityTest.py -t ./Tests/Models/CNN_Linear2 -p Siracusa
+python deeployStateEqualityTest.py -t ./Tests/Models/CNN_Linear2 -p MemPool
+python deeployStateEqualityTest.py -t ./Tests/Models/CNN_Linear2 -p Generic
 
 deeploy-memory-level-extension:
 needs: select-env
@@ -104,10 +104,10 @@ jobs:
 shell: bash
 run: |
 cd DeeployTest
-python testMemoryLevelExtension.py -t ./Tests/simpleRegression -p QEMU-ARM
-python testMemoryLevelExtension.py -t ./Tests/simpleRegression -p Siracusa
-python testMemoryLevelExtension.py -t ./Tests/simpleRegression -p MemPool
-python testMemoryLevelExtension.py -t ./Tests/simpleRegression -p Generic
+python testMemoryLevelExtension.py -t ./Tests/Models/CNN_Linear2 -p QEMU-ARM
+python testMemoryLevelExtension.py -t ./Tests/Models/CNN_Linear2 -p Siracusa
+python testMemoryLevelExtension.py -t ./Tests/Models/CNN_Linear2 -p MemPool
+python testMemoryLevelExtension.py -t ./Tests/Models/CNN_Linear2 -p Generic
 
 deeploy-tiler-extension:
 needs: select-env
@@ -126,14 +126,14 @@ jobs:
 shell: bash
 run: |
 cd DeeployTest
-python testTilerExtension.py -p Siracusa -t ./Tests/simpleRegression
-python testTilerExtension.py -p Siracusa -t ./Tests/simpleCNN
-python testTilerExtension.py -p Siracusa -t ./Tests/testMatMul
-python testTilerExtension.py -p Siracusa -t ./Tests/testMaxPool
-python testTilerExtension.py -p Siracusa -t ./Tests/simpleRegression --l1 2000 --shouldFail
-python testTilerExtension.py -p Siracusa -t ./Tests/simpleCNN --l1 2000 --shouldFail
-python testTilerExtension.py -p Siracusa -t ./Tests/testMatMul --l1 2000 --shouldFail
-python testTilerExtension.py -p Siracusa -t ./Tests/testMaxPool --l1 2000 --shouldFail
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/CNN_Linear2
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/CNN_Linear1
+python testTilerExtension.py -p Siracusa -t ./Tests/Kernels/Integer/MatMul/Regular
+python testTilerExtension.py -p Siracusa -t ./Tests/Kernels/Integer/MaxPool
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/CNN_Linear2 --l1 2000 --shouldFail
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/CNN_Linear1 --l1 2000 --shouldFail
+python testTilerExtension.py -p Siracusa -t ./Tests/Kernels/Integer/MatMul/Regular --l1 2000 --shouldFail
+python testTilerExtension.py -p Siracusa -t ./Tests/Kernels/Integer/MaxPool --l1 2000 --shouldFail
 
 deeploy-memory-allocation-extension:
 needs: select-env
@@ -152,12 +152,12 @@ jobs:
 shell: bash
 run: |
 cd DeeployTest
-python testTilerExtension.py -p Siracusa -t ./Tests/simpleRegression
-python testTilerExtension.py -p Siracusa -t ./Tests/simpleCNN
-python testTilerExtension.py -p Siracusa -t ./Tests/miniMobileNet
-python testTilerExtension.py -p Siracusa -t ./Tests/miniMobileNetv2
-python testTilerExtension.py -p Siracusa -t ./Tests/testMatMul
-python testTilerExtension.py -p Siracusa -t ./Tests/testMaxPool
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/CNN_Linear2
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/CNN_Linear1
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/miniMobileNet
+python testTilerExtension.py -p Siracusa -t ./Tests/Models/miniMobileNetv2
+python testTilerExtension.py -p Siracusa -t ./Tests/Kernels/Integer/MatMul/Regular
+python testTilerExtension.py -p Siracusa -t ./Tests/Kernels/Integer/MaxPool
 
 deeploy-typing:
 needs: select-env
@@ -195,9 +195,9 @@ jobs:
 shell: bash
 run: |
 cd DeeployTest
-python testPrintInputOutputTransformation.py -p Generic -t ./Tests/simpleRegression
-python testPrintInputOutputTransformation.py -p Siracusa -t ./Tests/simpleRegression
-python testDebugPrintPass.py -p Generic -t ./Tests/simpleRegression
+python testPrintInputOutputTransformation.py -p Generic -t ./Tests/Models/CNN_Linear2
+python testPrintInputOutputTransformation.py -p Siracusa -t ./Tests/Models/CNN_Linear2
+python testDebugPrintPass.py -p Generic -t ./Tests/Models/CNN_Linear2
 
 deeploy-regex-matching:
 needs: select-env

.github/workflows/ci-platform-chimera.yml

Lines changed: 1 addition & 1 deletion
@@ -36,6 +36,6 @@ jobs:
 runner: ${{ needs.select-env.outputs.runner }}
 docker-image: ${{ needs.select-env.outputs.image }}
 test-names: |
-Adder
+Kernels/Integer/Add/Regular
 simulators: |
 gvsoc

.github/workflows/ci-platform-cortexm.yml

Lines changed: 13 additions & 13 deletions
@@ -36,17 +36,17 @@ jobs:
 runner: ${{ needs.select-env.outputs.runner }}
 docker-image: ${{ needs.select-env.outputs.image }}
 test-names: |
-Adder
-MultIO
-test1DPad
-test2DPad
-testMatMul
-testMatMulAdd
-testMaxPool
-testRQConv
-testReduceSum
-testReduceMean
-testSlice
+Kernels/Integer/Add/Regular
+Kernels/Integer/Add/MultIO
+Kernels/Integer/Pad/Regular_1D
+Kernels/Integer/Pad/Regular_2D
+Kernels/Integer/MatMul/Regular
+Kernels/Integer/MatMul/Add
+Kernels/Integer/MaxPool
+Kernels/Integer/Conv/Regular_2D_RQ
+Kernels/Integer/ReduceSum
+Kernels/Integer/ReduceMean
+Kernels/Integer/Slice
 
 cortexm-models:
 needs: select-env
@@ -55,5 +55,5 @@ jobs:
 runner: ${{ needs.select-env.outputs.runner }}
 docker-image: ${{ needs.select-env.outputs.image }}
 test-names: |
-simpleRegression
-WaveFormer
+Models/CNN_Linear2
+Models/WaveFormer
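For downstream scripts or forks that still use the old flat test names, the renames in this workflow can be applied mechanically. The sketch below is not part of the PR: the `RENAMES` table is read off the two cortexm hunks above by pairing removed and added lines in order, which is an assumption (the diff does not state the mapping explicitly), and `migrate_test_names` is a hypothetical helper.

```python
# Old-to-new test names, paired line by line from the cortexm hunks (assumed mapping).
RENAMES = {
    "Adder": "Kernels/Integer/Add/Regular",
    "MultIO": "Kernels/Integer/Add/MultIO",
    "test1DPad": "Kernels/Integer/Pad/Regular_1D",
    "test2DPad": "Kernels/Integer/Pad/Regular_2D",
    "testMatMul": "Kernels/Integer/MatMul/Regular",
    "testMatMulAdd": "Kernels/Integer/MatMul/Add",
    "testMaxPool": "Kernels/Integer/MaxPool",
    "testRQConv": "Kernels/Integer/Conv/Regular_2D_RQ",
    "testReduceSum": "Kernels/Integer/ReduceSum",
    "testReduceMean": "Kernels/Integer/ReduceMean",
    "testSlice": "Kernels/Integer/Slice",
    "simpleRegression": "Models/CNN_Linear2",
    "WaveFormer": "Models/WaveFormer",
}


def migrate_test_names(block: str) -> str:
    """Rewrite a `test-names` block, one name per line, keeping indentation.

    Names that are not in RENAMES are left untouched.
    """
    migrated = []
    for line in block.splitlines():
        name = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        migrated.append(indent + RENAMES.get(name, name) if name else line)
    return "\n".join(migrated)


print(migrate_test_names("  Adder\n  testSlice\n  simpleRegression"))
```

Only whole-line matches are rewritten, so a path such as `./Tests/simpleRegression` inside a shell command is deliberately left alone and would need separate handling.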

.github/workflows/ci-platform-generic.yml

Lines changed: 99 additions & 57 deletions
@@ -36,50 +36,88 @@ jobs:
 runner: ${{ needs.select-env.outputs.runner }}
 docker-image: ${{ needs.select-env.outputs.image }}
 test-names: |
-Adder
-MultIO
-test1DConvolution
-test2DConvolution
-test1DDWConvolution
-test2DDWConvolution
-test1DPad
-test2DPad
-testGEMM
-testMatMul
-testMatMulAdd
-testMaxPool
-testRQConv
-testRQMatMul
-testReduceSum
-testReduceMean
-testSlice
-testRequantizedDWConv
-test2DRequantizedConv
-iSoftmax
-testFloatAdder
-testFloatGEMM
-testFloat2DConvolution
-testFloat2DConvolutionBias
-testFloat2DConvolutionZeroBias
-testFloatLayerNorm
-testFloatDiv
-testFloat2DDWConvolution
-testFloat2DDWConvolutionBias
-testFloat2DDWConvolutionZeroBias
-testFloatRelu
-testFloatMaxPool
-testFloatMatmul
-testFloatReshapeWithSkipConnection
-testFloatSoftmax
-testFloatTranspose
-testFloatMul
-testFloatPowScalar
-testFloatPowVector
-testFloatSqrt
-testFloatRMSNorm
-Quant
-Dequant
-QuantizedLinear
+Kernels/FP32/ReLU
+Kernels/FP32/Softmax/Regular
+
+Kernels/FP32/Add/Regular
+
+Kernels/FP32/Conv/DW_2D_Bias
+Kernels/FP32/Conv/DW_2D_NoBias
+Kernels/FP32/Conv/DW_2D_ZeroValuedBias
+
+Kernels/FP32/Conv/Regular_2D_Bias
+Kernels/FP32/Conv/Regular_2D_NoBias
+Kernels/FP32/Conv/Regular_2D_ZeroValuedBias
+
+Kernels/FP32/Div
+Kernels/FP32/GEMM/Regular
+Kernels/FP32/MatMul
+Kernels/FP32/MaxPool
+Kernels/FP32/Mul
+
+Kernels/FP32/LayerNorm
+Kernels/FP32/RMSNorm
+
+Kernels/FP32/Pow/Scalar
+Kernels/FP32/Pow/Vector
+
+Kernels/FP32/ReduceMean/KeepDims/Add_ReduceMean
+Kernels/FP32/ReduceMean/KeepDims/Add_ReduceMean_Add
+Kernels/FP32/ReduceMean/KeepDims/AllAxes
+Kernels/FP32/ReduceMean/KeepDims/Axes1_2_3
+Kernels/FP32/ReduceMean/KeepDims/Axes1_3
+Kernels/FP32/ReduceMean/KeepDims/Axes2_1
+Kernels/FP32/ReduceMean/KeepDims/Axis0
+Kernels/FP32/ReduceMean/KeepDims/Axis2
+Kernels/FP32/ReduceMean/KeepDims/ReduceMean_Add
+
+Kernels/FP32/ReduceMean/NoKeepDims/Add_ReduceMean
+Kernels/FP32/ReduceMean/NoKeepDims/Add_ReduceMean_Add
+Kernels/FP32/ReduceMean/NoKeepDims/AllAxes
+Kernels/FP32/ReduceMean/NoKeepDims/Axes1_2_3
+Kernels/FP32/ReduceMean/NoKeepDims/Axes1_3
+Kernels/FP32/ReduceMean/NoKeepDims/Axes2_1
+Kernels/FP32/ReduceMean/NoKeepDims/Axis0
+Kernels/FP32/ReduceMean/NoKeepDims/Axis2
+Kernels/FP32/ReduceMean/NoKeepDims/ReduceMean_Add
+
+Kernels/FP32/Reshape/SkipConnection
+Kernels/FP32/Sqrt
+Kernels/FP32/Transpose
+
+Kernels/Integer/Softmax/Regular
+
+Kernels/Integer/Add/MultIO
+Kernels/Integer/Add/Regular
+
+Kernels/Integer/Conv/DW_1D
+Kernels/Integer/Conv/Regular_1D
+
+Kernels/Integer/Conv/DW_2D
+Kernels/Integer/Conv/Regular_2D
+
+Kernels/Integer/GEMM/Regular
+
+Kernels/Integer/MatMul/Add
+Kernels/Integer/MatMul/Regular
+
+Kernels/Integer/MaxPool
+
+Kernels/Integer/Pad/Regular_1D
+Kernels/Integer/Pad/Regular_2D
+
+Kernels/Integer/ReduceMean
+Kernels/Integer/ReduceSum
+Kernels/Integer/Slice
+
+Models/TinyViT/5M/Layers/FP32/ReduceMean
+
+Kernels/Mixed/Dequant
+Kernels/Mixed/Quant
+Models/Transformer_DeepQuant
+Kernels/Integer/Conv/DW_2D_RQ
+Kernels/Integer/Conv/Regular_2D_RQ
+Kernels/Integer/MatMul/Regular_RQ
 
 generic-models:
 needs: select-env
@@ -88,16 +126,20 @@ jobs:
 runner: ${{ needs.select-env.outputs.runner }}
 docker-image: ${{ needs.select-env.outputs.image }}
 test-names: |
-simpleRegression
-WaveFormer
-simpleCNN
-ICCT
-ICCT_ITA
-ICCT_8
-ICCT_ITA_8
-miniMobileNet
-miniMobileNetv2
-CCT/CCT_1_16_16_8
-CCT/CCT_2_32_32_128_Opset20
-testFloatDemoTinyViT
-Autoencoder1D
+Models/Autoencoder1D
+
+Models/CCT/FP32/CCT_1_16_16_8
+Models/CCT/FP32/CCT_2_32_32_128_Opset20
+Models/CCT/Int/ICCT
+Models/CCT/Int/ICCT_8
+Models/CCT/Int/ICCT_ITA
+Models/CCT/Int/ICCT_ITA_8
+
+Models/miniMobileNet
+Models/miniMobileNetv2
+
+Models/CNN_Linear1
+Models/TinyViT/Demo
+Models/WaveFormer
+
+Models/CNN_Linear2
