Commit d48cd25
hyun gyu kim
[TIR][Schedule] FuseReductionEpilogue: Add Clipping pattern support
Currently, the FuseReductionEpilogue primitive only supports Bias
(addition) and BiasReLU (addition + ReLU) epilogue patterns. However,
clipping operations (min(max(x, lower), upper)) are commonly used in
deep learning models and would benefit from the same fusion optimization.
This commit extends FuseReductionEpilogue to support Clipping patterns
by:
1. Adding EpilogueType::Clipping to the enum to distinguish clipping
patterns from other epilogue types.
2. Adding clipping_lower_ and clipping_upper_ members to
ReductionEpilogueFuser to store clipping bounds extracted from the
epilogue pattern.
3. Extending AnalyzeEpiloguePattern to detect clipping patterns:
- min(max(temp, lower), upper)
- max(min(temp, upper), lower)
- All commutative variants of min/max at each level
4. Updating BiasReLU pattern matching to handle max(0, x) form in
addition to max(x, 0) for better commutativity support.
5. Modifying CreateFusedReductionBlock to apply clipping to the init
value: init = min(max(0, lower), upper)
6. Updating BufferReplacer to apply clipping per-iteration:
value = min(max(value, lower), upper)
7. Adding validation in BodyPatternAllowFusion to ensure temp appears
exactly once in clipping patterns.
8. Creating comprehensive test coverage with 8 test cases:
- Basic fusion test
- Numerical correctness verification
- Multiple epilogue blocks test
- 5 commutative variant tests
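
The pattern detection in items 3 and 7 can be illustrated with a minimal sketch. This is not the C++ code from the commit: expressions are modelled as nested Python tuples, and the names `match_clipping` and `count_uses` are hypothetical helpers introduced only for illustration. It assumes the matcher accepts either nesting order, tries both operand orders at each level, and rejects expressions where temp does not appear exactly once.

```python
def count_uses(expr, name):
    """Count occurrences of the leaf `name` in a nested-tuple expression."""
    if isinstance(expr, tuple):
        # expr[0] is the operator name; the rest are operands.
        return sum(count_uses(arg, name) for arg in expr[1:])
    return 1 if expr == name else 0


def match_clipping(expr, temp="temp"):
    """Return (lower, upper) if `expr` clips `temp`, otherwise None.

    Hypothetical illustration only; the real matcher works on TIR nodes.
    """
    if not isinstance(expr, tuple) or expr[0] not in ("min", "max"):
        return None
    # Mirror the validation step: temp must appear exactly once.
    if count_uses(expr, temp) != 1:
        return None
    outer_op = expr[0]
    inner_op = "max" if outer_op == "min" else "min"
    # Try both operand orders at the outer level.
    for inner, outer_bound in ((expr[1], expr[2]), (expr[2], expr[1])):
        if not isinstance(inner, tuple) or inner[0] != inner_op:
            continue
        # Try both operand orders at the inner level.
        for operand, inner_bound in ((inner[1], inner[2]), (inner[2], inner[1])):
            if operand == temp:
                if outer_op == "min":
                    # min(max(temp, lower), upper)
                    return (inner_bound, outer_bound)
                # max(min(temp, upper), lower)
                return (outer_bound, inner_bound)
    return None


print(match_clipping(("min", ("max", "temp", 0.0), 6.0)))
print(match_clipping(("max", 0.0, ("min", 6.0, "temp"))))
print(match_clipping(("min", ("max", "temp", "temp"), 6.0)))
```

In this sketch the first two calls return the same `(lower, upper)` pair despite different nesting and operand order, and the third is rejected because temp appears twice.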
This implementation follows the same per-iteration semantics as BiasReLU,
where clipping is applied at each reduction step rather than
post-reduction. This semantic change is documented in the docstring with
a warning about potential numerical differences.
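
To make the numerical difference concrete, the following sketch compares post-reduction clipping with per-iteration clipping on a plain dot product. It is an illustration under stated assumptions rather than the fused TIR block itself: the function names are hypothetical, and the per-iteration update is assumed to be "accumulate one product, then clip", starting from the clipped init value min(max(0, lower), upper).

```python
def clip(x, lower, upper):
    """Clamp x into [lower, upper]."""
    return min(max(x, lower), upper)


def post_reduction_clip(a, b, lower, upper):
    """Unfused semantics: finish the reduction, then clip once."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y
    return clip(acc, lower, upper)


def per_iteration_clip(a, b, lower, upper):
    """Assumed fused semantics: clip the init value and every partial sum."""
    acc = clip(0.0, lower, upper)
    for x, y in zip(a, b):
        acc = clip(acc + x * y, lower, upper)
    return acc


a = [1.0, 1.0]
b = [10.0, -8.0]
print(post_reduction_clip(a, b, 0.0, 6.0))  # 2.0: the full sum is 2, inside [0, 6]
print(per_iteration_clip(a, b, 0.0, 6.0))   # 0.0: 10 clips to 6, then 6 - 8 clips to 0
```

The two agree whenever every partial sum stays inside [lower, upper], for example with b = [1.0, 2.0]. They can diverge as soon as an intermediate value is clipped.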
The test suite verifies that all commutative forms of clipping patterns
are correctly recognized and that the fused implementation produces
numerically identical results to the per-iteration reference
implementation.
1 parent dc296e6 commit d48cd25
File tree
6 files changed, +431 -19 lines changed
- 3rdparty
- ffi/3rdparty
- python/tvm/tir/schedule
- src/tir/schedule/primitive
- tests/python/tir-schedule
- .github/workflows/torch_c_dlpack.yml (-140)
- .github/workflows/torch_c_dlpack_windows.yml (-80)
- CMakeLists.txt (+1 -2)
- addons/torch_c_dlpack_ext/README.md (-29)
- addons/torch_c_dlpack_ext/build_aot_wheels.bat (-109)
- addons/torch_c_dlpack_ext/build_aot_wheels.sh (-135)
- addons/torch_c_dlpack_ext/build_backend.py (+3 -13)
- addons/torch_c_dlpack_ext/pyproject.toml (+2 -2)
- addons/torch_c_dlpack_ext/torch_c_dlpack_ext/__init__.py (+1 -1)
- docs/conf.py (+1 -1)
- docs/index.rst (-7)
- docs/reference/python/index.rst (+2 -3)
- include/tvm/ffi/c_api.h (+4 -4)
- include/tvm/ffi/container/container_details.h (+6 -6)
- include/tvm/ffi/container/tuple.h (+1 -71)
- include/tvm/ffi/error.h (+6 -14)
- include/tvm/ffi/function_details.h (+10 -27)
- pyproject.toml (+2 -2)
- python/tvm_ffi/__init__.py (+14)
- python/tvm_ffi/_convert.py (+2 -21)
- python/tvm_ffi/_dtype.py (+6 -8)
- python/tvm_ffi/_ffi_api.py (-1)
- python/tvm_ffi/_optional_torch_c_dlpack.py (+11 -22)
- python/tvm_ffi/access_path.py (+1 -1)
- python/tvm_ffi/container.py (+1 -1)
- python/tvm_ffi/core.pyi (+2 -5)
- python/tvm_ffi/cpp/__init__.py (+1 -8)
- python/tvm_ffi/cpp/load_inline.py (+64 -421)
- python/tvm_ffi/cython/base.pxi (+1 -7)
- python/tvm_ffi/cython/core.pyx (+2 -14)
- python/tvm_ffi/cython/device.pxi (+2)
- python/tvm_ffi/cython/dtype.pxi (+30 -33)
- python/tvm_ffi/cython/error.pxi (+3)
- python/tvm_ffi/cython/function.pxi (+22 -110)
- python/tvm_ffi/cython/object.pxi (+4 -21)
- python/tvm_ffi/cython/string.pxi (+7)
- python/tvm_ffi/cython/tensor.pxi (+27 -45)
- python/tvm_ffi/error.py (-1)
- python/tvm_ffi/registry.py (+6 -22)
- python/tvm_ffi/utils/_build_optional_torch_c_dlpack.py (+24 -75)
- python/tvm_ffi/utils/lockfile.py (+6 -35)
- src/ffi/dtype.cc (+7 -6)
- src/ffi/extra/env_context.cc (+1 -3)
- src/ffi/object.cc (+3 -25)
- tests/cpp/extra/test_c_env_api.cc (+4 -10)
- tests/cpp/test_error.cc (+15 -17)
- tests/cpp/test_function.cc (-16)
- tests/cpp/test_tuple.cc (-67)
- tests/python/test_build.cc (-39)
- tests/python/test_build.h (-25)
- tests/python/test_build.py (-41)
- tests/python/test_dtype.py (-15)
- tests/python/test_function.py (-39)
- tests/python/test_metadata.py (+1 -5)
- tests/python/test_object.py (+1 -88)
- tests/python/test_optional_torch_c_dlpack.py (+1 -8)
- tests/python/test_stream.py (-19)
- tests/python/test_tensor.py (-21)
- tests/python/utils/filelock_worker.py (-49)
- tests/python/utils/test_filelock.py (-126)
- tests/scripts/benchmark_dlpack.py (+11 -24)