Commit fbe1530
Bug fix and optimisation for persistent reduction kernel tuning (#2596)
The original PR (#2417) had incorrect indentation. This update makes
autotune always add the tiny configs; otherwise only the hinted configs
are used.
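One reading of the behaviour described above, as a minimal sketch. This is not the code from the patch: `select_configs`, `hinted_configs`, `tiny_configs` and `autotune_enabled` are hypothetical names used only for illustration, and the assumption is that "otherwise" refers to autotuning being disabled.

```python
def select_configs(hinted_configs, tiny_configs, autotune_enabled):
    """Hypothetical helper illustrating the intended config selection.

    Assumption: with autotuning on, the tiny configs are always appended
    to the hinted ones; with autotuning off, only the hinted configs are
    returned.
    """
    configs = list(hinted_configs)
    if autotune_enabled:
        # This append must sit inside the autotune branch. Placing it at
        # the wrong indentation level changes when the tiny configs are
        # added, which is the kind of mistake an indentation fix corrects.
        configs.extend(tiny_configs)
    return configs


# Usage with placeholder config names (not real Triton configs).
with_autotune = select_configs(["hinted"], ["tiny_a", "tiny_b"], True)
without_autotune = select_configs(["hinted"], ["tiny_a", "tiny_b"], False)
```

With autotuning enabled the result contains the hinted config followed by both tiny configs; with it disabled the result contains the hinted config only.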
Tested locally on test_torchinductor:
Ran 894 tests in 952.242s
FAILED (failures=1, skipped=28)
Autotune runs also completed for the microbenchmark models:
Microbenchmark for network : resnet152
Num devices: 1
Dtype: FP32
Mini batch size [img] : 64
Time per mini-batch : 0.09107530117034912
Throughput [img/sec] : 702.7152167226226
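As a sanity check on the figures above, the reported throughput appears to be the mini-batch size divided by the time per mini-batch. The short sketch below recomputes it; the variable names are illustrative and not taken from the benchmark script.

```python
# Values copied from the microbenchmark output above.
batch_size = 64                        # Mini batch size [img]
time_per_batch = 0.09107530117034912   # Time per mini-batch, assumed to be in seconds
reported_throughput = 702.7152167226226

# Throughput in images per second, assuming throughput = batch size / time.
throughput = batch_size / time_per_batch

# Difference between the recomputed and the reported value.
difference = abs(throughput - reported_throughput)
print(f"recomputed throughput: {throughput:.4f} img/sec")
```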
(cherry picked from commit db3ba66)
1 parent: 345bdb5
1 file changed, 14 additions and 14 deletions.
Diff: original lines 2912-2925 are removed and replaced by new lines 2912-2925; lines 2909-2911 and 2926-2928 are unchanged context.