Commit dab8130
[vec128] Fix fmsub NEON definition (pytorch#153093)
[vec128] Fix fmsub NEON definition (pytorch#152075)
As reported in pytorch#149292, according to the manual, `vfmsq_f32` implements `c - a * b` rather than `a * b - c`, so its call must be wrapped in `vnegq_f32`.
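The sign issue can be illustrated with a minimal sketch. This is not the code from the patch: `fms_model`, `fmsub_model`, `fmsub_lane0`, and `check_fmsub` are hypothetical names introduced here. The NEON path assumes an aarch64 target, and other targets fall back to the scalar model.

```cpp
#include <cassert>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Scalar model of what the fused multiply-subtract yields for
// fmsub's operands (a, b, c): c - a * b.
inline float fms_model(float a, float b, float c) {
  return c - a * b;
}

// Desired fmsub semantics: a * b - c, i.e. the negation of the above.
inline float fmsub_model(float a, float b, float c) {
  return -fms_model(a, b, c);
}

// Computes a * b - c and returns lane 0 of the result.
inline float fmsub_lane0(float a, float b, float c) {
#if defined(__aarch64__)
  // vfmsq_f32(acc, x, y) computes acc - x * y per lane, so the
  // result is negated to obtain x * y - acc.
  float32x4_t r = vnegq_f32(
      vfmsq_f32(vdupq_n_f32(c), vdupq_n_f32(a), vdupq_n_f32(b)));
  return vgetq_lane_f32(r, 0);
#else
  // Fallback for targets without NEON (assumption: scalar model only).
  return fmsub_model(a, b, c);
#endif
}

// Small self-check with exactly representable values.
inline void check_fmsub() {
  assert(fms_model(2.0f, 3.0f, 1.0f) == -5.0f);   // 1 - 2 * 3
  assert(fmsub_model(2.0f, 3.0f, 1.0f) == 5.0f);  // 2 * 3 - 1
  assert(fmsub_lane0(2.0f, 3.0f, 1.0f) == 5.0f);
}
```

Without the negation, the intrinsic alone would return `-5` for these inputs instead of `5`.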
Also, adjust the tests to use OpMath for the FMA computation, to avoid accumulating accuracy errors from non-fused multiply-and-add over lower-precision dtypes.
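The sketch below shows why a reference computed in a wider type matches fused behaviour more closely. It is not the test code from the patch. `float` stands in for a lower-precision dtype and `double` for its wider math type, and `WideMath`, `fmsub_unfused`, `fmsub_wide`, and the `k*` constants are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for an OpMath-style mapping: the wider type
// used for intermediate math on a given element type.
template <typename T> struct WideMath { using type = T; };
template <> struct WideMath<float> { using type = double; };

// Non-fused a * b - c: the product is rounded to T before subtracting.
template <typename T>
T fmsub_unfused(T a, T b, T c) {
  volatile T prod = a * b;  // volatile forces rounding to T, blocks contraction
  return prod - c;
}

// Reference computed in the wider type and rounded to T only once.
template <typename T>
T fmsub_wide(T a, T b, T c) {
  using W = typename WideMath<T>::type;
  return static_cast<T>(static_cast<W>(a) * static_cast<W>(b) -
                        static_cast<W>(c));
}

// Inputs chosen so that the float product needs one extra bit:
// kA * kA = 1 + 2^-11 + 2^-24, which rounds to 1 + 2^-11 in float.
constexpr float kA = 1.000244140625f;             // 1 + 2^-12
constexpr float kC = 1.00048828125f;              // 1 + 2^-11
constexpr float kExact = 5.9604644775390625e-08f; // 2^-24

inline void check_wide_reference() {
  // Rounding the product first loses the 2^-24 term entirely.
  assert(fmsub_unfused(kA, kA, kC) == 0.0f);
  // The wider reference keeps it, matching a true fused operation.
  assert(fmsub_wide(kA, kA, kC) == kExact);
  assert(std::fma(kA, kA, -kC) == kExact);
}
```

A test that compares a fused vector result against `fmsub_unfused` would report a mismatch here even though the fused result is the more accurate one. Comparing against the wider reference avoids that.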
Note that `Vectorized::fmsub` is not currently instantiated anywhere, so the incorrect definition had no observable effect.
TODO:
- Enable C++ testing on macOS and/or aarch64 platforms (right now Mac tests are built without C++ tests)
Fixes pytorch#149292
Pull Request resolved: pytorch#152075
Approved by: https://github.com/swolchok
ghstack dependencies: pytorch#151955
(cherry picked from commit 2ea8653)
Co-authored-by: Nikita Shulga <[email protected]>
1 parent 20d62a8 commit dab8130
File tree
3 files changed, +18 -6 lines changed
- aten/src/ATen
  - cpu/vec/vec128
  - test

| Diff | Original lines | New lines | Change |
|---|---|---|---|
| 1 | 543 | 543 | 1 line replaced |
| 2 | 585 | 585 | 1 line replaced |
| 3 | (none) | 67-76 | 10 lines added |
| 3 | 1282-1283 | 1292-1294 | 2 lines replaced by 3 |
| 3 | 1289-1290 | 1300-1302 | 2 lines replaced by 3 |