Commit 5ee9787
[libc][math] Improve the performance of sqrtf128. (llvm#122578)
Use a combination of polynomial approximation and Newton-Raphson
iterations in 64-bit and 128-bit integers to improve the performance of
sqrtf128. The correct rounding is provided by squaring the result and
comparing it with the argument.
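The approach described above can be illustrated with a much smaller sketch. This is not the code from this patch: it works on 64-bit integers only, takes its initial guess from the bit length rather than from a polynomial approximation, and the names `isqrt_floor` and `isqrt_nearest` are hypothetical. It only shows the general shape: Newton-Raphson iterations in integer arithmetic, followed by a rounding decision made by squaring the candidate and comparing it with the argument.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the patch: floor(sqrt(x)) for 64-bit x.
inline uint64_t isqrt_floor(uint64_t x) {
  if (x < 2)
    return x;
  // Initial guess: a power of two that is >= sqrt(x), from the bit length.
  // (The patch uses a polynomial approximation instead; this is a stand-in.)
  int bits = 0;
  for (uint64_t t = x; t != 0; t >>= 1)
    ++bits;
  uint64_t r = uint64_t(1) << ((bits + 1) / 2);
  // Newton-Raphson step for r^2 = x in integers: r <- (r + x / r) / 2.
  // Starting from above, r decreases until it reaches floor(sqrt(x)).
  for (;;) {
    uint64_t next = (r + x / r) / 2;
    if (next >= r)
      break;
    r = next;
  }
  // Defensive final check, equivalent to comparing r^2 and (r + 1)^2 with x
  // but written with divisions to avoid overflow. With this starting guess
  // the loop above already lands on the floor, so these normally do not run.
  while (r > x / r)
    --r;
  while (r + 1 <= x / (r + 1))
    ++r;
  return r;
}

// Hypothetical helper: sqrt(x) rounded to the nearest integer.
// Round up exactly when x > (r + 1/2)^2, i.e. when x - r^2 > r.
inline uint64_t isqrt_nearest(uint64_t x) {
  uint64_t r = isqrt_floor(x);
  assert(r <= 0xFFFFFFFFULL); // so r * r cannot overflow 64 bits
  uint64_t rem = x - r * r;   // remainder after squaring the candidate
  return rem > r ? r + 1 : r;
}
```

For example, `isqrt_floor(15)` gives 3 and `isqrt_nearest(15)` gives 4, because 15 - 9 = 6 exceeds 3. The same "square and compare" idea decides the last bit of a floating-point result, where the comparison is done on the integer significand.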
Performance improvement measured with the newly added perf test, where:
- My function = the improved implementation from this PR
- Other function = the current implementation using `libc/src/__support/FPUtil/generic/sqrt.h`
```
Performance tests with inputs in denormal range:
-- My function --
Total time : 1260765265 ns
Average runtime : 125.951 ns/op
Ops per second : 7939623 op/s
-- Other function --
Total time : 7160726518 ns
Average runtime : 715.357 ns/op
Ops per second : 1397902 op/s
-- Average runtime ratio --
Mine / Other's : 0.176067
Performance tests with inputs in normal range:
-- My function --
Total time : 373003808 ns
Average runtime : 37.2631 ns/op
Ops per second : 26836189 op/s
-- Other function --
Total time : 7353398916 ns
Average runtime : 734.605 ns/op
Ops per second : 1361275 op/s
-- Average runtime ratio --
Mine / Other's : 0.0507254
```
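The reported ratios are the average runtime of the new implementation divided by that of the old one, so smaller is better. The following sketch recomputes them from the ns/op figures in the output above and also gives the inverse, the speedup factor. The `Timing` struct and the function names are illustrative only and are not part of the perf test.

```cpp
#include <cassert>

// Average runtimes in ns/op, copied from the benchmark output above.
struct Timing {
  double mine_ns;  // "My function"
  double other_ns; // "Other function"
};

const Timing kDenormal = {125.951, 715.357};
const Timing kNormal = {37.2631, 734.605};

// Ratio as printed by the perf test: Mine / Other's.
inline double runtime_ratio(Timing t) {
  assert(t.other_ns > 0.0);
  return t.mine_ns / t.other_ns;
}

// Inverse of the ratio: how many times faster the new code is.
inline double speedup(Timing t) {
  assert(t.mine_ns > 0.0);
  return t.other_ns / t.mine_ns;
}
```

With these inputs the ratios come out at about 0.176 and 0.0507, matching the output, which corresponds to a speedup of roughly 5.7x for denormal inputs and roughly 19.7x for normal inputs.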
---------
Co-authored-by: Alexei Sibidanov <[email protected]>
1 parent 4ec1990 commit 5ee9787
File tree
17 files changed, +648 -38 lines changed
- libc
  - src
    - __support
    - math/generic
  - test
    - UnitTest
    - src/math
      - performance_testing
      - smoke