Commit 545c37f
perf: optimise right for byte access and StringView (#20069)
## Which issue does this PR close?
- Closes #20068.
## Rationale for this change
Similar to issue #19749 and the optimisation of `left` in #19980, it's
worth applying the same optimisation to `right`.
## What changes are included in this PR?
- Improve the efficiency of the function by making fewer memory allocations
and slicing the underlying bytes directly at char boundaries
- Provide a specialisation for StringView that reuses the existing buffers (zero-copy)
- Use `arrow_array::buffer::make_view` for low-level view manipulation
(we still need to know the magic constant 12 from the buffer layout: the
maximum length in bytes of a string stored inline in a view)
- Benchmark: run time reduced by roughly 64% to 89%, depending on the case
```
right size=1024/string_array positive n/1024
time: [24.286 µs 24.658 µs 25.087 µs]
change: [−86.881% −86.662% −86.424%] (p = 0.00 < 0.05)
Performance has improved.
right size=1024/string_array negative n/1024
time: [29.996 µs 30.737 µs 31.511 µs]
change: [−89.442% −89.229% −89.003%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
right size=4096/string_array positive n/4096
time: [105.58 µs 109.39 µs 113.51 µs]
change: [−86.119% −85.788% −85.497%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
right size=4096/string_array negative n/4096
time: [136.48 µs 138.34 µs 140.36 µs]
change: [−88.007% −87.848% −87.692%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
right size=1024/string_view_array positive n/1024
time: [25.054 µs 25.500 µs 26.033 µs]
change: [−82.569% −82.285% −81.891%] (p = 0.00 < 0.05)
Performance has improved.
right size=1024/string_view_array negative n/1024
time: [41.281 µs 42.730 µs 44.432 µs]
change: [−73.832% −73.288% −72.716%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
right size=4096/string_view_array positive n/4096
time: [129.38 µs 133.69 µs 137.61 µs]
change: [−79.497% −78.998% −78.581%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
right size=4096/string_view_array negative n/4096
time: [218.16 µs 229.41 µs 243.30 µs]
change: [−65.405% −63.622% −61.515%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) high mild
7 (7.00%) high severe
```
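The first two bullets of the change list can be illustrated with a minimal, std-only sketch. It is not the code from this PR: `right_start`, `sketch_view` and `right_view` are hypothetical names, and `sketch_view` only mimics the Arrow view layout (a 16-byte value with the length in the first 4 bytes, followed either by up to 12 inline bytes or by a 4-byte prefix, buffer index and offset). It assumes the usual SQL semantics of `right`: positive `n` keeps the last `n` characters, negative `n` drops the first `|n|` characters. For simplicity it also assumes the string bytes always live in `buffer`, which is not true of inline views in a real StringView array.

```rust
use std::cmp::Ordering;

/// Byte offset where `right(s, n)` starts (hypothetical helper).
/// No intermediate String is allocated; we only walk char boundaries.
fn right_start(s: &str, n: i64) -> usize {
    match n.cmp(&0) {
        // n == 0: empty result, so start at the end of the string.
        Ordering::Equal => s.len(),
        // n > 0: the n-th char boundary counted from the back,
        // or the whole string if it has fewer than n chars.
        Ordering::Greater => {
            let n = usize::try_from(n).unwrap_or(usize::MAX);
            s.char_indices().rev().nth(n - 1).map_or(0, |(i, _)| i)
        }
        // n < 0: skip the first |n| chars, or everything if too short.
        Ordering::Less => {
            let skip = usize::try_from(n.unsigned_abs()).unwrap_or(usize::MAX);
            s.char_indices().nth(skip).map_or(s.len(), |(i, _)| i)
        }
    }
}

/// Hypothetical stand-in for a view constructor, following the Arrow
/// StringView layout. Values of at most 12 bytes are stored inline.
fn sketch_view(buffer: &[u8], buffer_index: u32, offset: u32, len: u32) -> u128 {
    let data = &buffer[offset as usize..(offset + len) as usize];
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&len.to_le_bytes());
    if len <= 12 {
        // Short value: the bytes themselves go into the view.
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        // Long value: 4-byte prefix plus a pointer into the shared buffer.
        bytes[4..8].copy_from_slice(&data[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// `right` on a single view: only the offset and length change, the
/// data buffer itself is left untouched and can be shared.
fn right_view(buffer: &[u8], buffer_index: u32, offset: u32, len: u32, n: i64) -> u128 {
    let s = std::str::from_utf8(&buffer[offset as usize..(offset + len) as usize])
        .expect("view must reference valid UTF-8");
    let start = right_start(s, n) as u32;
    sketch_view(buffer, buffer_index, offset + start, len - start)
}
```

With these helpers, `&s[right_start(s, n)..]` is the result for a plain string slice, while `right_view` produces a new view over the same buffer. A result that shrinks to 12 bytes or fewer switches from the pointer form to the inline form, which is why the constant 12 still has to be known.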
## Are these changes tested?
- Existing unit tests for `right`
- Added more unit tests
- Added a bench for `right`, similar to the existing one for `left`
- Existing SLTs pass
## Are there any user-facing changes?
No
1 parent 1a0c2e0, commit 545c37f
File tree: 4 files changed, +354 -48 lines changed
- datafusion/functions
  - benches
  - src/unicode