perf: add fast path for uniform fill values in array_resize#20617
perf: add fast path for uniform fill values in array_resize#20617lyne7-sc wants to merge 7 commits intoapache:mainfrom
array_resize#20617Conversation
neilconway
left a comment
There was a problem hiding this comment.
Nice work!
Can you check that the SLT tests cover the new fast-path?
|
|
||
| // Fast path: at least one row needs to grow and all rows share | ||
| // the same fill value. | ||
| if is_uniform_fill { |
There was a problem hiding this comment.
The naming is a bit confusing: is_uniform_fill is false if the fill value is uniform but there are no rows that need to grow. How about use_batch_fill or use_bulk_fill or similar?
There was a problem hiding this comment.
I think use_bulk_fill is pretty clear, I renamed it to that, thanks
| if extra > max_extra { | ||
| max_extra = extra; | ||
| } |
There was a problem hiding this comment.
| if extra > max_extra { | |
| max_extra = extra; | |
| } | |
| max_extra = max_extra.max(extra); |
| if start + count > offset_window[1] { | ||
| let extra_count = (start + count - offset_window[1]).to_usize().unwrap(); | ||
| let end = offset_window[1]; | ||
| mutable.extend(0, (start).to_usize().unwrap(), (end).to_usize().unwrap()); |
There was a problem hiding this comment.
(Minor) Why (start)? Parens unnecessary, here and below.
|
|
||
| let mut null_builder = NullBufferBuilder::new(array.len()); | ||
|
|
||
| for (row_index, offset_window) in array.offsets().windows(2).enumerate() { |
There was a problem hiding this comment.
The fast path and slow path are doing very similar work and there's a bunch of duplicate code. I wonder if it would be possible to refactor them -- the key difference is just how the fill itself is done, no?
There was a problem hiding this comment.
yes, the main difference is the fill behavior, and the surrounding control flow is very similar. I’ll take a look to see what can be cleanly refactored there.
There was a problem hiding this comment.
I refactored it to pull the shared loop out of the fast and slow paths.
| let default_element = fill_scalar.to_array_of_size(max_extra)?; | ||
| let default_value_data = default_element.to_data(); | ||
|
|
||
| let capacity = Capacities::Array(original_data.len() + default_value_data.len()); |
There was a problem hiding this comment.
Is this right? default_value_data.len() is the per-row growth, I think we want the total estimated output size.
There was a problem hiding this comment.
thanks for the reminder. I added capacity calculation in the pre-scan and use for the allocation.
|
Thanks for the review @neilconway The existing SLTs already hit the fast path, but I added another array test case to cover the multi-row case more explicitly. |
Which issue does this PR close?
Rationale for this change
array_resizecurrently does extra per-row work when resizing list arrays. This pr optimizes the common fast path where the fill value is uniform.What changes are included in this PR?
array_resizefor uniform fill values.Benchmarks
Are these changes tested?
Yes
Are there any user-facing changes?
No