perf: add fast path for uniform fill values in array_resize #20617

Open
lyne7-sc wants to merge 7 commits into apache:main from lyne7-sc:perf/array_resize

@lyne7-sc
Contributor

Which issue does this PR close?

  • Closes #.

Rationale for this change

array_resize currently does extra per-row work when resizing list arrays. This PR adds a fast path for the common case where the fill value is uniform.

What changes are included in this PR?

  • Add a fast path in array_resize for uniform fill values.
  • Precompute the maximum required growth and reuse a single fill buffer in the uniform-fill path.
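The two changes above can be sketched without Arrow. This is a minimal illustration, not the PR's code: `ListColumn` and `resize_uniform` are hypothetical names, values are plain `i64`, and nulls are ignored. It only shows the idea of pre-scanning for the largest growth and slicing one shared fill buffer per row.

```rust
/// Simplified stand-in for a list array (hypothetical, for illustration only):
/// row i occupies `values[offsets[i]..offsets[i + 1]]`.
struct ListColumn {
    offsets: Vec<usize>,
    values: Vec<i64>,
}

/// Resize every row to `new_len[i]`, padding with one uniform `fill` value.
fn resize_uniform(input: &ListColumn, new_len: &[usize], fill: i64) -> ListColumn {
    // Pre-scan: find the largest growth needed by any single row.
    let mut max_extra = 0usize;
    for (window, &target) in input.offsets.windows(2).zip(new_len) {
        let current = window[1] - window[0];
        max_extra = max_extra.max(target.saturating_sub(current));
    }

    // Build the fill buffer once; each growing row appends a prefix of it.
    let fill_buf = vec![fill; max_extra];

    let mut values = Vec::new();
    let mut offsets = vec![0usize];
    for (window, &target) in input.offsets.windows(2).zip(new_len) {
        let (start, end) = (window[0], window[1]);
        let current = end - start;
        if target <= current {
            // Shrink or keep: copy only the first `target` elements.
            values.extend_from_slice(&input.values[start..start + target]);
        } else {
            // Grow: copy the whole row, then a slice of the shared fill buffer.
            values.extend_from_slice(&input.values[start..end]);
            values.extend_from_slice(&fill_buf[..target - current]);
        }
        offsets.push(values.len());
    }
    ListColumn { offsets, values }
}
```

For example, resizing rows `[1, 2]` and `[3, 4, 5]` to lengths 4 and 1 with fill `0` yields the rows `[1, 2, 0, 0]` and `[3]`.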

Benchmarks

group                                                main                                   optimized
-----                                                ----                                   ---------
array_resize_i64/grow_default_null_fill_10_to_500    11.24     3.2±0.07ms        ? ?/sec    1.00   283.9±20.10µs        ? ?/sec
array_resize_i64/grow_uniform_fill_10_to_500         4.15  1648.0±38.12µs        ? ?/sec    1.00   397.2±20.46µs        ? ?/sec
array_resize_i64/grow_variable_fill_10_to_500        1.00  1667.7±50.31µs        ? ?/sec    1.02  1692.8±61.54µs        ? ?/sec
array_resize_i64/mixed_grow_shrink_1000x_100         4.83    373.0±8.91µs        ? ?/sec    1.00     77.1±5.90µs        ? ?/sec
array_resize_i64/shrink_uniform_fill_500_to_10       1.00      8.1±0.51µs        ? ?/sec    1.06      8.5±0.50µs        ? ?/sec

Are these changes tested?

Yes

Are there any user-facing changes?

No

@github-actions github-actions bot added the functions Changes to functions implementation label Feb 28, 2026
Contributor

@neilconway left a comment

Nice work!

Can you check that the SLT tests cover the new fast-path?


// Fast path: at least one row needs to grow and all rows share
// the same fill value.
if is_uniform_fill {
Contributor

The naming is a bit confusing: is_uniform_fill is false if the fill value is uniform but there are no rows that need to grow. How about use_batch_fill or use_bulk_fill or similar?

Contributor Author

I think use_bulk_fill is pretty clear. I renamed it to that, thanks.
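The naming point can be made concrete with a small sketch. It assumes per-row fill values and lengths are available as slices, which is a simplification; how the PR actually detects uniformity is not shown in this thread, and the function below is hypothetical.

```rust
/// Hypothetical helper: the bulk path applies only when the fill value is
/// the same for every row AND at least one row actually has to grow.
fn use_bulk_fill(current_len: &[usize], new_len: &[usize], fill: &[Option<i64>]) -> bool {
    // Uniform: every adjacent pair of fill values is equal.
    let uniform = fill.windows(2).all(|w| w[0] == w[1]);
    // Growth: some row's target length exceeds its current length.
    let any_growth = current_len.iter().zip(new_len).any(|(cur, new)| new > cur);
    uniform && any_growth
}
```

With a uniform fill but only shrinking rows the result is `false`, which is why a name describing the chosen strategy reads better than one describing the fill value alone.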

Comment on lines +222 to +224
if extra > max_extra {
max_extra = extra;
}
Contributor

Suggested change
- if extra > max_extra {
-     max_extra = extra;
- }
+ max_extra = max_extra.max(extra);

if start + count > offset_window[1] {
let extra_count = (start + count - offset_window[1]).to_usize().unwrap();
let end = offset_window[1];
mutable.extend(0, (start).to_usize().unwrap(), (end).to_usize().unwrap());
Contributor

(Minor) Why (start)? Parens unnecessary, here and below.


let mut null_builder = NullBufferBuilder::new(array.len());

for (row_index, offset_window) in array.offsets().windows(2).enumerate() {
Contributor

The fast path and slow path are doing very similar work and there's a bunch of duplicate code. I wonder if it would be possible to refactor them -- the key difference is just how the fill itself is done, no?

Contributor Author

Yes, the main difference is the fill behavior, and the surrounding control flow is very similar. I’ll take a look to see what can be cleanly refactored there.

Contributor Author

I refactored it to pull the shared loop out of the fast and slow paths.

let default_element = fill_scalar.to_array_of_size(max_extra)?;
let default_value_data = default_element.to_data();

let capacity = Capacities::Array(original_data.len() + default_value_data.len());
Contributor

Is this right? default_value_data.len() is the per-row growth; I think we want the total estimated output size.

Contributor Author

Thanks for the reminder. I added a capacity calculation to the pre-scan and use it for the allocation.
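To illustrate the difference between the two estimates, here is a minimal sketch of a pre-scan that returns both the largest single-row growth and the total output length. The function name and the plain `usize` offsets are assumptions for illustration; the PR's actual pre-scan is not shown in this thread.

```rust
/// Hypothetical pre-scan over list offsets and per-row target lengths.
/// Returns `(max_extra, total_output_len)`.
fn prescan(offsets: &[usize], new_len: &[usize]) -> (usize, usize) {
    let mut max_extra = 0usize;
    let mut total = 0usize;
    for (w, &target) in offsets.windows(2).zip(new_len) {
        let current = w[1] - w[0];
        // Largest growth of any one row: sizes the shared fill buffer.
        max_extra = max_extra.max(target.saturating_sub(current));
        // Sum of target lengths: sizes the output values buffer.
        total += target;
    }
    (max_extra, total)
}
```

For offsets `[0, 2, 5]` and target lengths `[4, 6]`, this gives `max_extra = 3` and `total = 10`. The earlier estimate (original length plus fill-buffer length) would be `5 + 3 = 8`, which undercounts, and the gap widens as more rows grow.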

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 1, 2026

lyne7-sc commented Mar 1, 2026

Thanks for the review @neilconway

The existing SLTs already hit the fast path, but I added another array test case to cover the multi-row case more explicitly.
