Optimize Nullstate / accumulators
#19625
base: main
run benchmark tpch
run benchmarks
🤖
🤖: Benchmark completed
🤖
🤖: Benchmark completed
Looks like it is a nice win.
run benchmarks
🤖
🤖: Benchmark completed
run benchmarks
🤖
Benchmark script failed with exit code 101.
run benchmarks
🤖
🤖: Benchmark completed
run benchmarks
🤖
🤖: Benchmark completed
run benchmarks
🤖
🤖: Benchmark completed
Query 1 is consistently 15%-20% faster with this change.
run benchmark tpch
🤖
🤖: Benchmark completed
fb8f6ac to 05414e8
alamb left a comment
this is looking pretty cool
|group_index| {
    self.counts[group_index] += 1;
    // SAFETY: group_index is guaranteed to be in bounds
    let count = unsafe { self.counts.get_unchecked_mut(group_index) };
I tried this in one of my PRs too, trying to make the inner loop faster, but couldn't get a measurable difference.
Maybe we need to add a micro benchmark or something.
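A micro benchmark along those lines could start from something like the sketch below. It is only an illustration using the standard library: the function names `count_checked`, `count_unchecked`, and `time_it` are hypothetical and not part of this PR, and a real benchmark in this project would more likely use a dedicated harness than hand-rolled timing.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Bounds-checked increment, as in the original inner loop.
fn count_checked(counts: &mut [i64], group_indices: &[usize]) {
    for &group_index in group_indices {
        counts[group_index] += 1;
    }
}

/// Unchecked increment. The indices are validated once up front so the
/// unchecked access is sound; note this extra pass is part of what gets timed.
fn count_unchecked(counts: &mut [i64], group_indices: &[usize]) {
    assert!(group_indices.iter().all(|&g| g < counts.len()));
    for &group_index in group_indices {
        // SAFETY: every index was checked against counts.len() above.
        let count = unsafe { counts.get_unchecked_mut(group_index) };
        *count += 1;
    }
}

/// Hypothetical timing helper: runs `f` `iters` times and returns the elapsed time.
fn time_it(
    f: fn(&mut [i64], &[usize]),
    num_groups: usize,
    group_indices: &[usize],
    iters: u32,
) -> Duration {
    let mut counts = vec![0i64; num_groups];
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from eliding the work being measured.
        f(black_box(counts.as_mut_slice()), black_box(group_indices));
    }
    black_box(&counts);
    start.elapsed()
}
```

Comparing `time_it(count_checked, ...)` against `time_it(count_unchecked, ...)` over the same indices, in a release build, gives a rough first signal; the numbers are noisy and should be treated as indicative only.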
/// If `seen_values[i]` is false, have not seen any values that
/// pass the filter yet for group `i`
seen_values: BooleanBufferBuilder,
/// If true, all groups seen so far have seen at least one non-null value
Maybe we could encode this as a state enum so it is clearer how things are related to BooleanBufferBuilder
Something like
enum SeenValues {
    /// All groups seen so far have seen at least one non-null value
    All {
        num_values: usize,
    },
    /// Some groups have not yet seen a non-null value
    Some {
        values: BooleanBufferBuilder,
    },
}
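To make the suggested state machine concrete, here is a minimal, dependency-free sketch of how such an enum might behave. It is an assumption-laden illustration, not the code in this PR: `Vec<bool>` stands in for `BooleanBufferBuilder`, and the methods `observe`, `materialize`, and `is_seen` are hypothetical names. The fast path also assumes that every new group index appears in the current batch, so a batch without nulls marks all groups as seen.

```rust
/// Tracks which groups have seen at least one non-null value.
/// `Vec<bool>` is a stand-in for arrow's `BooleanBufferBuilder` (assumption).
#[derive(Debug)]
enum SeenValues {
    /// All groups seen so far have seen at least one non-null value.
    All { num_values: usize },
    /// Some groups have not yet seen a non-null value.
    Some { values: Vec<bool> },
}

impl SeenValues {
    fn new() -> Self {
        SeenValues::All { num_values: 0 }
    }

    /// Records one batch. `validity` is `None` when the batch has no nulls.
    fn observe(
        &mut self,
        group_indices: &[usize],
        validity: Option<&[bool]>,
        total_num_groups: usize,
    ) {
        if validity.is_none() {
            if let SeenValues::All { num_values } = self {
                // Fast path: no per-group bitmap is needed at all.
                *num_values = total_num_groups;
                return;
            }
        }
        // Slow path: fall back to an explicit per-group flag.
        let values = self.materialize(total_num_groups);
        for (row, &group) in group_indices.iter().enumerate() {
            if validity.map_or(true, |v| v[row]) {
                values[group] = true;
            }
        }
    }

    /// Switches to the `Some` state (if needed) and grows it to `total_num_groups`.
    fn materialize(&mut self, total_num_groups: usize) -> &mut Vec<bool> {
        if let SeenValues::All { num_values } = *self {
            *self = SeenValues::Some { values: vec![true; num_values] };
        }
        match self {
            SeenValues::Some { values } => {
                if values.len() < total_num_groups {
                    values.resize(total_num_groups, false);
                }
                values
            }
            SeenValues::All { .. } => unreachable!("converted to Some above"),
        }
    }

    fn is_seen(&self, group: usize) -> bool {
        match self {
            SeenValues::All { num_values } => group < *num_values,
            SeenValues::Some { values } => values.get(group).copied().unwrap_or(false),
        }
    }
}
```

The appeal of this shape is that the common all-non-null case only carries a counter, and the bitmap is allocated lazily the first time a null or filtered row shows up.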
Which issue does this PR close?
Rationale for this change
Speed up accumulator code (sum, avg, count) by specializing on the non-null case.
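As a rough illustration of what specializing on the non-null case means, the sketch below splits a grouped sum into two loops. The function `accumulate_sum` and its signature are hypothetical and simplified (plain slices instead of Arrow arrays); it is not the implementation in this PR.

```rust
/// Adds `values` into per-group `sums`.
/// `validity` is `None` when the batch contains no nulls (assumed convention).
fn accumulate_sum(
    sums: &mut [i64],
    group_indices: &[usize],
    values: &[i64],
    validity: Option<&[bool]>,
) {
    match validity {
        // Non-null fast path: no per-row validity check in the inner loop.
        None => {
            for (&group, &value) in group_indices.iter().zip(values) {
                sums[group] += value;
            }
        }
        // General path: skip rows whose value is null.
        Some(valid) => {
            for ((&group, &value), &is_valid) in
                group_indices.iter().zip(values).zip(valid)
            {
                if is_valid {
                    sums[group] += value;
                }
            }
        }
    }
}
```

The branch on `validity` is taken once per batch rather than once per row, which is what lets the fast-path loop stay tight.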
What changes are included in this PR?
Nullstate to non-null values.
Are these changes tested?
Are there any user-facing changes?