Closed
Labels
bug: Something isn't working
Description
Describe the bug
When computing partition_statistics during evaluation, the flamegraph shows a lot of time spent in bounds_check(), which is called as part of Column::data_type().

Almost all of the time in bounds_check() is in turn spent in fmt(), which suggests that the call takes the error branch:
    impl Column {
        fn bounds_check(&self, input_schema: &Schema) -> Result<()> {
            if self.index < input_schema.fields.len() {
                Ok(())
            } else {
                internal_err!(
                    "PhysicalExpr Column references column '{}' at index {} (zero-based) but input schema only has {} columns: {:?}",
                    self.name,
                    self.index,
                    input_schema.fields.len(),
                    input_schema.fields().iter().map(|f| f.name()).collect::<Vec<_>>()
                )
            }
        }
    }
All occurrences that I checked by hand in my example originated from ProjectionExec::partition_statistics().
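To illustrate why fmt() showing up under bounds_check() points at the error branch, here is a minimal standalone sketch. It does not use DataFusion or Arrow: `Field`, `Schema`, and `Column` are simplified stand-ins, `schema_with_columns` is a hypothetical helper, and the error type is a plain `String` instead of the result of `internal_err!`. The only point is that the success path does a single comparison, while the failure path collects every column name and formats a message.

```rust
// Simplified stand-ins for illustration only; these are NOT the
// DataFusion/Arrow types referenced in the issue.
struct Field {
    name: String,
}

struct Schema {
    fields: Vec<Field>,
}

struct Column {
    name: String,
    index: usize,
}

impl Column {
    /// Same shape as the check quoted above: a cheap comparison on success,
    /// string formatting (including all column names) on failure.
    fn bounds_check(&self, input_schema: &Schema) -> Result<(), String> {
        if self.index < input_schema.fields.len() {
            // Fast path: no allocation, no formatting.
            Ok(())
        } else {
            // Slow path: collects every field name and formats the message.
            let names: Vec<&str> = input_schema
                .fields
                .iter()
                .map(|f| f.name.as_str())
                .collect();
            Err(format!(
                "Column '{}' at index {} (zero-based) but input schema only has {} columns: {:?}",
                self.name,
                self.index,
                input_schema.fields.len(),
                names
            ))
        }
    }
}

/// Hypothetical helper: builds a schema with `n` columns named c0, c1, ...
fn schema_with_columns(n: usize) -> Schema {
    Schema {
        fields: (0..n).map(|i| Field { name: format!("c{i}") }).collect(),
    }
}
```

In this sketch, calling `bounds_check` with an index inside the schema returns `Ok(())` without touching the formatter, while an out-of-range index builds the full message. If the real code behaves the same way, time spent in fmt() would only be expected when the column index does not fit the schema passed to data_type().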
To Reproduce
Run with RUST_BACKTRACE enabled.
Expected behavior
The data_type() method should not cause bounds_check() to take the error path for the column.
Additional context
No response
alamb