Skip to content

Commit 9a9e4dd

Browse files
rgehanalamb
andauthored
Add recursive protection on planner's create_physical_expr (#19299)
## Which issue does this PR close? None, as this PR is rather self-describing. ## Rationale for this change We recently encountered stack overflow issues in the physical planner's `create_physical_expr` function, when dealing with deeply nested expression trees. In the rest of the codebase, it seems the `recursive::recursive` attribute is used to prevent this. ## What changes are included in this PR? This PR adds the `recursive::recursive` attribute on `create_physical_expr` whenever the `recursive_protection` feature is enabled. ## Are these changes tested? Yes, but mostly for the sake of manually testing the change. I don't think it should be merged, as it has to be `#[ignore]`d anyway, as the `recursive_protection` is not enabled in tests (which means it would systematically crash). I've left it in this PR however, to make review / manual testing easier. ## Are there any user-facing changes? None. --------- Co-authored-by: Andrew Lamb <[email protected]>
1 parent 9fe9ec7 commit 9a9e4dd

File tree

4 files changed

+38
-1
lines changed

4 files changed

+38
-1
lines changed

Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

datafusion/core/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,7 @@ recursive_protection = [
8787
"datafusion-expr/recursive_protection",
8888
"datafusion-optimizer/recursive_protection",
8989
"datafusion-physical-optimizer/recursive_protection",
90+
"datafusion-physical-expr/recursive_protection",
9091
"datafusion-sql/recursive_protection",
9192
"sqlparser/recursive-protection",
9293
]

datafusion/physical-expr/Cargo.toml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -40,6 +40,9 @@ workspace = true
4040
[lib]
4141
name = "datafusion_physical_expr"
4242

43+
[features]
44+
recursive_protection = ["dep:recursive"]
45+
4346
[dependencies]
4447
ahash = { workspace = true }
4548
arrow = { workspace = true }
@@ -54,6 +57,7 @@ itertools = { workspace = true, features = ["use_std"] }
5457
parking_lot = { workspace = true }
5558
paste = { workspace = true }
5659
petgraph = "0.8.3"
60+
recursive = { workspace = true, optional = true }
5761
tokio = { workspace = true }
5862
half = { workspace = true }
5963

datafusion/physical-expr/src/planner.rs

Lines changed: 32 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -105,6 +105,7 @@ use datafusion_expr::{
105105
/// * `e` - The logical expression
106106
/// * `input_dfschema` - The DataFusion schema for the input, used to resolve `Column` references
107107
/// to qualified or unqualified fields by name.
108+
#[cfg_attr(feature = "recursive_protection", recursive::recursive)]
108109
pub fn create_physical_expr(
109110
e: &Expr,
110111
input_dfschema: &DFSchema,
@@ -417,7 +418,7 @@ mod tests {
417418
use arrow::array::{ArrayRef, BooleanArray, RecordBatch, StringArray};
418419
use arrow::datatypes::{DataType, Field};
419420

420-
use datafusion_expr::{col, lit};
421+
use datafusion_expr::{Operator, col, lit};
421422

422423
use super::*;
423424

@@ -445,4 +446,34 @@ mod tests {
445446

446447
Ok(())
447448
}
449+
450+
/// Test that deeply nested expressions do not cause a stack overflow.
451+
///
452+
/// This test only runs when the `recursive_protection` feature is enabled,
453+
/// as it would overflow the stack otherwise.
454+
#[test]
455+
#[cfg_attr(not(feature = "recursive_protection"), ignore)]
456+
fn test_deeply_nested_binary_expr() -> Result<()> {
457+
// Create a deeply nested binary expression tree: ((((a + a) + a) + a) + ... )
458+
// With 1000 levels of nesting, this would overflow the stack without recursion protection.
459+
let depth = 1000;
460+
461+
let mut expr = col("a");
462+
for _ in 0..depth {
463+
expr = Expr::BinaryExpr(BinaryExpr {
464+
left: Box::new(expr),
465+
op: Operator::Plus,
466+
right: Box::new(col("a")),
467+
});
468+
}
469+
470+
let schema = Schema::new(vec![Field::new("a", DataType::Int32, false)]);
471+
let df_schema = DFSchema::try_from(schema)?;
472+
473+
// This should not stack overflow
474+
let _physical_expr =
475+
create_physical_expr(&expr, &df_schema, &ExecutionProps::new())?;
476+
477+
Ok(())
478+
}
448479
}

0 commit comments

Comments
 (0)