Labels: bug (Something isn't working)
Description
Describe the bug
It is possible to cause a panic in DataFusion on 64-bit machines. DataFusion does not handle the panic raised by the underlying `append_value` method of `GenericByteViewBuilder`. (See affected line)
See the To Reproduce section for the recursive concat query.
A few notes/thoughts:
- The panic is reproducible in the latest version of DataFusion, and it was also present in previous versions.
- The panic originates in `arrow-rs`: on 64-bit systems, `usize` can be up to `u64::MAX`, but everything is assumed to fit within `u32::MAX`. The panic is already declared in the docs of the `append_value` method, but it is never handled by the consumer (DataFusion). Is there a reason for that?
- I understand the panic happens in `arrow-rs`, but should DataFusion handle the panic coming from Arrow (e.g. in the `append_value` call of `concat_elements_utf8view`) to prevent it from propagating?
- Should DataFusion somehow limit the `append_value` calls to prevent the panic from happening?
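To make the last two questions concrete, here is a minimal sketch of what a pre-check on the consumer side could look like. It uses only the standard library and does not touch the real Arrow builder. `ViewLengthOverflow`, `checked_view_len`, and `checked_concat_len` are hypothetical names for illustration, not existing DataFusion or arrow-rs APIs.

```rust
use std::fmt;

/// Hypothetical error type for this sketch. A real fix would more likely
/// map this case onto an existing DataFusion or Arrow error variant.
#[derive(Debug, PartialEq)]
pub struct ViewLengthOverflow {
    pub len: usize,
}

impl fmt::Display for ViewLengthOverflow {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "value of {} bytes does not fit in a u32 length (max {})",
            self.len,
            u32::MAX
        )
    }
}

impl std::error::Error for ViewLengthOverflow {}

/// Converts a byte length to `u32`, returning an error where a bare
/// `try_into().unwrap()` would panic with `TryFromIntError`.
pub fn checked_view_len(len: usize) -> Result<u32, ViewLengthOverflow> {
    u32::try_from(len).map_err(|_| ViewLengthOverflow { len })
}

/// Checks the combined length of `left || right` before any bytes are
/// copied, so the caller can bail out with an error instead of appending.
pub fn checked_concat_len(left: usize, right: usize) -> Result<u32, ViewLengthOverflow> {
    let total = left
        .checked_add(right)
        .ok_or(ViewLengthOverflow { len: usize::MAX })?;
    checked_view_len(total)
}
```

With something like this, a concat kernel could call `checked_concat_len(left.len(), right.len())?` before appending and surface an execution error to the user. Whether the check belongs in DataFusion or in arrow-rs is exactly the open question above.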
cc @comphead
To Reproduce
Sample repository: https://github.com/samueleresca/datafusion-byte-view-builder-issue
- Run on a 64-bit machine.
- Include some dummy data (I attached an example):
ctx.register_parquet("users","./data/users_shorten.parquet", ParquetReadOptions::default()).await?;
- Run a recursive string concatenation query (see the query in `main.rs`).
- Observe the panic.
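To give a feeling for how quickly a recursive concatenation can reach the limit, here is a small back-of-the-envelope sketch. It assumes, purely for illustration, that each recursion step concatenates a value with itself, so its length doubles. The actual query in `main.rs` may grow differently, and `doublings_until_overflow` is a hypothetical helper.

```rust
/// Counts how many doubling steps it takes for a value that starts at
/// `start_len` bytes to exceed `u32::MAX` bytes.
/// Returns `None` for an empty value, which never grows.
pub fn doublings_until_overflow(start_len: u64) -> Option<u32> {
    if start_len == 0 {
        return None;
    }
    let limit = u32::MAX as u64;
    let mut len = start_len;
    let mut steps = 0u32;
    while len <= limit {
        // `len` is at most u32::MAX here, so doubling cannot overflow u64;
        // `checked_mul` is kept as a defensive measure.
        len = len.checked_mul(2)?;
        steps += 1;
    }
    Some(steps)
}
```

Under that doubling assumption, a 16-byte string exceeds `u32::MAX` bytes after 28 steps (`doublings_until_overflow(16)` returns `Some(28)`), so a modest recursion depth is enough to reach the oversized-value case, memory permitting.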
Expected behavior
- Should DataFusion handle the panic?
- (maybe) DataFusion restricts its calls to the byte view builder
- (maybe) arrow-rs handling this more gracefully
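For the first option, one possible shape is to catch the unwind at the call site and turn it into an error. The sketch below uses only the standard library: `append_len_or_panic` is a hypothetical stand-in for the panicking builder call (it is not the Arrow API), and `append_len_caught` is likewise an illustrative name. Note that `catch_unwind` has no effect when the binary is built with `panic = "abort"`, so validating the length up front is probably the more robust route.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical stand-in for a builder call that unwraps a `u32`
/// conversion and therefore panics on oversized input.
fn append_len_or_panic(len: usize) -> u32 {
    u32::try_from(len).unwrap()
}

/// Runs the panicking call inside `catch_unwind` and converts a panic
/// into an `Err` carrying the panic message.
pub fn append_len_caught(len: usize) -> Result<u32, String> {
    panic::catch_unwind(AssertUnwindSafe(|| append_len_or_panic(len))).map_err(|payload| {
        // Panic payloads are usually a `String` or a `&'static str`.
        if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else if let Some(msg) = payload.downcast_ref::<&str>() {
            (*msg).to_string()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

The default panic hook still prints the panic message to stderr before the error is returned, so this approach hides the crash from the caller but not from the logs.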
Additional context
Truncated panic trace from my local machine:
thread 'tokio-runtime-worker' (38043518) panicked at /Users/samuele/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-array-56.2.0/src/builder/generic_bytes_view_builder.rs:310:46:
called `Result::unwrap()` on an `Err` value: TryFromIntError(())
stack backtrace:
0: __rustc::rust_begin_unwind
at /rustc/a454fccb02df9d361f1201b747c01257f58a8b37/library/std/src/panicking.rs:698:5
1: core::panicking::panic_fmt
at /rustc/a454fccb02df9d361f1201b747c01257f58a8b37/library/core/src/panicking.rs:75:14
2: core::result::unwrap_failed
at /rustc/a454fccb02df9d361f1201b747c01257f58a8b37/library/core/src/result.rs:1855:5
3: core::result::Result<T,E>::unwrap
at /Users/samuele/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/result.rs:1226:23
4: arrow_array::builder::generic_bytes_view_builder::GenericByteViewBuilder<T>::append_value
at /Users/samuele/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-array-56.2.0/src/builder/generic_bytes_view_builder.rs:310:46
5: datafusion_physical_expr::expressions::binary::kernels::concat_elements_utf8view
at /Users/samuele/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-50.0.0/src/expressions/binary/kernels.rs:159:20
6: datafusion_physical_expr::expressions::binary::concat_elements
at /Users/samuele/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-50.0.0/src/expressions/binary.rs:1078:40
7: datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_with_resolved_args
at /Users/samuele/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-50.0.0/src/expressions/binary.rs:848:29
8: <datafusion_physical_expr::expressions::binary::BinaryExpr as datafusion_physical_expr_common::physical_expr::PhysicalExpr>::evaluate
at /Users/samuele/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-50.0.0/src/expressions/binary.rs:479:14