Description
What feature are you trying to implement?
Apache DataFusion Comet is an Apache Spark accelerator with Apache Iceberg support. We would like to enhance that support by leveraging Iceberg-Rust. You can find the details of this effort in the POC PR apache/datafusion-comet#2528 and in slides presented at the 10/9/25 Iceberg-Rust community call.
The short version is that Comet will rely on Apache Iceberg's Java integration with Apache Spark for planning, and then pass the generated `FileScanTask`s to Iceberg-Rust via a new DataFusion `IcebergScan` operator in Comet. We need a lot of new (or newly public) APIs in the `ArrowReader`, since we are bypassing the `Table` interface to avoid redundant (and possibly incorrectly partitioned) planning. I will start to accumulate those efforts here.
- Make `ArrowReaderBuilder::new` `pub` instead of `pub(crate)`.
- Expose decryption options in `ArrowReaderBuilder`. This likely requires a new Iceberg-Rust Cargo feature, like in DataFusion, to enable the `encryption` feature for the Parquet crate.
- Expose `ArrowReaderOptions` in `ArrowReaderBuilder`.
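
To make the shape of these requests concrete, here is a minimal sketch of the builder surface an external caller such as Comet would need. Everything in it is a hypothetical stand-in: `ReaderBuilder`, `ReaderOptions`, `with_reader_options`, `decryption_key`, and `builder_for_external_plan` are illustrative names, not the actual Iceberg-Rust or Parquet APIs, and the real signatures may well differ.

```rust
// Hypothetical stand-ins only; these are NOT the real iceberg-rust or parquet types.
mod reader {
    /// Stand-in for Parquet reader options that a caller may want to pass through.
    #[derive(Debug, Clone, Default, PartialEq)]
    pub struct ReaderOptions {
        /// Illustrative flag, assumed for this sketch.
        pub skip_arrow_metadata: bool,
        /// Stand-in for decryption settings; `None` means the files are unencrypted.
        pub decryption_key: Option<Vec<u8>>,
    }

    /// Stand-in for the reader builder discussed in this issue.
    #[derive(Debug)]
    pub struct ReaderBuilder {
        batch_size: usize,
        options: ReaderOptions,
    }

    impl ReaderBuilder {
        /// Declared `pub`: if this were `pub(crate)`, it could only be called
        /// from inside the defining crate, so an external engine could not
        /// construct the builder without going through a higher-level interface.
        pub fn new() -> Self {
            Self {
                batch_size: 1024,
                options: ReaderOptions::default(),
            }
        }

        /// Sets the number of rows per batch.
        pub fn with_batch_size(mut self, batch_size: usize) -> Self {
            self.batch_size = batch_size;
            self
        }

        /// Lets the caller supply reader options, including decryption settings.
        pub fn with_reader_options(mut self, options: ReaderOptions) -> Self {
            self.options = options;
            self
        }

        pub fn batch_size(&self) -> usize {
            self.batch_size
        }

        pub fn options(&self) -> &ReaderOptions {
            &self.options
        }
    }
}

/// Example of what an external caller would do once the constructor and
/// option setters are public: build a reader directly from its own plan.
pub fn builder_for_external_plan(key: Vec<u8>) -> reader::ReaderBuilder {
    let options = reader::ReaderOptions {
        skip_arrow_metadata: false,
        decryption_key: Some(key),
    };
    reader::ReaderBuilder::new()
        .with_batch_size(8192)
        .with_reader_options(options)
}
```

In the real crate, the decryption field would presumably sit behind a Cargo feature that forwards to the Parquet crate's `encryption` feature, so that users who do not need it avoid the extra dependency cost.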
Willingness to contribute
I can contribute to this feature independently