Skip to content

Commit 7ea6713

Browse files
authored
feat(datafusion): implement the partitioning node for DataFusion to define the partitioning (#1620)
## Which issue does this PR close? - Closes #1543 ## What changes are included in this PR? Implement a physical execution repartition node that determines the relevant DataFusion partitioning strategy based on the Iceberg table schema and metadata. 1. Unpartitioned tables: Uses round-robin partitioning 2. Partitioned tables: It depends on the transform type: - Identity or Bucket transforms: Uses hash partitioning on the _partition column - Temporal transforms (Year, Month, Day, Hour): Uses round-robin partitioning _Minor change: I created a new `schema_ref()` helper method._ ## Are these changes tested? Yes, with unit tests --------- Signed-off-by: Florian Valeye <[email protected]>
1 parent 45c82df commit 7ea6713

File tree

3 files changed

+893
-1
lines changed

3 files changed

+893
-1
lines changed

crates/iceberg/src/table.rs

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ use crate::inspect::MetadataTable;
2424
use crate::io::FileIO;
2525
use crate::io::object_cache::ObjectCache;
2626
use crate::scan::TableScanBuilder;
27-
use crate::spec::{TableMetadata, TableMetadataRef};
27+
use crate::spec::{SchemaRef, TableMetadata, TableMetadataRef};
2828
use crate::{Error, ErrorKind, Result, TableIdent};
2929

3030
/// Builder to create table scan.
@@ -235,6 +235,11 @@ impl Table {
235235
self.readonly
236236
}
237237

238+
/// Returns the current schema as a shared reference.
239+
pub fn current_schema_ref(&self) -> SchemaRef {
240+
self.metadata.current_schema().clone()
241+
}
242+
238243
/// Create a reader for the table.
239244
pub fn reader_builder(&self) -> ArrowReaderBuilder {
240245
ArrowReaderBuilder::new(self.file_io.clone())

crates/integrations/datafusion/src/physical_plan/mod.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@ pub(crate) mod commit;
1919
pub(crate) mod expr_to_predicate;
2020
pub(crate) mod metadata_scan;
2121
pub(crate) mod project;
22+
pub(crate) mod repartition;
2223
pub(crate) mod scan;
2324
pub(crate) mod write;
2425

0 commit comments

Comments
 (0)