
Add Partitioning support #11

@shefeek-jinnah

Description


Add support for partitioned tables by using directory-based partitioning (e.g. year=2024/month=01/) to prune files during query planning. Partition values are extracted from file paths and compared against query predicates, so files in non-matching partitions are skipped. Partition columns are materialized as virtual columns from those path values, without reading Parquet data. This reduces the amount of data scanned for partitioned datasets and aligns DataFusion–DuckLake with common lakehouse partitioning semantics.

Current behavior: Scans all files regardless of partition filters
Desired behavior: Skip files in non-matching partitions
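
To make the desired behavior concrete, here is a minimal sketch of path-based pruning using only the Rust standard library. The function names `partition_values` and `prune_files` are hypothetical and are not part of DataFusion or DuckLake. The sketch also assumes filters are simple string equality checks, whereas a real implementation would presumably evaluate the query's predicate expressions during planning.

```rust
use std::collections::HashMap;

/// Extract `key=value` pairs from the directory part of a file path.
/// The file name itself (the last segment) is ignored.
/// Hypothetical helper for illustration only.
fn partition_values(path: &str) -> HashMap<String, String> {
    // Drop the file name; a path without '/' has no partition directories.
    let dir = path.rsplit_once('/').map_or("", |(dir, _file)| dir);
    dir.split('/')
        .filter_map(|segment| segment.split_once('='))
        .filter(|(key, _)| !key.is_empty())
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}

/// Keep only the files whose partition values satisfy every equality filter.
/// A file that has no value for a filtered column is kept, because it
/// cannot be ruled out from the path alone.
/// Hypothetical helper for illustration only.
fn prune_files<'a>(files: &[&'a str], filters: &[(&str, &str)]) -> Vec<&'a str> {
    files
        .iter()
        .copied()
        .filter(|path| {
            let values = partition_values(path);
            filters.iter().all(|(column, expected)| {
                values
                    .get(*column)
                    .map_or(true, |actual| actual.as_str() == *expected)
            })
        })
        .collect()
}
```

With files under `year=2024/month=01/`, `year=2024/month=02/`, and `year=2023/month=01/`, calling `prune_files` with the filters `("year", "2024")` and `("month", "01")` keeps only the file in the first directory. The same map returned by `partition_values` could supply the constant value for each virtual partition column, so no Parquet data needs to be read for those columns.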

Metadata

Labels: enhancement (New feature or request)
