Validate input schema inside project for DataFusion #1752

@fvaleye

Description

Is your feature request related to a problem or challenge?

In the current DataFusion integration (crates/integrations/datafusion/src/physical_plan/project.rs), input_schema is not validated to confirm that it exactly matches the Iceberg table schema.

Describe the solution you'd like

Implement a helper function in project.rs that recursively visits an Arrow schema and removes metadata from all fields, including nested fields.

Apply this function to both the input Arrow schema and the converted Arrow schema before comparing them using Arrow’s built-in ==.

This will ensure that schema comparisons are reliable and not affected by irrelevant metadata differences, improving input validation in the project node.
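The proposal above could be sketched roughly as follows. To keep the example self-contained, it uses simplified stand-in types (`DataType`, `Field`, `Schema`) that only mimic the shape of Arrow's schema types; they are not the arrow-rs API, and the function names `strip_field_metadata`, `strip_schema_metadata`, and `schemas_match_ignoring_metadata` are hypothetical. Clearing schema-level metadata as well as field metadata is also an assumption, since the description only mentions fields.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for Arrow's schema types (NOT the arrow-rs API).
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    List(Box<Field>),
    Struct(Vec<Field>),
}

#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: BTreeMap<String, String>,
}

#[derive(Clone, Debug, PartialEq)]
struct Schema {
    fields: Vec<Field>,
    metadata: BTreeMap<String, String>,
}

// Small constructor used only to keep examples short.
fn field(name: &str, data_type: DataType, nullable: bool, metadata: &[(&str, &str)]) -> Field {
    Field {
        name: name.to_string(),
        data_type,
        nullable,
        metadata: metadata.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
    }
}

// Returns a copy of `field` with metadata removed from it and from every nested field.
fn strip_field_metadata(field: &Field) -> Field {
    let data_type = match &field.data_type {
        DataType::List(item) => DataType::List(Box::new(strip_field_metadata(item))),
        DataType::Struct(children) => {
            DataType::Struct(children.iter().map(strip_field_metadata).collect())
        }
        // Primitive types have no child fields, so they are copied unchanged.
        other => other.clone(),
    };
    Field {
        name: field.name.clone(),
        data_type,
        nullable: field.nullable,
        metadata: BTreeMap::new(),
    }
}

// Assumption: schema-level metadata is cleared too, not only field metadata.
fn strip_schema_metadata(schema: &Schema) -> Schema {
    Schema {
        fields: schema.fields.iter().map(strip_field_metadata).collect(),
        metadata: BTreeMap::new(),
    }
}

// Compares names, types, and nullability while ignoring all metadata.
fn schemas_match_ignoring_metadata(input: &Schema, expected: &Schema) -> bool {
    strip_schema_metadata(input) == strip_schema_metadata(expected)
}
```

With this shape, two schemas that differ only in metadata compare equal after stripping, while a difference in a field name, type, or nullability is still reported as a mismatch. A real implementation would presumably match on the Arrow data type variants that carry child fields (lists, structs, maps, and similar) instead of the two nested variants modeled here.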

Willingness to contribute

I can contribute to this feature independently

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request)
