feat: implement schema selection and projection methods #207
nullccxsy
commented
Sep 3, 2025
- Added select and project methods to the Schema class for creating projection schemas based on specified field names or IDs.
- Introduced PruneColumnVisitor to handle the logic for selecting and projecting fields, including support for nested structures.
351edfe to fe87277
src/iceberg/schema.cc
Outdated
/// \brief Visitor class for pruning schema columns based on selected field IDs.
///
/// This visitor traverses a schema and creates a projected version containing only
/// the specified fields. It handles different projection modes:
/// - select_full_types=true: Include entire fields when their ID is selected
/// - select_full_types=false: Recursively project nested fields within selected structs
///
/// \warning Error conditions that will cause projection to fail:
/// - Attempting to explicitly project List or Map types (returns InvalidArgument)
/// - Projecting a List when element result is null (returns InvalidArgument)
/// - Projecting a Map without a defined map value type (returns InvalidArgument)
/// - Projecting a struct when result is not StructType (returns InvalidArgument)
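To make the two projection modes in the doc comment above concrete, here is a minimal sketch of ID-based pruning over a toy schema tree. The `Node`, `Leaf`, `Struct`, and `Prune` names are hypothetical stand-ins invented for illustration; they are not the iceberg-cpp `Schema`, `StructType`, or `PruneColumnVisitor` APIs. The sketch covers structs and primitives only, and its handling of a selected struct with no selected children (an empty struct is kept) is an assumption, not something the PR confirms.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified schema node: a field that is either a primitive
// or a struct with child fields. Not the iceberg-cpp type hierarchy.
struct Node;
using NodePtr = std::shared_ptr<const Node>;

struct Node {
  int id;
  std::string name;
  bool is_struct;
  std::vector<NodePtr> children;  // empty for primitives
};

// Helper: build a primitive field.
NodePtr Leaf(int id, std::string name) {
  return std::make_shared<Node>(Node{id, std::move(name), false, {}});
}

// Helper: build a struct field from its children.
NodePtr Struct(int id, std::string name, std::vector<NodePtr> children) {
  return std::make_shared<Node>(
      Node{id, std::move(name), true, std::move(children)});
}

// Returns a pruned copy of `node`, or nullptr if neither the node nor any
// descendant is selected.
//  - select_full_types == true:  a selected struct is kept whole.
//  - select_full_types == false: a selected struct is still descended into,
//    so only its selected children survive.
NodePtr Prune(const NodePtr& node, const std::set<int>& ids,
              bool select_full_types) {
  const bool selected = ids.count(node->id) > 0;
  if (!node->is_struct) {
    return selected ? node : nullptr;
  }
  if (selected && select_full_types) {
    return node;  // keep the entire struct unchanged
  }
  std::vector<NodePtr> kept;
  for (const auto& child : node->children) {
    if (NodePtr pruned = Prune(child, ids, select_full_types)) {
      kept.push_back(std::move(pruned));
    }
  }
  if (kept.empty() && !selected) {
    return nullptr;  // nothing under this struct was requested
  }
  // Assumption of this sketch: a selected struct with no selected children
  // is kept as an empty struct.
  return std::make_shared<Node>(
      Node{node->id, node->name, true, std::move(kept)});
}
```

With a root struct holding `id` (1) and `location` (2) with children `lat` (3) and `long` (4), calling `Prune(root, {2, 3}, false)` keeps only `location.lat`, while `Prune(root, {2}, true)` keeps `location` with both children. Returning shared pointers lets unchanged subtrees be reused rather than copied.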
I think it is easy and valid to support projections in nested map and list types, and I don't know why the Java implementation does not support this. The code would be much simpler (shorter) if we supported them.
@Fokko Do you have any context on the Java impl?
1dd1fda to 11126f9
wgtmac
left a comment
Thanks @nullccxsy!
… handling
- Modified the PruneColumnVisitor class to pass results as shared pointers, improving memory management and clarity.
- Updated Visit methods for ListType, MapType, and StructType to accommodate the new result handling approach.
…rror reporting
- Updated the PruneColumnVisitor class to utilize shared pointers for type results, enhancing memory management.
- Refined Visit methods for StructType, ListType, and MapType to improve clarity and error handling, particularly for cases involving invalid projections.
dbb3e92 to 7e7e2a5
Xuanwo
left a comment
Thank you for working on this!