feat: add table metadata definition #62
///
/// TODO(wgtmac): Implement Equals and ToString once SortOrder and Snapshot are
/// implemented.
struct ICEBERG_EXPORT TableMetadata {
Do you need to mark some fields as optional?
Yes, I think we need to, but it is still unclear to me which ones should change. I will postpone this decision until we implement JSON serialization to and from the different table format versions.
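For reference, a minimal sketch of what marking a field as optional could look like with `std::optional` (C++17). The struct name `MetadataFieldsSketch`, its fields, and the helper `HasCurrentSnapshot` are hypothetical and chosen only for illustration; which fields of `TableMetadata` should actually become optional is still undecided in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a few metadata fields; NOT the real TableMetadata.
struct MetadataFieldsSketch {
  // Always present, so a plain value is enough.
  std::int8_t format_version = 2;
  std::string table_uuid;
  // Assumed optional for illustration: an empty optional means "not set",
  // which avoids reserving a sentinel value such as -1.
  std::optional<std::int64_t> current_snapshot_id;
};

// Hypothetical helper: reports whether the optional field carries a value.
inline bool HasCurrentSnapshot(const MetadataFieldsSketch& m) {
  return m.current_snapshot_id.has_value();
}
```

A default-constructed `MetadataFieldsSketch` reports no current snapshot until the field is assigned, which a JSON reader could use to distinguish a missing key from a real value.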
src/iceberg/table_metadata.h
Outdated
/// whether or not to track the creation and updates to rows in the table
bool row_lineage_enabled = false;
Row lineage will always be enabled from V3 onwards (apache/iceberg#12593), so I think we can remove this 👍
Thanks for the heads-up! I found that the Java implementation still has this field, which looks odd to me.
zhjwpku left a comment
LGTM
#include <format>
#include <string>

#include "iceberg/statistics_file.h"
Is there a reason why this include is here and not in table_metadata.h, where we seem to need it for the following?
std::vector<std::shared_ptr<struct StatisticsFile>> statistics;
/// A list of partition statistics
std::vector<std::shared_ptr<struct PartitionStatisticsFile>> partition_statistics;
I just want to use forward declarations as much as possible. The implementation is not yet complete because the concrete implementations of Snapshot and other classes are still missing, so it looks odd that iceberg/statistics_file.h is included but not used at the moment.
But since those would still be part of the table metadata API, wouldn't it be better for the include to be in the header file? I am learning C++, so I just thought it would still be better there even once we add the implementation part, but I might be wrong :)
It is common practice to use forward declarations to speed up compilation when the full definition is not required in the header file, though I suspect that modern compilers are smart enough to optimize this.
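To make the forward-declaration point concrete, here is a minimal single-file sketch that imitates the header/source split. `TableMetadataSketch`, the fields given to `StatisticsFile`, and `CountStatistics` are hypothetical and only for illustration; they are not the definitions from this PR. It relies on the fact that `std::shared_ptr<T>` may be declared with an incomplete `T`, so the header part needs only the forward declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// --- Part that would live in a header such as table_metadata.h ---
struct StatisticsFile;  // forward declaration; statistics_file.h is not included

// Hypothetical stand-in, not the real TableMetadata.
struct TableMetadataSketch {
  // Compiles with StatisticsFile still incomplete, because shared_ptr only
  // needs the complete type where the pointee is created or dereferenced.
  std::vector<std::shared_ptr<StatisticsFile>> statistics;
};

// --- Part that would live in the .cc file, after including the full header ---
struct StatisticsFile {  // hypothetical fields, for illustration only
  std::int64_t snapshot_id;
  std::string path;
};

// Hypothetical helper that uses the container without touching the pointee.
inline std::size_t CountStatistics(const TableMetadataSketch& m) {
  return m.statistics.size();
}
```

Only translation units that create or dereference a `StatisticsFile` need its full definition; every other file that includes the header avoids parsing it.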
No description provided.