-
Notifications
You must be signed in to change notification settings - Fork 298
feat(datafusion): Implement insert_into
for IcebergTableProvider
#1600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
} else { | ||
self.table.clone() | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me wonder if storing table directly in the IcebergTableProvider
is correct... We could get a stale table if the provider doesn't have a catalog table.
Iceberg-java has a refresh()
interface which uses TableOperation
to refresh metadata. In iceberg-rs we don't have TableOperation
and need to rely on catalog to refresh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a PR to add that functionality (including the refresh): #1297
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @phillipleblanc , thanks for pointing me to your change! Your change makes sense to me, but I was thinking of adding something like this to impl Table
directly
pub async fn refresh(&mut self, catalog: &dyn Catalog) -> Result<Self>
_insert_op: InsertOp, | ||
) -> DFResult<Arc<dyn ExecutionPlan>> { | ||
if !self | ||
.table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we should refresh the table here and every otherself.table
usages in IcebergTableProvider
, but I think we should fix that in a separate PR if needed
Which issue does this PR close?
insert_into
forIcebergTableProvider
#1540What changes are included in this PR?
catalog
toIcebergTableProvider
as optionalIcebergTableProvider::scan
insert_into
forIcebergTableProvider
using write node and commit node for non-partitioned tablesAre these changes tested?
Added tests