-
Notifications
You must be signed in to change notification settings - Fork 305
feat(inspect): refs
#1489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(inspect): refs
#1489
Conversation
/// Returns the iceberg schema of the refs table. | ||
pub fn schema(&self) -> crate::spec::Schema { | ||
let fields = vec![ | ||
NestedField::required(1, "name", Type::Primitive(PrimitiveType::String)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any value in adding docs for each field?
Arc::new(max_snapshot_age_in_ms.finish()), | ||
])?; | ||
|
||
Ok(stream::iter(vec![Ok(batch)]).boxed()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't expect Refs to be particularly large, but is there a reason to not make this an actual stream that can read back the data incrementally (seems like it would be not that much more effort).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I 100% agree, and I think long term the goal is to replace a lot of the pyiceberg impl with the rust one so this is something we should tackle in pyiceberg too... defaulting to lazy iterators should be the default but unfortunately is a user breaking change.
https://github.com/apache/iceberg-python/blob/main/pyiceberg/table/inspect.py
Also related:
Would be great to get some momentum on those
], | ||
min_snapshots_to_keep: PrimitiveArray<Int32> | ||
[ | ||
null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for this and max_snapshot_age_in_ms, it seems like it would be good to have test coverage when the value is present, to differentiate tags from branchs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New to the code base and reviews so please take comments with a grain of salt.
Which issue does this PR close?
Subtask for #823
What changes are included in this PR?
Adds a
refs
table to theinspect
Are these changes tested?
Added a unit test.
I am pretty terrible at rust so far but think it is generally valuable to start implementing things here so we can roll wheels for the python implementation. Hoping to learn!