-
Notifications
You must be signed in to change notification settings - Fork 0
Fs props #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fs props #17
Conversation
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature for passing arbitrary properties to underlying PyArrow filesystems, enhancing configurability. The code is generally well-structured with new helper methods and comprehensive tests. I've identified one critical bug in the GCS filesystem initialization logic, a high-severity issue with handling properties containing hyphens which contradicts the documentation, and several medium-severity performance improvements. Addressing these points will make this a solid contribution.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
for more information, see https://pre-commit.ci
|
/gemini review |
|
Warning Gemini encountered an error creating the review. You can try again by commenting |
|
/gemini review |
|
Warning Gemini encountered an error creating the review. You can try again by commenting |
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable feature to pass-through properties to the underlying PyArrow filesystems, enhancing configuration flexibility. The implementation is well-structured, refactoring the filesystem initialization logic into a more maintainable and extensible pattern. The documentation and tests have been updated accordingly.
My review focuses on a few areas for improvement:
- A minor formatting issue in the documentation's markdown table.
- A recurring performance optimization in the property handling logic within
pyiceberg/io/pyarrow.py, where iterating over a pre-filtered dictionary would be more efficient.
Overall, this is a solid contribution that improves the library's usability.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
The code changes introduce the ability to pass through arbitrary properties to the underlying PyArrow filesystems, enhancing the flexibility of filesystem configuration in PyIceberg. The implementation is well-structured and accompanied by a comprehensive set of tests. The review focuses on improving efficiency in the _initialize_*_fs methods and fixing a minor formatting issue in the configuration documentation.
Rationale for this change
https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html
https://arrow.apache.org/docs/python/generated/pyarrow.fs.AzureFileSystem.html
https://arrow.apache.org/docs/python/generated/pyarrow.fs.HadoopFileSystem.html
https://arrow.apache.org/docs/python/generated/pyarrow.fs.GcsFileSystem.html
https://arrow.apache.org/docs/python/generated/pyarrow.fs.LocalFileSystem.html
https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/util/PropertyUtil.java#L153-L192
Are these changes tested?
Are there any user-facing changes?