-
Notifications
You must be signed in to change notification settings - Fork 1
feat(zb-write): Support write mode in Zonal File #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: internal-main
Are you sure you want to change the base?
Conversation
… generation=None to overwrite
added unit tests for zonal file methods changed zonal_mocks to mock _get_bucket_type instead of mocking two methods
This reverts commit d51c8c3.
42a8eac to
3bf77d9
Compare
3bf77d9 to
9fdcbe0
Compare
ankitaluthra1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding initial comments to get started on resolution, yet to look at zonal_file and zb_hns_utils.
.github/workflows/ci.yml
Outdated
| --log-date-format="%H:%M:%S" \ | ||
| gcsfs/ \ | ||
| --ignore=gcsfs/tests/test_extended_gcsfs.py | ||
| --ignore=gcsfs/tests/test_extended_gcsfs.py \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding ignore on CI, please do the same in tests itself as added in main
reason: fsspec also run all gcsfs tests instead of adding --ignore in all ci rules, should be contained in test itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Added check for experimental env variable in the test file
| from .extended_gcsfs import ExtendedGcsFileSystem | ||
| from .extended_gcsfs import upload_chunk as ext_upload_chunk | ||
|
|
||
| if isinstance(fs, ExtendedGcsFileSystem) and isinstance( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add clear documentation here whats the new behaviour, condition is added instead of overriding functionality in ExtendedFileSystem to avoid change in imports etc. Same for similar methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
Conditional logic is used here since these methods don't belong to a class, so override was not possible.
| self.grpc_client = None | ||
| self.storage_control_client = None | ||
| self.credential = self.credentials.credentials | ||
| if self.credentials.token == "anon": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets add documentation so reader is clear why only anon has to be added differently, something like anon is used tests to bypass credentials
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this is separate from writes, does it make sense to seggregate the PRs so this can be merged easily. Also verify if all existing also tests run with experimental true, AFAIK creds was the only issue ?
gcsfs/extended_gcsfs.py
Outdated
|
|
||
| async def upload_chunk(fs, location, data, offset, size, content_type): | ||
| raise NotImplementedError( | ||
| "upload_chunk is not implemented yet for ExtendedGcsFileSystem. Please use write() instead." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upload_chunk is not implemented yet for ExtendedGcsFileSystem. Please use write() instead.
rephrase to ? upload_chunk is not implemented yet for zonal experimental feature. Please use write() instead.
|
|
||
| def _initiate_upload(self): | ||
| """Initiates the upload for Zonal buckets using gRPC.""" | ||
| from gcsfs.extended_gcsfs import initiate_upload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this import inline to avoid cyclic import error ? If yes, is it better to move ZonalFile class inside extended_filesystem.py similar to exisiting core.py ? or seggregating it is more readable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is to avoid cyclic import error. I think it is better to keep the zonal features separated for readability since extended_filesystem will have HNS implementation as well.
|
|
||
| def write(self, data): | ||
| """ | ||
| Writes data using AsyncAppendableObjectWriter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add link for AsyncAppendableObjectWriter
gcsfs/zonal_file.py
Outdated
| def discard(self): | ||
| """Discard is not applicable for Zonal Buckets. Log a warning instead.""" | ||
| logger.warning( | ||
| "Discard is unavailable for Zonal Buckets. \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unavailable change this to not applicable
| from .extended_gcsfs import ExtendedGcsFileSystem | ||
| from .extended_gcsfs import initiate_upload as ext_initiate_upload | ||
|
|
||
| if isinstance(fs, ExtendedGcsFileSystem) and await fs._is_zonal_bucket(bucket): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why isinstance needed here? isnt is_zonal check sufficient to trigger new functionality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_zonal_bucket method belongs to ExtendedGcsFileSystem. So the first check is there to make sure is_zonal_bucket method is found in the class.
ankitaluthra1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/gcbrun
816e73b to
95dd97b
Compare
Key Changes