Conversation

Collaborator

@suni72 suni72 commented Nov 21, 2025

Key Changes

  • __init__: Updated to initialize AsyncAppendableObjectWriter (AAOW) when the file mode is write-enabled ('w' mode).
  • _init_aaow: Added a helper method to initialize the AAOW.
  • write: Overridden to write data to GCS using AAOW.
  • flush: Overridden to flush data from AAOW instead of initiating a Multi-Part Upload (MPU).
  • close: Overridden to ensure the AAOW is closed properly.
    • If autocommit=True (default), the object is finalized on close.
    • If autocommit=False, the object remains appendable.
  • discard: Reimplemented to log a warning ("Discard is unavailable for Zonal Buckets..."), as Bidi streams cannot be discarded/cancelled once written.
  • Test: Added unit tests for the new methods in zb_hns_utils and zonal_file.
  • Fix: Fixed an auth issue with the gRPC client when using the anon token.
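
The write, flush, and close behaviour listed above can be sketched roughly as follows. This is a minimal illustration and not the PR's code: `FakeAppendableWriter` and `ZonalFileSketch` are hypothetical stand-ins, and the writer's method names (`append`, `flush`, `close(finalize=...)`) are assumptions rather than the real AsyncAppendableObjectWriter API.

```python
class FakeAppendableWriter:
    """Hypothetical stand-in for an appendable-object writer (not the real AAOW API)."""

    def __init__(self):
        self.buffered = []      # chunks accepted but not yet flushed
        self.persisted = b""    # bytes treated as durable after a flush
        self.finalized = False  # True once the object can no longer be appended to
        self.closed = False

    def append(self, data):
        self.buffered.append(bytes(data))

    def flush(self):
        self.persisted += b"".join(self.buffered)
        self.buffered.clear()

    def close(self, finalize):
        self.flush()
        self.finalized = finalize
        self.closed = True


class ZonalFileSketch:
    """Hypothetical file object that routes writes through the fake writer."""

    def __init__(self, mode="wb", autocommit=True):
        self.mode = mode
        self.autocommit = autocommit
        # Only write-enabled modes need a writer.
        self.writer = FakeAppendableWriter() if "w" in mode else None

    def write(self, data):
        if self.writer is None:
            raise ValueError("file not opened for writing")
        self.writer.append(data)
        return len(data)

    def flush(self):
        # Flush the writer instead of starting a multi-part upload.
        if self.writer is not None:
            self.writer.flush()

    def close(self):
        # autocommit=True finalizes the object; autocommit=False leaves it appendable.
        if self.writer is not None and not self.writer.closed:
            self.writer.close(finalize=self.autocommit)


committed = ZonalFileSketch(autocommit=True)
committed.write(b"hello")
committed.close()

appendable = ZonalFileSketch(autocommit=False)
appendable.write(b"hello")
appendable.close()
```

In this sketch both files end up with the same persisted bytes; only the `finalized` flag differs, which mirrors the autocommit distinction in the description.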

Owner

@ankitaluthra1 ankitaluthra1 left a comment

Adding initial comments to get started on resolution; I have yet to look at zonal_file and zb_hns_utils.

--log-date-format="%H:%M:%S" \
gcsfs/ \
--ignore=gcsfs/tests/test_extended_gcsfs.py
--ignore=gcsfs/tests/test_extended_gcsfs.py \
Owner

Instead of adding --ignore in CI, please handle this in the tests themselves, as was done in main.
Reason: fsspec also runs all gcsfs tests. Rather than adding --ignore to every CI rule, the skip should be contained in the test itself.

Collaborator Author

Done. Added a check for the experimental environment variable in the test file.
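
A test-level gate of this kind might look like the sketch below. The variable name `GCSFS_EXPERIMENTAL_ZB_HNS_SUPPORT` and the helper `experimental_enabled` are assumptions made for illustration; the actual check in the test file may differ.

```python
import os

# Assumed variable name; check the repository for the one actually used.
EXPERIMENTAL_ENV_VAR = "GCSFS_EXPERIMENTAL_ZB_HNS_SUPPORT"


def experimental_enabled(environ=None):
    """Return True when the experimental flag is set to the string 'true'."""
    environ = os.environ if environ is None else environ
    return environ.get(EXPERIMENTAL_ENV_VAR, "false").strip().lower() == "true"


# In a pytest test module, a module-level marker could then skip every test
# in the file when the flag is off, with no --ignore needed in CI:
#
#     import pytest
#     pytestmark = pytest.mark.skipif(
#         not experimental_enabled(),
#         reason="experimental zonal support is not enabled",
#     )
```

Keeping the condition in the test module means any runner that collects the gcsfs tests, including fsspec's CI, gets the same skip behaviour.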

from .extended_gcsfs import ExtendedGcsFileSystem
from .extended_gcsfs import upload_chunk as ext_upload_chunk

if isinstance(fs, ExtendedGcsFileSystem) and isinstance(
Owner

Let's add clear documentation here on what the new behaviour is: the condition is added instead of overriding the functionality in ExtendedFileSystem, to avoid changes in imports etc. Same for similar methods.

Collaborator Author

Added.
Conditional logic is used here since these methods don't belong to a class, so override was not possible.
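
The point about module-level functions can be illustrated with a small sketch. All names here (`BaseFS`, `ExtendedFS`, and the two private helpers) are hypothetical stand-ins, not the PR's code; the idea is only that a plain function has no subclass to override it, so the branch has to live inside the function.

```python
class BaseFS:
    """Hypothetical stand-in for the regular filesystem class."""


class ExtendedFS(BaseFS):
    """Hypothetical stand-in for ExtendedGcsFileSystem."""


def _default_upload_chunk(fs, data):
    # Placeholder for the existing upload path.
    return f"resumable upload of {len(data)} bytes"


def _extended_upload_chunk(fs, data):
    # Placeholder for the new path, which is not implemented in this sketch.
    raise NotImplementedError("upload_chunk is not implemented; use write() instead")


def upload_chunk(fs, data):
    """Module-level entry point: callers keep importing this one name.

    A plain function cannot be overridden by a subclass, so the new
    behaviour is selected with an isinstance check instead.
    """
    if isinstance(fs, ExtendedFS):
        return _extended_upload_chunk(fs, data)
    return _default_upload_chunk(fs, data)
```

Existing callers that do `from module import upload_chunk` keep working unchanged, which is the import-stability argument made in the review comment.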

self.grpc_client = None
self.storage_control_client = None
self.credential = self.credentials.credentials
if self.credentials.token == "anon":
Owner

Let's add documentation so the reader is clear on why only anon has to be handled differently, something like: anon is used in tests to bypass credentials.

Owner

Since this is separate from writes, does it make sense to segregate the PRs so this can be merged easily? Also, please verify whether all existing tests also run with experimental set to true; AFAIK creds was the only issue?


async def upload_chunk(fs, location, data, offset, size, content_type):
    raise NotImplementedError(
        "upload_chunk is not implemented yet for ExtendedGcsFileSystem. Please use write() instead."
Owner

Current: "upload_chunk is not implemented yet for ExtendedGcsFileSystem. Please use write() instead."
Rephrase to: "upload_chunk is not implemented yet for zonal experimental feature. Please use write() instead."?


def _initiate_upload(self):
    """Initiates the upload for Zonal buckets using gRPC."""
    from gcsfs.extended_gcsfs import initiate_upload
Owner

Is this import inline to avoid a cyclic import error? If yes, is it better to move the ZonalFile class inside extended_filesystem.py, similar to the existing core.py? Or is segregating it more readable?

Collaborator Author

Yes, it is to avoid a cyclic import error. I think it is better to keep the zonal features separated for readability, since extended_filesystem will have the HNS implementation as well.
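
The inline-import workaround can be reproduced in isolation with the sketch below. The module names `demo_fs` and `demo_file` are hypothetical and are written to a temporary directory only so the example is self-contained; they loosely mirror the relationship between the filesystem module and the file module, not their actual contents.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# demo_fs imports demo_file at import time ...
FS_MODULE = textwrap.dedent(
    '''
    from demo_file import DemoFile

    def initiate_upload():
        return "upload initiated"

    def open_file():
        return DemoFile()
    '''
)

# ... so demo_file defers its import of demo_fs until the method is called.
# A top-level "from demo_fs import initiate_upload" here would run while
# demo_fs is still only partially initialized and raise ImportError.
FILE_MODULE = textwrap.dedent(
    '''
    class DemoFile:
        def _initiate_upload(self):
            from demo_fs import initiate_upload
            return initiate_upload()
    '''
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_fs.py").write_text(FS_MODULE)
    Path(tmp, "demo_file.py").write_text(FILE_MODULE)
    sys.path.insert(0, tmp)
    try:
        demo_fs = importlib.import_module("demo_fs")
        result = demo_fs.open_file()._initiate_upload()
    finally:
        sys.path.remove(tmp)
```

By the time `_initiate_upload` runs, `demo_fs` is fully loaded, so the deferred import resolves without a cycle. The trade-off is a small lookup cost on each call and a dependency that is less visible than a top-level import.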


def write(self, data):
    """
    Writes data using AsyncAppendableObjectWriter.
Owner

Let's add a link for AsyncAppendableObjectWriter.

def discard(self):
    """Discard is not applicable for Zonal Buckets. Log a warning instead."""
    logger.warning(
        "Discard is unavailable for Zonal Buckets. \
Owner

Change "unavailable" to "not applicable".
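
A discard override with the suggested wording could be sketched as below. `ZonalDiscardSketch` and the logger name are hypothetical, and the second sentence of the message is an assumption, since the original string is cut off in the diff context.

```python
import logging

logger = logging.getLogger("zonal_file_sketch")


class ZonalDiscardSketch:
    """Hypothetical file object whose writes cannot be cancelled once sent."""

    def discard(self):
        """Discard is not applicable for Zonal Buckets. Log a warning instead."""
        # Adjacent string literals are concatenated, which avoids the stray
        # indentation a backslash continuation inside a string would embed.
        logger.warning(
            "Discard is not applicable for Zonal Buckets. "
            "Data already written to the stream cannot be cancelled."
        )
```

Logging rather than raising keeps callers that invoke `discard()` during cleanup from failing on a bucket type where the operation has no effect.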

from .extended_gcsfs import ExtendedGcsFileSystem
from .extended_gcsfs import initiate_upload as ext_initiate_upload

if isinstance(fs, ExtendedGcsFileSystem) and await fs._is_zonal_bucket(bucket):
Owner

Why is isinstance needed here? Isn't the is_zonal check sufficient to trigger the new functionality?

Collaborator Author

The is_zonal_bucket method belongs to ExtendedGcsFileSystem, so the first check is there to make sure the method exists on fs before it is called.
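
The role of the first check can be shown with a short sketch. `PlainFS` and `ZonalAwareFS` are hypothetical stand-ins for the base and extended filesystem classes, and the set-based bucket lookup is an assumption made only to keep the example runnable.

```python
import asyncio


class PlainFS:
    """Hypothetical stand-in for the regular filesystem; it has no zonal check."""


class ZonalAwareFS(PlainFS):
    """Hypothetical stand-in for ExtendedGcsFileSystem."""

    def __init__(self, zonal_buckets):
        self._zonal_buckets = set(zonal_buckets)

    async def _is_zonal_bucket(self, bucket):
        return bucket in self._zonal_buckets


async def uses_zonal_path(fs, bucket):
    # `and` short-circuits: the method is only looked up and awaited when the
    # isinstance check passes, so PlainFS never raises AttributeError.
    return isinstance(fs, ZonalAwareFS) and await fs._is_zonal_bucket(bucket)


async def main():
    return [
        await uses_zonal_path(PlainFS(), "zb"),
        await uses_zonal_path(ZonalAwareFS({"zb"}), "zb"),
        await uses_zonal_path(ZonalAwareFS({"zb"}), "regional"),
    ]


results = asyncio.run(main())
```

Without the isinstance guard, the first call would fail on attribute lookup because `PlainFS` does not define `_is_zonal_bucket`.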

Owner

@ankitaluthra1 ankitaluthra1 left a comment

/gcbrun

@suni72 suni72 force-pushed the zb-write-copy branch 2 times, most recently from 816e73b to 95dd97b on December 4, 2025 at 16:35

4 participants