GCS client library migration in Java SDK - part 2b#37592

Open
shunping wants to merge 10 commits into apache:master from shunping:gcs-migration-2b

@shunping
Collaborator

@shunping shunping commented Feb 13, 2026

This is a follow-up to #37502.

  • It includes the GcsUtil V2 for copy(), remove() and rename().

Notice that:

  • The new copy() and rename() APIs may be slower than the existing ones, because the GCS client library does not support batch rewrite/copy/move operations.
  • Everything is guarded by a feature flag, which is disabled by default.
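
For readers unfamiliar with the pattern, the flag-guarded delegation described above can be sketched as follows. This is only an illustration under stated assumptions: `ObjectCopier`, `FlagGuardedCopier`, and the `useV2` field are hypothetical names and do not come from this PR, and the real flag lives wherever the SDK's options define it.

```java
import java.util.List;

// Hypothetical stand-in for "something that copies objects"; not a Beam type.
interface ObjectCopier {
  void copy(List<String> sources, List<String> destinations);
}

// Routes calls to the V1 or V2 implementation based on a feature flag.
final class FlagGuardedCopier implements ObjectCopier {
  private final boolean useV2; // assumed flag; false means the legacy path is used
  private final ObjectCopier v1;
  private final ObjectCopier v2;

  FlagGuardedCopier(boolean useV2, ObjectCopier v1, ObjectCopier v2) {
    this.useV2 = useV2;
    this.v1 = v1;
    this.v2 = v2;
  }

  @Override
  public void copy(List<String> sources, List<String> destinations) {
    if (sources.size() != destinations.size()) {
      throw new IllegalArgumentException("sources and destinations must have the same size");
    }
    // Only the selected implementation is ever invoked.
    (useV2 ? v2 : v1).copy(sources, destinations);
  }
}
```

With the flag left at its default of false, callers keep the existing behavior, so the possible slowdown from per-object operations only applies to users who opt in.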

@gemini-code-assist
Contributor

Summary of Changes

Hello @shunping, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request continues the migration of the Java SDK's GCS client library, focusing on core file manipulation operations. It introduces new V2 methods for copying, renaming, and removing files, providing more granular control over behavior in scenarios involving missing source files or existing destination files. This change aims to improve the robustness and flexibility of GCS file operations within the SDK by leveraging the updated client library capabilities.

Highlights

  • GCS Client Library Migration: Introduced new copyV2, renameV2, and removeV2 methods in GcsUtil to leverage the GcsUtilV2 implementation for file operations.
  • Enhanced File Operations in GcsUtilV2: Implemented robust copy, move, and remove operations in GcsUtilV2 using the Google Cloud Storage client library, supporting various MissingStrategy and OverwriteStrategy options for finer control over behavior.
  • Comprehensive Integration Tests: Added extensive integration tests for copy, remove, and rename operations in GcsUtilParameterizedIT to validate both V1 and V2 behaviors, including different strategies and error handling.
  • GcsPath Wildcard Handling: Enhanced GcsPathTest with additional test cases for wildcard paths, specifically addressing root-level wildcards.
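
To make the role of the two strategy types concrete, here is a minimal in-memory sketch of how a missing-source policy and an existing-destination policy might steer rename and remove. It does not call the GCS client library, and the enum constants are illustrative assumptions: the PR's MissingStrategy and OverwriteStrategy may use different names, more options, or different semantics (for example, what happens to the source when an existing destination is skipped).

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.util.HashMap;
import java.util.Map;

// Illustrative constants only; not necessarily those defined in GcsUtilV2.
enum MissingStrategy { FAIL_IF_MISSING, SKIP_IF_MISSING }

enum OverwriteStrategy { FAIL_IF_EXISTS, SKIP_IF_EXISTS, ALWAYS_OVERWRITE }

// Toy object store keyed by path; stands in for a bucket.
final class InMemoryStore {
  private final Map<String, String> objects = new HashMap<>();

  void put(String path, String content) {
    objects.put(path, content);
  }

  String get(String path) {
    return objects.get(path); // null when the object is absent
  }

  void rename(String src, String dst, MissingStrategy missing, OverwriteStrategy overwrite)
      throws IOException {
    if (!objects.containsKey(src)) {
      if (missing == MissingStrategy.SKIP_IF_MISSING) {
        return; // nothing to move, and the caller asked not to fail
      }
      throw new FileNotFoundException(src);
    }
    if (objects.containsKey(dst)) {
      if (overwrite == OverwriteStrategy.SKIP_IF_EXISTS) {
        return; // assumption: both source and destination are left untouched
      }
      if (overwrite == OverwriteStrategy.FAIL_IF_EXISTS) {
        throw new FileAlreadyExistsException(dst);
      }
    }
    objects.put(dst, objects.remove(src));
  }

  void remove(String path, MissingStrategy missing) throws IOException {
    boolean existed = objects.remove(path) != null;
    if (!existed && missing == MissingStrategy.FAIL_IF_MISSING) {
      throw new FileNotFoundException(path);
    }
  }
}
```

Passing the policy as an argument keeps the decision at the call site, so a caller that can tolerate retries chooses the skip variants while a caller that needs strict guarantees chooses the failing ones.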

Changelog
  • sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java
    • Added imports for Set, StorageObjectOrIOException, MissingStrategy, OverwriteStrategy, StandardMoveOptions, and Sets.
    • Introduced copyV2 and copy methods that accept GcsPath and OverwriteStrategy.
    • Introduced renameV2 and rename methods that accept GcsPath, MissingStrategy, and OverwriteStrategy.
    • Introduced removeV2 and remove methods that accept GcsPath and MissingStrategy.
    • These new methods delegate their functionality to GcsUtilV2.
  • sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtilV2.java
    • Added checkArgument import.
    • Added imports for BlobId, BlobInfo, CopyWriter, and CopyRequest.
    • Defined MEGABYTES_COPIED_PER_CHUNK constant to control copy chunk size.
    • Updated translateStorageException to handle FileNotFoundException for 404 errors.
    • Introduced MissingStrategy and OverwriteStrategy enums to define behavior for file operations.
    • Implemented the remove method using batch deletion with MissingStrategy.
    • Implemented rewriteHelper to encapsulate common logic for copy and move operations, incorporating MissingStrategy and OverwriteStrategy.
    • Implemented the copy method by leveraging the rewriteHelper.
    • Implemented the move method by leveraging the rewriteHelper.
  • sdks/java/extensions/google-cloud-platform-core/src/test/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtilParameterizedIT.java
    • Added imports for MissingStrategy, OverwriteStrategy, and MoveOptions.
    • Introduced createTestBucketHelper and tearDownTestBucketHelper for managing test bucket setup and cleanup.
    • Added testCopy method with extensive tests for both V1 and V2 copy operations, covering various OverwriteStrategy scenarios and expected error handling.
    • Added testRemove method with comprehensive tests for both V1 and V2 remove operations, including MissingStrategy scenarios and error handling.
    • Added testRename method with detailed tests for both V1 and V2 rename operations, covering MoveOptions and error handling, and highlighting a V1 bug.
    • Added assertExists and assertNotExists helper methods for asserting the presence or absence of GCS paths.
  • sdks/java/extensions/google-cloud-platform-core/src/test/java/org/apache/beam/sdk/extensions/gcp/util/gcsfs/GcsPathTest.java
    • Updated testIsWildcard to include a test case for 'gs://bucket/*' wildcard pattern.
    • Updated testGetNonWildcardPrefix to include a test case for 'gs://bucket/*' prefix extraction.
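
As a rough illustration of the wildcard handling that the last two entries test, the following standalone sketch extracts the part of a glob that precedes the first metacharacter. The class and method names are hypothetical, the sketch works on the whole string rather than on a GcsPath, and the values the PR's tests actually expect for 'gs://bucket/*' are not stated here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the GcsUtil or GcsPath implementation.
final class GlobPrefix {
  // Captures everything before the first glob metacharacter: '*', '?' or '['.
  private static final Pattern GLOB_PREFIX =
      Pattern.compile("(?<prefix>[^\\[*?]*)[\\[*?].*");

  // True when the string contains at least one glob metacharacter.
  static boolean isWildcard(String spec) {
    return GLOB_PREFIX.matcher(spec).matches();
  }

  // Returns the literal prefix that can be used to narrow a listing request.
  static String nonWildcardPrefix(String glob) {
    Matcher m = GLOB_PREFIX.matcher(glob);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a glob expression: " + glob);
    }
    return m.group("prefix");
  }
}
```

In this sketch a root-level wildcard such as "gs://bucket/*" yields the prefix "gs://bucket/", and applying it to the object name "*" alone yields an empty prefix, which is the edge case a root-level test has to cover.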

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@shunping shunping marked this pull request as ready for review February 13, 2026 16:20
@github-actions
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@codecov

codecov bot commented Feb 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 40.06%. Comparing base (9ea7380) to head (5251ef0).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #37592      +/-   ##
============================================
+ Coverage     35.88%   40.06%   +4.18%     
- Complexity     1691     3416    +1725     
============================================
  Files          1063     1178     +115     
  Lines        166668   187225   +20557     
  Branches       1227     3589    +2362     
============================================
+ Hits          59804    75014   +15210     
- Misses       104665   108818    +4153     
- Partials       2199     3393    +1194     
Flag Coverage Δ
java 71.94% <ø> (+4.58%) ⬆️

Flags with carried forward coverage won't be shown.


@shunping
Collaborator Author

r: @Abacn @damccorm

@shunping shunping requested review from Abacn and damccorm and removed request for damccorm February 13, 2026 22:02
@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers
