-
Notifications
You must be signed in to change notification settings - Fork 260
chore: add RangeValidatingReadableByteChannel #1599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: add RangeValidatingReadableByteChannel #1599
Conversation
Define a ReadableByteChannel that can validate limits relative to the delegate channel provided to it. If an over-read happens a warning message will be logged. If an under-read happens an IOException will be thrown. Add unit tests to validate behavior using copying and direct read operations.
Summary of ChangesHello @BenWhitehead, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the robustness of byte channel operations by introducing a Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a RangeValidatingReadableByteChannel to encapsulate read validation logic, which is a great improvement for modularity. The GoogleCloudStorageClientReadChannel is refactored to use this new component, simplifying its implementation. The new class is well-tested, covering over-read and under-read scenarios. My feedback includes a couple of suggestions to improve code clarity and simplify logic.
| private BlobSourceOption[] generateReadOptions(StorageResourceId blobId) { | ||
| List<BlobSourceOption> blobReadOptions = new ArrayList<>(); | ||
| // enable transparent gzip-decompression | ||
| blobReadOptions.add(BlobSourceOption.shouldReturnRawInputStream(false)); | ||
|
|
||
| if (blobId.getGenerationId() > StorageResourceId.UNKNOWN_GENERATION_ID) { | ||
| blobReadOptions.add(BlobSourceOption.generationMatch(blobId.getGenerationId())); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter blobId is of type StorageResourceId, which could be confusing since there is also a com.google.cloud.storage.BlobId class used in this file. To improve clarity and avoid potential confusion, consider renaming it to resourceId to match its type and the variable name used at the call site in the constructor.
| private BlobSourceOption[] generateReadOptions(StorageResourceId blobId) { | |
| List<BlobSourceOption> blobReadOptions = new ArrayList<>(); | |
| // enable transparent gzip-decompression | |
| blobReadOptions.add(BlobSourceOption.shouldReturnRawInputStream(false)); | |
| if (blobId.getGenerationId() > StorageResourceId.UNKNOWN_GENERATION_ID) { | |
| blobReadOptions.add(BlobSourceOption.generationMatch(blobId.getGenerationId())); | |
| } | |
| private BlobSourceOption[] generateReadOptions(StorageResourceId resourceId) { | |
| List<BlobSourceOption> blobReadOptions = new ArrayList<>(); | |
| // enable transparent gzip-decompression | |
| blobReadOptions.add(BlobSourceOption.shouldReturnRawInputStream(false)); | |
| if (resourceId.getGenerationId() > StorageResourceId.UNKNOWN_GENERATION_ID) { | |
| blobReadOptions.add(BlobSourceOption.generationMatch(resourceId.getGenerationId())); | |
| } |
| if (expectedChannelRemaining < dst.remaining()) { | ||
| expectedMaxRead = Math.toIntExact(expectedChannelRemaining); | ||
| } else { | ||
| expectedMaxRead = dst.remaining(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The calculation for expectedMaxRead can be simplified and made more robust by ensuring it's never negative. Using Math.max(0, ...) will handle cases where position > endOffset cleanly, and Math.min can combine the logic into a single line.
expectedMaxRead = (int) Math.min(dst.remaining(), Math.max(0, expectedChannelRemaining));There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dst.remaining will always return a value >= 0. This is already taken care of gemini. Also, blindly casing a long to an int can cause overflow, which is why Math.toIntExact is used instead.
| position += read; | ||
| if (read > expectedMaxRead) { | ||
| // over-read | ||
| logger.atWarning().log( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should close theRangeValidatingReadableByteChannel in GoogleCloudStorageClientReadChannel.java when overshoot happens to avoid query failures.
| int read = delegate.read(dst); | ||
| if (read > -1) { | ||
| position += read; | ||
| if (read > expectedMaxRead) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the logs received in the bug, buffer had some remaining space but channel end was crossed. So expectedMaxRead will always be greater than or equal to read and unlikely to log what is happening in the bug. I think we should log over shoot when position > endOffset to find out what is happening in the actual bug.
Define a ReadableByteChannel that can validate limits relative to the delegate channel provided to it.
If an over-read happens a warning message will be logged. If an under-read happens an IOException will be thrown.
Add unit tests to validate behavior using copying and direct read operations.