- 
                Notifications
    You must be signed in to change notification settings 
- Fork 0
Check that digesting consumes the expected number of bytes. #21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
In Bazel, the time a file is `stat`ed to find its size can be very far from the time the file's digest is computed, which creates a window for shenanigans. Allow passing the expected size into `Path.getDigest` and raise an error there if the number of digested bytes doesn't match the previously seen size.
| If file size differs, the digest must be different. Can you share more details about the motivation/background for this PR? | 
| The potential problem is Bazel doesn't compute the file size by counting how many bytes are fed into the hash function. Rather, the file is  | 
| In other words, I'm trying to add a best-effort consistency check. | 
| Internally, we embed the size into the digest (i.e. the file size can be derived from the digest) so I can understand the change. But I am not sure whether missing the file size could be a problem in practice. Considering lots of effect will be needed to import this PR, I will let @meisterT make the decision. | 
| If we decide to add this, we should consider merging it with  | 
| Steps can likely to taken to make this easier to merge if needed. Note I added a  With regards to  | 
| 
 Alternatively, would there be a way to align the hash functions available in Bazel with this construction? | 
| WalkthroughThis change introduces file size awareness to digest computation throughout the codebase. The primary update is the addition of an  Changes
 Sequence Diagram(s)sequenceDiagram
    participant Caller
    participant Path
    participant FileSystem
    participant DigestUtils
    Caller->>Path: getDigest(expectedSize)
    Path->>FileSystem: getDigest(pathFragment, expectedSize)
    FileSystem->>DigestUtils: (if fallback needed) getDigestWithManualFallback(path, expectedSize, xattrProvider)
    DigestUtils->>FileSystem: manuallyComputeDigest(path, expectedSize)
    FileSystem-->>DigestUtils: digest bytes (after verifying size)
    DigestUtils-->>FileSystem: digest bytes
    FileSystem-->>Path: digest bytes
    Path-->>Caller: digest bytes
Poem
 Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ast-grep (0.31.1)src/test/java/com/google/devtools/build/lib/remote/RemoteExecutionServiceTest.java📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (26)
 🧰 Additional context used🧬 Code Graph Analysis (6)src/main/java/com/google/devtools/build/lib/exec/SpawnLogContext.java (1)
 src/main/java/com/google/devtools/build/lib/skyframe/ActionOutputMetadataStore.java (1)
 src/main/java/com/google/devtools/build/lib/remote/util/DigestUtil.java (1)
 src/main/java/com/google/devtools/build/lib/exec/RunfilesTreeUpdater.java (1)
 src/main/java/com/google/devtools/build/lib/actions/FileArtifactValue.java (1)
 src/main/java/com/google/devtools/build/lib/vfs/Path.java (1)
 🔇 Additional comments (60)
 ✨ Finishing Touches
 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit: 
 Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
 Other keywords and placeholders
 CodeRabbit Configuration File ( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @arvi18, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces a mechanism to verify that the number of bytes digested from a file matches the expected file size. This is done to prevent potential security issues where the file's content changes between the time its size is determined and the time its digest is computed. The changes involve modifying the getDigest methods in FileSystem and related classes to accept an expectedSize parameter, and throwing an IOException if the number of digested bytes doesn't match the expected size. Additionally, the DigestUtils class is updated to include methods that utilize this new size-checking mechanism.
Highlights
- Digest Verification: Introduces a check to ensure the number of bytes digested matches the expected file size.
- Modified getDigest Methods: Updates getDigestmethods inFileSystemand related classes to accept anexpectedSizeparameter.
- DigestUtils Updates: Adds new methods to DigestUtilsto utilize the size-checking mechanism during digest computation.
- Exception Handling: Throws an IOExceptionif the digested bytes don't match the expected size, preventing potential security issues.
Changelog
Click here to see the changelog
- src/main/java/com/google/devtools/build/lib/actions/FileArtifactValue.java
- Modified line 276 to pass the file size to DigestUtils.getDigestWithManualFallback.
 
- Modified line 276 to pass the file size to 
- src/main/java/com/google/devtools/build/lib/analysis/ConfiguredRuleClassProvider.java
- Modified line 129 to pass -1 as the expected size to getDigest.
- Modified line 133 to add expectedSizeparameter togetDigest.
 
- Modified line 129 to pass -1 as the expected size to 
- src/main/java/com/google/devtools/build/lib/exec/RunfilesTreeUpdater.java
- Modified line 141 to use DigestUtils.getDigestWithManualFallbackWhenSizeUnknown.
 
- Modified line 141 to use 
- src/main/java/com/google/devtools/build/lib/exec/SpawnLogContext.java
- Modified line 181 to pass the file size to DigestUtils.getDigestWithManualFallback.
 
- Modified line 181 to pass the file size to 
- src/main/java/com/google/devtools/build/lib/remote/RemoteActionFileSystem.java
- Modified line 437 to add expectedSizeparameter togetDigest.
- Modified line 446 to pass expectedSizetolocalFs.getPath(path).getDigest().
 
- Modified line 437 to add 
- src/main/java/com/google/devtools/build/lib/remote/util/DigestUtil.java
- Modified line 75 to pass the file size to DigestUtils.getDigestWithManualFallback.
 
- Modified line 75 to pass the file size to 
- src/main/java/com/google/devtools/build/lib/skyframe/ActionOutputMetadataStore.java
- Modified line 560 to pass the file size to DigestUtils.manuallyComputeDigest.
 
- Modified line 560 to pass the file size to 
- src/main/java/com/google/devtools/build/lib/testing/vfs/SpiedFileSystem.java
- Modified line 61 to add expectedSizeparameter togetDigest.
 
- Modified line 61 to add 
- src/main/java/com/google/devtools/build/lib/unix/UnixFileSystem.java
- Modified line 449 to add expectedSizeparameter togetDigest.
- Modified line 453 to pass expectedSizetosuper.getDigest.
 
- Modified line 449 to add 
- src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java
- Modified line 167 to add fileSizeparameter togetDigestWithManualFallback.
- Added getDigestWithManualFallbackWhenSizeUnknownmethod on line 183.
- Modified line 196 to add fileSizeparameter tomanuallyComputeDigest.
- Modified line 210 to pass fileSizetopath.getDigest.
- Modified line 212 to include fileSizein the Preconditions check.
 
- Modified line 167 to add 
- src/main/java/com/google/devtools/build/lib/vfs/FileSystem.java
- Added import for com.google.common.hash.Funnelson line 23.
- Added import for java.util.Localeon line 37.
- Modified line 352 to add expectedSizeparameter togetDigest.
- Modified line 361 to use Funnels.asOutputStream.
- Modified line 362 to check if expectedSizematches the number of bytes copied.
- Modified line 363 to throw an IOExceptionif the sizes don't match.
 
- Added import for 
- src/main/java/com/google/devtools/build/lib/vfs/JavaIoFileSystem.java
- Modified line 423 to add expectedSizeparameter togetDigest.
- Modified line 427 to pass expectedSizetosuper.getDigest.
 
- Modified line 423 to add 
- src/main/java/com/google/devtools/build/lib/vfs/Path.java
- Modified line 688 to call getDigest(-1).
- Added getDigest(long expectedSize)method on line 704.
- Modified line 735 to pass stat.getSize()toDigestUtils.getDigestWithManualFallback.
- Modified line 751 to use DigestUtils.getDigestWithManualFallbackWhenSizeUnknown.
 
- Modified line 688 to call 
- src/main/java/com/google/devtools/build/lib/vfs/PathTransformingDelegateFileSystem.java
- Modified line 230 to add expectedSizeparameter togetDigest.
- Modified line 231 to pass expectedSizetodelegateFs.getDigest.
 
- Modified line 230 to add 
- src/test/java/com/google/devtools/build/lib/analysis/StubbableFSBuildViewTest.java
- Modified line 140 to pass -1 as the expected size to getDigest.
 
- Modified line 140 to pass -1 as the expected size to 
- src/test/java/com/google/devtools/build/lib/exec/SingleBuildFileCacheTest.java
- Modified line 64 to add expectedSizeparameter togetDigest.
- Modified line 66 to pass expectedSizetosuper.getDigest.
 
- Modified line 64 to add 
- src/test/java/com/google/devtools/build/lib/exec/SpawnLogContextTestBase.java
- Modified line 89 to pass -1 as the expected size to super.getDigest.
- Modified line 93 to add expectedSizeparameter togetDigest.
 
- Modified line 89 to pass -1 as the expected size to 
- src/test/java/com/google/devtools/build/lib/remote/RemoteActionFileSystemTest.java
- Added import for org.mockito.ArgumentMatchers.anyLongon line 24.
- Modified line 582 to verify that getDigestis never called with any long.
- Modified line 584 to pass -1 as the expected size to getDigest.
- Modified line 597 to verify that getDigestis never called with any long.
- Modified line 599 to pass -1 as the expected size to getDigest.
- Modified line 610 to pass -1 as the expected size to getDigest.
- Modified line 621 to pass -1 as the expected size to getDigest.
- Modified line 631 to pass -1 as the expected size to getDigest.
- Modified line 654 to pass -1 as the expected size to getDigest.
 
- Added import for 
- src/test/java/com/google/devtools/build/lib/remote/RemoteExecutionServiceTest.java
- Modified line 388 to pass -1 as the expected size to getDigest.
- Modified line 425 to pass -1 as the expected size to getDigest.
- Modified line 428 to pass -1 as the expected size to getDigest.
- Modified line 1154 to pass -1 as the expected size to getDigest.
- Modified line 1156 to pass -1 as the expected size to getDigest.
- Modified line 1188 to pass -1 as the expected size to getDigest.
- Modified line 1190 to pass -1 as the expected size to getDigest.
- Modified line 1238 to pass -1 as the expected size to getDigest.
- Modified line 1240 to pass -1 as the expected size to getDigest.
- Modified line 1287 to pass -1 as the expected size to getDigest.
- Modified line 1289 to pass -1 as the expected size to getDigest.
- Modified line 1370 to pass -1 as the expected size to getDigest.
- Modified line 1372 to pass -1 as the expected size to getDigest.
- Modified line 1406 to pass -1 as the expected size to getDigest.
- Modified line 1408 to pass -1 as the expected size to getDigest.
- Modified line 1451 to pass -1 as the expected size to getDigest.
- Modified line 1453 to pass -1 as the expected size to getDigest.
- Modified line 1562 to pass -1 as the expected size to getDigest.
- Modified line 1565 to pass -1 as the expected size to getDigest.
 
- Modified line 388 to pass -1 as the expected size to 
- src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTest.java
- Modified line 122 to add expectedSizeparameter togetDigest.
 
- Modified line 122 to add 
- src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTestCase.java
- Modified line 183 to pass -1 as the expected size to getDigest.
 
- Modified line 183 to pass -1 as the expected size to 
- src/test/java/com/google/devtools/build/lib/skyframe/FileArtifactValueTest.java
- Modified line 189 to add expectedSizeparameter togetDigest.
 
- Modified line 189 to add 
- src/test/java/com/google/devtools/build/lib/skyframe/FileFunctionTest.java
- Modified line 898 to add expectedSizeparameter togetDigest.
- Modified line 900 to pass expectedSizetosuper.getDigest.
- Modified line 1819 to pass -1 as the expected size to getDigest.
 
- Modified line 898 to add 
- src/test/java/com/google/devtools/build/lib/unix/UnixDigestHashAttributeNameTest.java
- Modified line 46 to pass a file size of 123 to DigestUtils.getDigestWithManualFallback.
 
- Modified line 46 to pass a file size of 123 to 
- src/test/java/com/google/devtools/build/lib/vfs/DigestUtilsTest.java
- Modified line 52 to add expectedSizeparameter togetDigest.
- Modified line 62 to pass expectedSizetosuper.getDigest.
- Modified line 67 to pass -1 as the expected size to super.getDigest.
- Modified line 79 to pass fileSize1toDigestUtils.getDigestWithManualFallback.
- Modified line 83 to pass fileSize2toDigestUtils.getDigestWithManualFallback.
- Modified line 112 to add expectedSizeparameter togetDigest.
- Modified line 114 to pass expectedSizetosuper.getDigest.
- Modified line 124 to pass file.getFileSize()toDigestUtils.getDigestWithManualFallback.
- Modified line 129 to pass file.getFileSize()toDigestUtils.getDigestWithManualFallback.
- Modified line 137 to pass file.getFileSize()toDigestUtils.getDigestWithManualFallback.
- Modified line 160 to pass a file size of 8 to DigestUtils.manuallyComputeDigest.
 
- Modified line 52 to add 
- src/test/java/com/google/devtools/build/lib/vfs/FileSystemTest.java
- Added assertions to test that xFile.getDigestthrows anIOExceptionwhen the expected size does not match the actual size.
 
- Added assertions to test that 
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description | 
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. | 
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. | 
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. | 
| Help | /gemini help | Displays a list of available commands. | 
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A digest's tale,
Size known, no room to fail,
Integrity's keep.
Footnotes
- 
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩ 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
The pull request introduces a mechanism to verify that the number of bytes digested matches the expected size, enhancing the integrity of file digests in Bazel. This is a valuable addition to prevent potential shenanigans due to timing discrepancies. The changes are well-structured and address the core issue effectively.
Summary of Findings
- Method Naming Consistency: The introduction of getDigestWithManualFallbackWhenSizeUnknownalongsidegetDigestWithManualFallbackraises a question about naming consistency. Consider whether a single method with a default value for thefileSizeparameter would be more maintainable.
- Exception Message Clarity: The exception message in FileSystem.getDigestcould be improved by including the file size unit (e.g., bytes) for better clarity.
- Test Coverage: While the changes include modifications to several test files, it's important to ensure that the new functionality is adequately tested, especially the case where the digested bytes do not match the expected size.
Merge Readiness
The pull request is well-structured and addresses an important issue. However, before merging, it would be beneficial to address the naming consistency of the digest methods and enhance the clarity of the exception message. Also, ensure that the new functionality is adequately tested. I am unable to approve this pull request, and users should have others review and approve this code before merging. Given the medium severity issues, I recommend that the pull request not be merged until those are addressed.
| /** | ||
| * Gets the digest of {@code path}, using a constant-time xattr call if the filesystem supports | ||
| * it, and calculating the digest manually otherwise. | ||
| * | ||
| * <p>Unlike {@link #getDigestWithManualFallback}, will not rate-limit manual digesting of files, | ||
| * so only use this method if the file size is truly unknown and you don't expect many concurrent | ||
| * manual digests of large files. | ||
| * | ||
| * @param path Path of the file. | ||
| */ | ||
| public static byte[] getDigestWithManualFallbackWhenSizeUnknown( | ||
| Path path, XattrProvider xattrProvider) throws IOException { | ||
| return getDigestWithManualFallback(path, -1, xattrProvider); | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| assertThrows(IOException.class, () -> xFile.getDigest(size + 1)); | ||
| assertThrows(IOException.class, () -> xFile.getDigest(size - 1)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| @coderabbitai full review | 
| ✅ Actions performedFull review triggered. | 
| /do-refacto | 
| Refacto is reviewing this PR. Please wait for the review comments to be posted. | 
In Bazel, the time a file is
stated to find its size can be very far from the time the file's digest is computed, which creates a window for shenanigans. Allow passing the expected size intoPath.getDigestand raise an error there if the number of digested bytes doesn't match the previously seen size.Summary by CodeRabbit
New Features
Bug Fixes
Tests
Refactor