[8.7.0] Cherry-pick remote repo contents cache feature #28997
Open
fmeum wants to merge 23 commits into bazelbuild:release-8.7.0 from …
…seful

Non-functional changes only: remove Pair indirection in ExternalFilesHelper, extract getExternalRepoName() and getExternalDirectory() helpers, move addExternalFilesDependencies into ExternalFilesHelper, modernize switch expression in DirtinessCheckerUtils, formatting fixes. Does not include the functional behavior change of refetching repos on external modifications.

(cherry picked from commit 5e3f0c8)
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
93f7825 to 8691de4
… inputs

Ports the essential API changes from 41ccfef needed by later feature commits:
- Add RepoRecordedInput.WithValue record with parse/toString/escape/unescape
- Add overloaded isAnyValueOutdated(Environment, BlazeDirectories, List<WithValue>)
- Remove Comparable<RepoRecordedInput> and COMPARATOR (replaced by order preservation)
- Change TreeMap to LinkedHashMap in RepositoryDelegatorFunction for order preservation

(cherry picked from commit 41ccfef)
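The last bullet above relies on a general property of the two JDK map types: `TreeMap` iterates in key order, while `LinkedHashMap` iterates in insertion order. The following is a minimal sketch of that difference only, not Bazel code; the class name, method names, and the string keys are hypothetical stand-ins for recorded inputs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of order preservation; not part of Bazel. */
final class RecordedInputOrderSketch {
  private RecordedInputOrderSketch() {}

  /** Records keys into a LinkedHashMap, which iterates in insertion order. */
  static Map<String, String> recordInInsertionOrder(String... keys) {
    Map<String, String> recorded = new LinkedHashMap<>();
    for (String key : keys) {
      recorded.put(key, "value-of-" + key);
    }
    return recorded;
  }

  /** Records keys into a TreeMap, which iterates in natural (sorted) key order. */
  static Map<String, String> recordSorted(String... keys) {
    Map<String, String> recorded = new TreeMap<>();
    for (String key : keys) {
      recorded.put(key, "value-of-" + key);
    }
    return recorded;
  }

  /** Returns the keys in the order the map iterates over them. */
  static List<String> keysInIterationOrder(Map<String, String> recorded) {
    return new ArrayList<>(recorded.keySet());
  }
}
```

With the keys `FILE:b.txt`, `ENV:PATH`, `FILE:a.txt` recorded in that order, the `LinkedHashMap` variant yields them back unchanged, whereas the `TreeMap` variant reorders them, so the order in which inputs were recorded is lost.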
…nv handling

Ports the essential API changes from 01407ce needed by later feature commits:
- Add EnvironmentVariableValue record type
- Add RepoEnvironmentFunction with REPO_ENV + client env fallback
- Register REPOSITORY_ENVIRONMENT_VARIABLE in SkyFunctions and SkyframeExecutor
- Update EnvVar.getSkyKey() to use RepoEnvironmentFunction
- Update EnvVar.isOutdated() to use EnvironmentVariableValue

On 8.7.0, RepoEnvironmentFunction checks --repo_env first, then falls back to the client environment via ClientEnvironmentFunction, since the consolidated repo env computation from CommandEnvironment is not present.

(cherry picked from commit 01407ce)
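The lookup order described for 8.7.0 (explicit `--repo_env` entries first, then the client environment) can be sketched as a two-level map lookup. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, both environments are modeled as plain maps, and none of the Skyframe machinery of the real `RepoEnvironmentFunction` is represented.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a "repo env first, client env second" lookup; not Bazel code. */
final class RepoEnvLookupSketch {
  private RepoEnvLookupSketch() {}

  /**
   * Resolves {@code name} by consulting {@code repoEnv} first and only falling back to
   * {@code clientEnv} when the variable has no repo env entry at all.
   */
  static Optional<String> lookup(
      String name, Map<String, String> repoEnv, Map<String, String> clientEnv) {
    if (repoEnv.containsKey(name)) {
      return Optional.ofNullable(repoEnv.get(name));
    }
    return Optional.ofNullable(clientEnv.get(name));
  }
}
```

In this sketch a variable present in both maps resolves to the repo env value, a variable present only in the client environment resolves to the client value, and a variable present in neither yields an empty `Optional`.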
(cherry picked from commit 7b792b6)
(cherry picked from commit 9bb6c19)
Ports the essential changes from fe040a3:
- Rename DigestWriter.ruleKey to predeclaredInputHash and make it package-private (needed by later feature commits)
- Switch RepoRecordedInput.File, Dirents, DirTree, EnvVar types to implement Comparable and use ImmutableSortedMap
- Add ImmutableSortedMap Gson type adapter
- Update LockFileModuleExtension, RunnableExtension, and related types to use ImmutableSortedMap for recorded inputs

Does NOT include the change to fold environ values into the predeclared input hash computation itself; that requires CommandEnvironment changes not present on 8.7.0.

(cherry picked from commit fe040a3)
(cherry picked from commit e66fe55)
* Rename `RepoContentsCache` to `LocalRepoContentsCache`
* Generalize `RemoteRepositoryRemoteExecutorFactory` to `RemoteRepositoryHelperFactory`

Work towards bazelbuild#6359

Closes bazelbuild#27311.

PiperOrigin-RevId: 822553693
Change-Id: I1bad204340c06621cea806368d6bec99ca450a0f
(cherry picked from commit 32be423)
0b7b9e5 to 2e832ec
(cherry picked from commit b8589c3)
…test (cherry picked from commit 0336a868183ebcf27e3d4f7fdfac8c9f8b5b3ad3)
I haven't been able to reproduce this in a test, but this should fix the following crash observed while running `bazel info`:
```
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.NullPointerException: Cannot invoke "java.util.concurrent.ExecutorService.shutdownNow()" because "this.materializationExecutor" is null
at com.google.devtools.build.lib.remote.RemoteExternalOverlayFileSystem.afterCommand(RemoteExternalOverlayFileSystem.java:145)
at com.google.devtools.build.lib.remote.RemoteModule.afterCommand(RemoteModule.java:1034)
at com.google.devtools.build.lib.runtime.BlazeRuntime.afterCommand(BlazeRuntime.java:787)
at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:807)
at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:266)
at com.google.devtools.build.lib.server.GrpcServerImpl.executeCommand(GrpcServerImpl.java:608)
at com.google.devtools.build.lib.server.GrpcServerImpl.lambda$run$0(GrpcServerImpl.java:679)
at io.grpc.Context$1.run(Context.java:566)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
```
Closes bazelbuild#27690.
PiperOrigin-RevId: 833722608
Change-Id: I88c485a01e5967657ec3b5529a47639b743b18e6
(cherry picked from commit a7d0e91)
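The trace above shows `shutdownNow()` being called on an executor field that was never assigned, which can happen when a teardown hook runs for a command that never ran the matching setup. The commit message does not say how the actual fix works, so the following is only a generic sketch of one way to make such a teardown null-safe; the class, field handling, and return value are hypothetical and do not reflect the real `RemoteExternalOverlayFileSystem`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of a null-safe executor teardown; not the actual Bazel fix. */
final class OverlayLifecycleSketch {
  // Stays null until beforeCommand() has run for the current command.
  private ExecutorService materializationExecutor;

  /** Setup hook: creates the executor used during the command. */
  void beforeCommand() {
    materializationExecutor = Executors.newSingleThreadExecutor();
  }

  /**
   * Teardown hook: shuts the executor down if one exists.
   *
   * @return true if an executor was shut down, false if there was nothing to do
   */
  boolean afterCommand() {
    ExecutorService executor = materializationExecutor;
    if (executor == null) {
      // Setup never ran for this command, so there is nothing to tear down.
      return false;
    }
    executor.shutdownNow();
    materializationExecutor = null;
    return true;
  }
}
```

Calling `afterCommand()` on a fresh instance returns `false` instead of throwing, and calling it twice after a single `beforeCommand()` shuts the executor down only once.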
Don't print a message when it's successful. Users can always look under `external` to verify which repo came from the cache. Closes bazelbuild#27699. PiperOrigin-RevId: 834096735 Change-Id: I3916fb240218a6b68ecf48417142b998ca281598 (cherry picked from commit 3ca9ce1)
Fixes the creation of empty directories and also contains a speculative fix for the following issue observed during a sequence of real builds:
```
Error in path: Failed to materialize remote repo @@protoc-gen-validate+: [unix_jni.cc:302] /home/ubuntu/.cache/bazel/_bazel_ubuntu/123/external/protoc-gen-validate+/example-workspace/.bazelrc (File exists)
ERROR: //:foo :: Error loading option //:foo: error evaluating module extension @@gazelle+//:extensions.bzl%go_deps
```
The mentioned file is a symlink.
Closes bazelbuild#27711.
PiperOrigin-RevId: 836122472
Change-Id: I8becd8c3640a659d28dc433340db962c18563d9f
(cherry picked from commit b27ea05)
Ensures that the returned `Path` is still in the overlay file system.
Also make the error message emitted by `Path#checkSameFileSystem` more informative. This change was motivated by the following crash, observed when using the remote repo contents cache with an explicit `--sandbox_base`, and helped identify the above as its fix:
```
Caused by: java.lang.IllegalArgumentException: Files are on different filesystems: /dev/shm/bazel-sandbox.b10976335efa519b0184f3091ac8e21f7beefb92142303f9ab2c3341f45a2f28/linux-sandbox/18/execroot/_main/external/c-ares+/configs/ares_build.h (on com.google.devtools.build.lib.unix.UnixFileSystem@5e0a8154), /home/ubuntu/.cache/bazel/_bazel_ubuntu/123/execroot/_main/external/c-ares+/configs/ares_build.h (on com.google.devtools.build.lib.remote.RemoteExternalOverlayFileSystem@6cd9bfda)
at com.google.devtools.build.lib.vfs.Path.checkSameFileSystem(Path.java:964)
at com.google.devtools.build.lib.vfs.Path.createSymbolicLink(Path.java:523)
at com.google.devtools.build.lib.vfs.Path.createSymbolicLink(Path.java:535)
at com.google.devtools.build.lib.sandbox.SymlinkedSandboxedSpawn.copyFile(SymlinkedSandboxedSpawn.java:129)
```
Alternative to bazelbuild#27721
Closes bazelbuild#27802.
PiperOrigin-RevId: 837832265
Change-Id: I3b73167496b011aef66954d59ca3804b4b64996f
(cherry picked from commit 8eaf6a9)
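The error in the trace above names both paths together with the file system each one lives on, which is what makes the mismatch diagnosable. A minimal sketch of such a check follows, assuming a hypothetical `FakePath` record in place of Bazel's `Path`; the message format mirrors the one in the trace, but the code itself is illustrative only.

```java
/** Hypothetical sketch of a same-file-system precondition; not Bazel's Path class. */
final class SameFileSystemCheckSketch {
  private SameFileSystemCheckSketch() {}

  /** Stand-in for a path that knows which file system it belongs to. */
  record FakePath(String fileSystemName, String pathString) {}

  /**
   * Throws if the two paths belong to different file systems, naming both paths and both
   * file systems so the offending side is visible in the error.
   */
  static void checkSameFileSystem(FakePath first, FakePath second) {
    if (!first.fileSystemName().equals(second.fileSystemName())) {
      throw new IllegalArgumentException(
          String.format(
              "Files are on different filesystems: %s (on %s), %s (on %s)",
              first.pathString(),
              first.fileSystemName(),
              second.pathString(),
              second.fileSystemName()));
    }
  }
}
```

An operation such as a rename or symlink creation would call this check before touching either path, so a path that escaped the overlay file system fails fast with both sides identified.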
Fixes bazelbuild#27981
Fixes the following type of crash and, incidentally, a remote repo contents cache test that resulted in a related crash:
```
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.IllegalStateException: Unknown error during configuration creation evaluation
at com.google.devtools.build.lib.skyframe.SkyframeExecutor.getConfiguration(SkyframeExecutor.java:2143)
at com.google.devtools.build.lib.skyframe.SkyframeExecutor.createConfiguration(SkyframeExecutor.java:1876)
at com.google.devtools.build.lib.analysis.BuildView.update(BuildView.java:281)
at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.runAnalysisPhase(AnalysisPhaseRunner.java:399)
at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.execute(AnalysisPhaseRunner.java:144)
at com.google.devtools.build.lib.buildtool.BuildTool.buildTargetsWithoutMergedAnalysisExecution(BuildTool.java:512)
at com.google.devtools.build.lib.buildtool.BuildTool.buildTargets(BuildTool.java:414)
at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:907)
at com.google.devtools.build.lib.runtime.commands.CqueryCommand.exec(CqueryCommand.java:197)
at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:783)
at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:266)
at com.google.devtools.build.lib.server.GrpcServerImpl.executeCommand(GrpcServerImpl.java:608)
at com.google.devtools.build.lib.server.GrpcServerImpl.lambda$run$0(GrpcServerImpl.java:679)
at io.grpc.Context$1.run(Context.java:566)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: com.google.devtools.build.lib.skyframe.toolchains.PlatformLookupUtil$InvalidPlatformException: com.google.devtools.build.lib.packages.BuildFileNotFoundException: no such package '@@[unknown repo 'toolchains_llvm_boostrapped' requested from @@ (did you mean 'toolchains_llvm_bootstrapped'?)]//platforms': The repository '@@[unknown repo 'toolchains_llvm_boostrapped' requested from @@ (did you mean 'toolchains_llvm_bootstrapped'?)]' could not be resolved: No repository visible as '@toolchains_llvm_boostrapped' from main repository
at com.google.devtools.build.lib.analysis.platform.PlatformFunction.compute(PlatformFunction.java:75)
at com.google.devtools.build.lib.analysis.platform.PlatformFunction.compute(PlatformFunction.java:43)
at com.google.devtools.build.skyframe.ParallelEvaluator.bubbleErrorUp(ParallelEvaluator.java:414)
at com.google.devtools.build.skyframe.ParallelEvaluator.waitForCompletionAndConstructResult(ParallelEvaluator.java:207)
at com.google.devtools.build.skyframe.ParallelEvaluator.doMutatingEvaluation(ParallelEvaluator.java:173)
at com.google.devtools.build.skyframe.ParallelEvaluator.eval(ParallelEvaluator.java:672)
at com.google.devtools.build.skyframe.AbstractInMemoryMemoizingEvaluator.evaluate(AbstractInMemoryMemoizingEvaluator.java:182)
at com.google.devtools.build.lib.skyframe.SkyframeExecutor.evaluate(SkyframeExecutor.java:4279)
at com.google.devtools.build.lib.skyframe.SkyframeExecutor.lambda$evaluateSkyKeys$0(SkyframeExecutor.java:2278)
at com.google.devtools.build.lib.concurrent.Uninterruptibles.callUninterruptibly(Uninterruptibles.java:35)
at com.google.devtools.build.lib.skyframe.SkyframeExecutor.evaluateSkyKeys(SkyframeExecutor.java:2274)
at com.google.devtools.build.lib.skyframe.SkyframeExecutor.getConfiguration(SkyframeExecutor.java:2126)
... 16 more
```
Closes bazelbuild#28004.
PiperOrigin-RevId: 845941915
Change-Id: I6ead8dd1662efe90f529a6e21041a225882415dc
(cherry picked from commit d6dc631)
2e832ec to 40a98b0
`.bzl` files are typically small, but can form deep DAGs that require a large number of sequential cache requests to fetch lazily. By prefetching them (as well as `REPO.bazel` files) eagerly, the wall time of one particular fully cached cold `--nobuild` build of Bazel itself decreased by a factor of 5. Along the way, make remote repo contents cache failures non-fatal, matching the behavior of the remote cache. Closes bazelbuild#27910. PiperOrigin-RevId: 853153815 Change-Id: I368a14a845a8d9fb543f473d8c0c2178a4590c78 (cherry picked from commit 361c420)
…erbose_failures` Makes it easier to debug issues with this experimental feature and also matches the behavior of remote execution/caching. Work towards bazelbuild#27965 Closes bazelbuild#27970. PiperOrigin-RevId: 853238791 Change-Id: Id46ccbb105d93fd17114fab13b086d0b46139fb4 (cherry picked from commit fc5f160)
Ensures that files under repo contents cache entries are not reported as missing after the cache has been deleted while the Bazel server is running. See the long comment in `RepositoryFetchFunction` for why this happens and how it is fixed. Fixes bazelbuild#26450 Closes bazelbuild#28147. PiperOrigin-RevId: 853622194 Change-Id: Ifba953b72258030e0a640ac49947ac5c5fc7620a (cherry picked from commit 7019132)
* Also upload to the remote cache when the local cache is in use. The fix is simple but subtle: the logic for the two caches in `RepositoryFetchFunction` has to be flipped since the Skyframe restart after adding an entry to the local cache meant that the same code path would not be taken again.
* Fix a crash when using both by ensuring that the local repo contents cache uses the file system backing the output base, not the workspace directory:
```
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@@rules_python+' (requested by nodes 'REPO_FILE:@@rules_python+')
at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:552)
at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:435)
at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: Files are on different filesystems: C:/users/runneradmin/_bazel_runneradmin/ebfu7cpi/external/@rules_python+.marker (on com.google.devtools.build.lib.remote.RemoteExternalOverlayFileSystem@79583b9), C:/Users/runneradmin/.cache/bazel-repo/contents/_trash/26a5feef-bf8c-4326-bf3d-500997c7362e (on com.google.devtools.build.lib.windows.WindowsFileSystem@24180f0f)
at com.google.devtools.build.lib.vfs.Path.checkSameFileSystem(Path.java:964)
at com.google.devtools.build.lib.vfs.Path.renameTo(Path.java:630)
at com.google.devtools.build.lib.vfs.FileSystemUtils.moveFile(FileSystemUtils.java:456)
at com.google.devtools.build.lib.bazel.repository.cache.LocalRepoContentsCache.moveToCache(LocalRepoContentsCache.java:172)
at com.google.devtools.build.lib.bazel.repository.RepositoryFetchFunction.compute(RepositoryFetchFunction.java:297)
at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:471)
```
Closes bazelbuild#28002.
PiperOrigin-RevId: 855211557
Change-Id: I2f3c40a6aef594682fba989853f7ee982f30c294
(cherry picked from commit b143070)
…eValues Since this behavior is quite surprising (it definitely was to the author), this change also improves the test coverage for repo contents cache deletion by asserting that non-BUILD files within it actually exist on disk rather than merely existing from Skyframe's point of view. Also fix a crash observed while working on the test improvements. Closes bazelbuild#28222. PiperOrigin-RevId: 855225639 Change-Id: Ie4a88e93d14a4f4b7bb5217fc924e998a1779ccd (cherry picked from commit 4839f46)
Fixes bazelbuild#27517 by checking Skyframe deps in batches that stop right before any dep that may cause a cycle if checked while previous deps are out-of-date. This is accompanied by a restructuring of `RepoRecordedInput` that consolidates all Skyframe logic associated with the computation of the corresponding value exclusively within that class. This will also be helpful in adding support for dynamic inputs to the remote repo contents cache in future work. Also made the entirety of `RepositoryFetchFunction` use skyframe workers, so that checking the up-to-dateness of local repo contents cache entries isn't quadratic. Closes bazelbuild#28206. Co-authored-by: Xudong Yang <wyverald@gmail.com> PiperOrigin-RevId: 855252657 Change-Id: Ica18760ae79da5155fc0f3d8cd4f24c52a034c86 (cherry picked from commit 72a25a9) (cherry picked from commit 72a25a9)
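The batching idea in this commit message (check deps in groups, with each group ending right before a dep that could cause a cycle) can be sketched as a list-splitting helper. This is an assumption-laden illustration rather than the actual `RepoRecordedInput` logic: the class name, the generic signature, and the `mayCauseCycle` predicate are all hypothetical, and the real code operates on Skyframe keys rather than plain lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of cycle-aware batching; not the actual Bazel implementation. */
final class DepBatchingSketch {
  private DepBatchingSketch() {}

  /**
   * Splits {@code deps} into consecutive batches, closing the current batch right before
   * every dep for which {@code mayCauseCycle} holds. Such a dep therefore always starts a
   * new batch and is only checked after all earlier deps were found up to date.
   */
  static <T> List<List<T>> splitIntoBatches(List<T> deps, Predicate<? super T> mayCauseCycle) {
    List<List<T>> batches = new ArrayList<>();
    List<T> current = new ArrayList<>();
    for (T dep : deps) {
      if (mayCauseCycle.test(dep) && !current.isEmpty()) {
        batches.add(current);
        current = new ArrayList<>();
      }
      current.add(dep);
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }
}
```

A caller would request each batch in turn and stop at the first batch containing an out-of-date dep, so a potentially cycle-inducing dep is never requested while an earlier dep is already known to be stale.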
* The cache was always written to, even if not enabled.
* Google RBE doesn't accept `Command`s without the (deprecated) `Platform` field set. We set it both on `Command` and `Action`, just to be safe.

Fixes bazelbuild#28294 (comment)

Closes bazelbuild#28295.

PiperOrigin-RevId: 856169835
Change-Id: I2479119a173e325a7d39643a36536569f5f831fc
(cherry picked from commit a9946096847e22de98e0e11b1f5dfbb6ec6ecdbb)
…elbuild#28308) Important outputs and runfiles from external repos that are remote repo contents cache hits got stuck at various levels of the materialization pipeline for being source artifacts. This is fixed by consolidating the skip logic in a `RemoteOutputChecker` static helper. Closes bazelbuild#28308. PiperOrigin-RevId: 881618604 Change-Id: Ifaae8e39b0bcab3803653ca82bcf00d26c487316 (cherry picked from commit 16613f1)
40a98b0 to 4daa6c5
Summary
Cherry-picks the remote repo contents cache feature from `master` to `release-8.7.0`. This consists of:

The remote repo contents cache allows caching external repository contents in a remote cache (HTTP/gRPC), served via an in-memory overlay filesystem (`RemoteExternalOverlayFileSystem`). This builds on the local repo contents cache already present on `release-8.7.0`.

Prerequisite commits (manually ported)
- `5e3f0c8373`: Refactoring from "Make external repo file checking actually useful"
- `41ccfefb88`: Preserve order of recorded inputs (`ImmutableMap` → `ImmutableSortedMap`, `Comparable` on subclasses)
- `01407ce758`: Fix and consolidate repo env handling (`RepoEnvironmentFunction`, `EnvironmentVariableValue`)
- `fe040a3271`: Fold `environ` into the predeclared inputs hash

Feature commits (cherry-picked with `-x`)
- … `RemoteExternalOverlayFileSystem#resolveSymbolicLinks`
- … `getConfiguration`
- … `.bzl` files in the remote repo contents cache
- … `--verbose_failures`
- … `FileValue` staleness

Adaptation notes
Key structural differences from `master`:
- On `release-8.7.0`, `RepositoryDelegatorFunction.java` + `StarlarkRepositoryFunction.java` exist separately; on `master` they were merged into `RepositoryFetchFunction.java`
- `DigestWriter` is an inner class of `RepositoryDelegatorFunction` on `release-8.7.0`; it's a separate file on `master`
- `RepoEnvironmentFunction` on `release-8.7.0` checks `PrecomputedValue.REPO_ENV` first, then falls back to `ClientEnvironmentFunction` (vs `master`'s consolidated approach)

Test plan
🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com