Commit 032f18e

Merge pull request #9 from bazel-ios/jackies/upgrade-bazel-buildfarm-to-v2.6.1
2 parents d80cf31 + ee77a49 commit 032f18e

84 files changed, +1355 -504 lines changed

_site/docs/architecture/content_addressable_storage.md

Lines changed: 12 additions & 11 deletions
@@ -38,9 +38,9 @@ This is the example presentation of a CAS in the memory instance available [here

 ```
 worker:
-  cas:
-    type: MEMORY
-    maxSizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
+  storages:
+    - type: MEMORY
+      maxSizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
 ```

 ## GRPC
@@ -53,9 +53,11 @@ A grpc config example is available in the alternate instance specification in th
 server:
   name: shard
 worker:
-  cas:
-    type: GRPC
-    target:
+  storages:
+    - type: FILESYSTEM
+      path: "cache"
+    - type: GRPC
+      target:
 ```

 ## HTTP/1
@@ -89,11 +91,10 @@ The CASFileCache is also available on MemoryInstance servers, where it can repre

 ```
 worker:
-  cas:
-    type: FILESYSTEM
-    path: "cache"
-    maxSizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
-    maxEntrySizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
+  storages:
+    - type: FILESYSTEM
+      path: "cache"
+      maxSizeBytes: 2147483648 # 2 * 1024 * 1024 * 1024
 ```

 CASTest is a standalone tool to load the cache and print status information about it.
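
Reviewer note: the three hunks above all make the same structural change, turning the single `worker.cas` mapping into a `worker.storages` list. The sketch below illustrates that reshaping on an already-parsed config. It is an assumption-laden illustration, not a tool shipped with buildfarm: `migrate_worker_cas` is a hypothetical name, and every key is passed through unchanged because the diff does not say whether `maxEntrySizeBytes` is still accepted.

```python
import copy


def migrate_worker_cas(config):
    """Return a copy of `config` with worker.cas rewritten as worker.storages.

    Hypothetical helper: the old single mapping becomes the only entry of the
    new list, mirroring the shape change shown in the documentation diff.
    """
    migrated = copy.deepcopy(config)
    worker = migrated.get("worker") or {}
    if "cas" in worker and "storages" not in worker:
        # Keys are copied as-is; no field is renamed or dropped here.
        worker["storages"] = [worker.pop("cas")]
    return migrated


old_config = {
    "worker": {
        "cas": {"type": "FILESYSTEM", "path": "cache", "maxSizeBytes": 2147483648},
    }
}
new_config = migrate_worker_cas(old_config)
print(new_config["worker"]["storages"])
```

The input is left untouched, so the old and new shapes can be compared side by side before a config file is rewritten.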

_site/docs/configuration/configuration.md

Lines changed: 32 additions & 30 deletions
Large diffs are not rendered by default.

_site/docs/metrics/metrics.md

Lines changed: 4 additions & 0 deletions
@@ -124,6 +124,10 @@ Gauge for the number of operations in each stage (using a stage_name for each in

 Gauge for the completed operations status (using a status_code label for each individual GRPC code)

+**operation_exit_code**
+
+Gauge for the completed operations exit code (using a exit_code label for each individual execution exit code)
+
 **operation_worker**

 Gauge for the number of operations executed on each worker (using a worker_name label for each individual worker)
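
Reviewer note: for readers unfamiliar with labeled gauges, the sketch below shows roughly what an `operation_exit_code` series could look like in the Prometheus text exposition format, with one sample per `exit_code` label value. This is not buildfarm's implementation. The function name is hypothetical, and treating the gauge value as a count of completed operations per exit code is an assumption drawn from the one-line description above.

```python
from collections import Counter


def render_exit_code_gauge(exit_codes):
    """Render per-exit-code counts as Prometheus text-format samples.

    Hypothetical illustration: one sample line per distinct exit code,
    labeled with exit_code, as the metric description suggests.
    """
    counts = Counter(exit_codes)
    lines = ["# TYPE operation_exit_code gauge"]
    for code in sorted(counts):
        lines.append('operation_exit_code{exit_code="%d"} %d' % (code, counts[code]))
    return "\n".join(lines)


sample = render_exit_code_gauge([0, 0, 1, 0, 127])
print(sample)
```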

_site/docs/quick_start.md

Lines changed: 32 additions & 16 deletions
@@ -10,7 +10,7 @@ Here we describe how to use bazel remote caching or remote execution with buildf

 ## Setup

-You can run this quick start on a single computer running nearly any flavor of linux. This computer is the localhost for the rest of the description.
+You can run this quick start on a single computer running any flavor of linux that bazel supports. A C++ compiler is used here to demonstrate action execution. This computer is the localhost for the rest of the description.

 ### Backplane

@@ -44,33 +44,43 @@ cc_binary(

 And an empty WORKSPACE file.

-As a test, verify that `bazel run :main` builds your main program and runs it, and prints `Hello, World!`. This will ensure that you have properly installed bazel and a C++ compiler, and have a working target before moving on to remote execution.
+As a test, verify that `bazel run :main` builds your main program and runs it, and prints `Hello, World!`. This will ensure that you have properly installed `bazel` and a C++ compiler, and have a working target before moving on to remote caching or remote execution.

 Download and extract the buildfarm repository. Each command sequence below will have the intended working directory indicated, between the client (workspace running bazel), and buildfarm.

 This tutorial assumes that you have a bazel binary in your path and you are in the root of your buildfarm clone/release, and has been tested to work with bash on linux.

 ## Remote Caching

-A Buildfarm server with an instance can be used strictly as an ActionCache and ContentAddressableStorage to improve build performance. This is an example of running a bazel client that will retrieve results if available, and store them if the cache is missed and the execution needs to run locally.
+A Buildfarm cluster can be used strictly as an ActionCache (AC) and ContentAddressableStorage (CAS) to improve build performance. This is an example of running a bazel client that will retrieve results if available, otherwise store them on a cache miss after executing locally.

 Download the buildfarm repository and change into its directory, then:

-run `bazelisk run src/main/java/build/buildfarm:buildfarm-server $PWD/examples/config.minimal.yml`
+ * run `bazel run src/main/java/build/buildfarm:buildfarm-server $PWD/examples/config.minimal.yml`

 This will wait while the server runs, indicating that it is ready for requests.

-From another prompt (i.e. a separate terminal) in your newly created workspace directory from above:
+A server alone does not itself store the content of action results. It acts as an endpoint for any number of workers that present storage, so we must also start a single worker.

-run `bazel clean`
-run `bazel run --remote_cache=grpc://localhost:8980 :main`
+From another prompt (i.e. a separate terminal) in the buildfarm repository directory:
+
+ * run `bazel run src/main/java/build/buildfarm:buildfarm-shard-worker -- --prometheus_port=9091 $PWD/examples/config.minimal.yml`
+
+The `--` option is bazel convention to treat all subsequent arguments as parameters to the running app, like our `--prometheus_port`, instead of interpreting them with `run`
+The `--prometheus_port=9091` option allows this worker to run alongside our server, who will have started and logged that it has started a service on port `9090`. You can also turn this option off (with `--` separator), with `--prometheus_option=0` for either server or worker.
+This will also wait while the worker runs, indicating it will be available to store cache content.
+
+From another prompt in your newly created workspace directory from above:
+
+ * run `bazel clean`
+ * run `bazel run --remote_cache=grpc://localhost:8980 :main`

 Why do we clean here? Since we're verifying re-execution and caching, this ensures that we will execute any actions in the `run` step and interact with the remote cache. We should be attempting to retrieve cached results, and then when we miss - since we just started this memory resident server - bazel will upload the results of the execution for later use. There will be no change in the output of this bazel run if everything worked, since bazel does not provide output each time it uploads results.

 To prove that we have placed something in the action cache, we need to do the following:

-run `bazel clean`
-run `bazel run --remote_cache=localhost:8980 :main`
+ * run `bazel clean`
+ * run `bazel run --remote_cache=localhost:8980 :main`

 This should now print statistics on the `processes` line that indicate that you've retrieved results from the cache for your actions:

@@ -80,20 +90,22 @@ INFO: 2 processes: 2 remote cache hit.

 ## Remote Execution (and caching)

-Now we will use buildfarm for remote execution with a minimal configuration - a single memory instance, with a worker on the localhost that can execute a single process at a time - via a bazel invocation on our workspace.
+Now we will use buildfarm for remote execution with a minimal configuration with a worker on the localhost that can execute a single process at a time, via a bazel invocation on our workspace.

-First, we should restart the buildfarm server to ensure that we get remote execution (this can also be forced from the client by using `--noremote_accept_cached`). From the buildfarm server prompt and directory:
+First, to clean out the results from the previous cached actions, flush your local redis database:

-interrupt a running `buildfarm-server`
-run `bazelisk run src/main/java/build/buildfarm:buildfarm-server $PWD/examples/config.minimal.yml`
+ * run `redis-cli flushdb`

-From another prompt in the buildfarm repository directory:
+Next, we should restart the buildfarm server, and delete the worker's cas storage to ensure that we get remote execution (this can also be forced from the client by using `--noremote_accept_cached`). From the buildfarm server prompt and directory:

-run `bazelisk run src/main/java/build/buildfarm:buildfarm-shard-worker $PWD/examples/config.minimal.yml`
+ * interrupt the running `buildfarm-server` (i.e. Ctrl-C)
+ * run `bazel run src/main/java/build/buildfarm:buildfarm-server $PWD/examples/config.minimal.yml`
+
+You can leave the worker running from the Remote Caching step, it will not require a restart

 From another prompt, in your client workspace:

-run `bazel run --remote_executor=grpc://localhost:8980 :main`
+ * run `bazel run --remote_executor=grpc://localhost:8980 :main`

 Your build should now print out the following on its `processes` summary line:

@@ -117,6 +129,10 @@ To stop the containers, run:
 ./examples/bf-run stop
 ```

+## Next Steps
+
+We've started our worker on the same host as our server, and also the same host on which we built with bazel, but these services can be spread across many machines, per 'remote'. A large number of workers, with a relatively small number of servers (10:1 and 100:1 ratios have been used in practice), consolidating large disks and beefy multicore cpus/gpus on workers, with specialization of what work they perform for bazel builds (or other client work), and specializing servers to have hefty network connections to funnel content traffic. A buildfarm deployment can service hundreds or thousands of developers or CI processes, enabling them to benefit from each others' shared context in the AC/CAS, and the pooled execution of a fleet of worker hosts eager to consume operations and deliver results.
+
 ## Buildfarm Manager

 You can now easily launch a new Buildfarm cluster locally or in AWS using an open sourced [Buildfarm Manager](https://github.com/80degreeswest/bfmgr).
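
Reviewer note: the quick start asks the reader to confirm cache hits by reading bazel's `processes` summary line (`INFO: 2 processes: 2 remote cache hit.`). A scripted check could look like the sketch below. It assumes the summary keeps the shape of that one example, a total followed by comma-separated `<count> <kind>` entries; other bazel versions may word the line differently, and `parse_process_summary` is a hypothetical helper.

```python
import re


def parse_process_summary(line):
    """Parse a bazel 'processes' summary line into (total, {kind: count}).

    Returns None when the line does not look like a summary line. The
    format is inferred from the single example in the quick start.
    """
    match = re.match(r"INFO: (\d+) process(?:es)?: (.*?)\.?$", line.strip())
    if match is None:
        return None
    breakdown = {}
    for part in match.group(2).split(","):
        count, _, kind = part.strip().partition(" ")
        if count.isdigit() and kind:
            breakdown[kind] = int(count)
    return int(match.group(1)), breakdown


summary = parse_process_summary("INFO: 2 processes: 2 remote cache hit.")
print(summary)
```

A check such as `summary[1].get("remote cache hit", 0) == summary[0]` then states the quick start's success criterion, that every action was served from the cache.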

defs.bzl

Lines changed: 5 additions & 5 deletions
@@ -96,7 +96,7 @@ def buildfarm_init(name = "buildfarm"):
             "com.google.errorprone:error_prone_annotations:2.9.0",
             "com.google.errorprone:error_prone_core:0.92",
             "com.google.guava:failureaccess:1.0.1",
-            "com.google.guava:guava:31.1-jre",
+            "com.google.guava:guava:32.1.1-jre",
             "com.google.j2objc:j2objc-annotations:1.1",
             "com.google.jimfs:jimfs:1.1",
             "com.google.protobuf:protobuf-java-util:3.10.0",
@@ -108,8 +108,8 @@ def buildfarm_init(name = "buildfarm"):
             "io.github.lognet:grpc-spring-boot-starter:4.5.4",
             "org.bouncycastle:bcprov-jdk15on:1.70",
             "net.jcip:jcip-annotations:1.0",
-        ] + ["io.netty:netty-%s:4.1.90.Final" % module for module in IO_NETTY_MODULES] +
-        ["io.grpc:grpc-%s:1.53.0" % module for module in IO_GRPC_MODULES] +
+        ] + ["io.netty:netty-%s:4.1.94.Final" % module for module in IO_NETTY_MODULES] +
+        ["io.grpc:grpc-%s:1.56.1" % module for module in IO_GRPC_MODULES] +
         [
             "io.prometheus:simpleclient:0.10.0",
             "io.prometheus:simpleclient_hotspot:0.10.0",
@@ -139,8 +139,8 @@ def buildfarm_init(name = "buildfarm"):
         ],
         generate_compat_repositories = True,
         repositories = [
-            "https://repo.maven.apache.org/maven2",
-            "https://jcenter.bintray.com",
+            "https://repo1.maven.org/maven2",
+            "https://mirrors.ibiblio.org/pub/mirrors/maven2",
         ],
     )
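
Reviewer note: the netty and grpc bumps in this file each touch one template string because the artifact list is generated per module. The sketch below shows how such a template expands into Maven coordinates. The module lists are hypothetical stand-ins, since the real contents of `IO_NETTY_MODULES` and `IO_GRPC_MODULES` are not visible in this diff, and `versioned` is an illustrative helper, not part of `defs.bzl`.

```python
# Hypothetical subsets; the real module lists are defined elsewhere in defs.bzl.
IO_NETTY_MODULES = ["buffer", "codec"]
IO_GRPC_MODULES = ["api", "core"]


def versioned(template, modules):
    """Expand one coordinate template into one Maven coordinate per module."""
    return [template % module for module in modules]


artifacts = (
    versioned("io.netty:netty-%s:4.1.94.Final", IO_NETTY_MODULES) +
    versioned("io.grpc:grpc-%s:1.56.1", IO_GRPC_MODULES)
)
for coordinate in artifacts:
    print(coordinate)
```

Keeping the version inside a single template means every module of a family moves together, which is why the diff needs only two changed lines for the two upgrades.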

deps.bzl

Lines changed: 27 additions & 8 deletions
@@ -13,10 +13,10 @@ def archive_dependencies(third_party):
         {
             "name": "platforms",
             "urls": [
-                "https://mirror.bazel.build/github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
-                "https://github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
+                "https://mirror.bazel.build/github.com/bazelbuild/platforms/releases/download/0.0.7/platforms-0.0.7.tar.gz",
+                "https://github.com/bazelbuild/platforms/releases/download/0.0.7/platforms-0.0.7.tar.gz",
             ],
-            "sha256": "5308fc1d8865406a49427ba24a9ab53087f17f5266a7aabbfc28823f3916e1ca",
+            "sha256": "3a561c99e7bdbe9173aa653fd579fe849f1d8d67395780ab4770b1f381431d51",
         },
         {
             "name": "rules_jvm_external",
@@ -55,9 +55,9 @@ def archive_dependencies(third_party):
         # Needed for @grpc_java//compiler:grpc_java_plugin.
         {
             "name": "io_grpc_grpc_java",
-            "sha256": "78bf175f9a8fa23cda724bbef52ad9d0d555cdd1122bcb06484b91174f931239",
-            "strip_prefix": "grpc-java-1.54.1",
-            "urls": ["https://github.com/grpc/grpc-java/archive/v1.54.1.zip"],
+            "sha256": "b8fb7ae4824fb5a5ae6e6fa26ffe2ad7ab48406fdeee54e8965a3b5948dd957e",
+            "strip_prefix": "grpc-java-1.56.1",
+            "urls": ["https://github.com/grpc/grpc-java/archive/v1.56.1.zip"],
         },
         {
             "name": "rules_pkg",
@@ -111,10 +111,29 @@ def archive_dependencies(third_party):
             "patch_args": ["-p1"],
             "patches": ["%s:clang_toolchain.patch" % third_party],
         },
+
+        # Used to build release container images
         {
             "name": "io_bazel_rules_docker",
             "sha256": "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
             "urls": ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
+            "patch_args": ["-p0"],
+            "patches": ["%s:docker_go_toolchain.patch" % third_party],
+        },
+
+        # Updated versions of io_bazel_rules_docker dependencies for bazel compatibility
+        {
+            "name": "io_bazel_rules_go",
+            "sha256": "278b7ff5a826f3dc10f04feaf0b70d48b68748ccd512d7f98bf442077f043fe3",
+            "urls": [
+                "https://mirror.bazel.build/github.com/bazelbuild/rules_go/releases/download/v0.41.0/rules_go-v0.41.0.zip",
+                "https://github.com/bazelbuild/rules_go/releases/download/v0.41.0/rules_go-v0.41.0.zip",
+            ],
+        },
+        {
+            "name": "bazel_gazelle",
+            "sha256": "d3fa66a39028e97d76f9e2db8f1b0c11c099e8e01bf363a923074784e451f809",
+            "urls": ["https://github.com/bazelbuild/bazel-gazelle/releases/download/v0.33.0/bazel-gazelle-v0.33.0.tar.gz"],
         },

         # Bazel is referenced as a dependency so that buildfarm can access the linux-sandbox as a potential execution wrapper.
@@ -188,9 +207,9 @@ def buildfarm_dependencies(repository_name = "build_buildfarm"):
     maybe(
         http_jar,
         "opentelemetry",
-        sha256 = "0523287984978c091be0d22a5c61f0bce8267eeafbbae58c98abaf99c9396832",
+        sha256 = "eccd069da36031667e5698705a6838d173d527a5affce6cc514a14da9dbf57d7",
         urls = [
-            "https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v1.11.0/opentelemetry-javaagent.jar",
+            "https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v1.28.0/opentelemetry-javaagent.jar",
         ],
     )
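
Reviewer note: every version bump in this file changes a pinned `sha256` alongside the URL, so a reviewer may want to recompute the digest of a downloaded archive before approving. The sketch below hashes a binary stream in chunks and compares the result to a pin. It is a generic illustration, not part of buildfarm; the helper names are hypothetical, and the demo uses the standard `abc` SHA-256 test vector rather than any archive from this diff.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary stream in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_pin(stream, pinned_sha256):
    """Return True when the stream's digest equals the pinned hex digest."""
    return sha256_of_stream(stream) == pinned_sha256.lower()


# Well-known SHA-256 test vector for the bytes b"abc", used only for the demo.
ABC_SHA256 = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
print(matches_pin(io.BytesIO(b"abc"), ABC_SHA256))
```

For a real check, the stream would be a downloaded archive opened with `open(path, "rb")` and the pin would be the `sha256` value from the corresponding entry.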

examples/config.yml

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,6 @@ server:
   admin:
     deploymentEnvironment: AWS
     clusterEndpoint: "grpc://localhost"
-    enableGracefulShutdown: false
   metrics:
     publisher: LOG
     logLevel: FINEST
@@ -126,6 +125,7 @@ worker:
   onlyMulticoreTests: false
   allowBringYourOwnContainer: false
   errorOperationRemainingResources: false
+  gracefulShutdownSeconds: 0
   sandboxSettings:
     alwaysUse: false
     selectForBlockNetwork: false
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+charts
+Chart.lock
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# Patterns to ignore when building packages.
+# This supports shell glob matching, relative path matching, and
+# negation (prefixed with !). Only one pattern per line.
+.DS_Store
+# Common VCS dirs
+.git/
+.gitignore
+.bzr/
+.bzrignore
+.hg/
+.hgignore
+.svn/
+# Common backup files
+*.swp
+*.bak
+*.tmp
+*.orig
+*~
+# Various IDEs
+.project
+.idea/
+*.tmproj
+.vscode/
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+apiVersion: v2
+name: buildfarm
+description: A Helm chart for bazel buildfarm
+
+# A chart can be either an 'application' or a 'library' chart.
+#
+# Application charts are a collection of templates that can be packaged into versioned archives
+# to be deployed.
+#
+# Library charts provide useful utilities or functions for the chart developer. They're included as
+# a dependency of application charts to inject those utilities and functions into the rendering
+# pipeline. Library charts do not define any templates and therefore cannot be deployed.
+type: application
+
+# This is the chart version. This version number should be incremented each time you make changes
+# to the chart and its templates, including the app version.
+# Versions are expected to follow Semantic Versioning (https://semver.org/)
+version: 0.1.0
+
+# This is the version number of the application being deployed. This version number should be
+# incremented each time you make changes to the application. Versions are not expected to
+# follow Semantic Versioning. They should reflect the version the application is using.
+# It is recommended to use it with quotes.
+appVersion: "v2.5.0"
+
+dependencies:
+  - condition: redis.enabled
+    name: redis
+    repository: https://charts.helm.sh/stable
+    version: 10.5.7
