Commit eff7820
[CI] Upstream premerge terraform configuration
This patch contains the terraform configuration for the premerge infrastructure. Actually applying it requires someone from within Google with access to the GCP project, but the entire infrastructure is described by the TF in this patch.
1 parent 1290e95 commit eff7820

File tree

7 files changed: +589 -0 lines changed

.ci/infrastructure/README.md

Lines changed: 31 additions & 0 deletions
# Premerge Infrastructure

This folder contains the terraform configuration files that define the GCP
resources used to run the premerge checks. Currently, only Google employees
with access to the GCP project where these checks are hosted are able to apply
changes. Pull requests from anyone are still welcome.

## Setup

- install terraform (https://developer.hashicorp.com/terraform/install?product_intent=terraform)
- get the GCP tokens: `gcloud auth application-default login`
- initialize terraform: `terraform init` (see the provider sketch after these steps)

To apply any changes to the cluster:
- set up the cluster: `terraform apply`
- terraform will list the proposed changes.
- enter 'yes' when prompted.
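The `terraform init` step downloads the providers the configuration declares. As a rough orientation, the provider setup behind this likely resembles the following minimal sketch; the project id, region, and the exact provider set are placeholders, not taken from this patch:

```
terraform {
  required_providers {
    google     = { source = "hashicorp/google" }
    kubernetes = { source = "hashicorp/kubernetes" }
    helm       = { source = "hashicorp/helm" }
  }
}

provider "google" {
  project = "llvm-premerge-project" # placeholder project id
  region  = "us-central1"           # placeholder region
}
```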
## Setting the cluster up for the first time

```
terraform apply -target google_container_node_pool.llvm_premerge_linux_service
terraform apply -target google_container_node_pool.llvm_premerge_linux
terraform apply -target google_container_node_pool.llvm_premerge_windows
terraform apply
```

Setting the cluster up for the first time is more involved, as there are
certain resources for which terraform is unable to handle explicit
dependencies. This means we have to set up the GKE cluster before we set up
any of the Kubernetes resources, as otherwise the Terraform Kubernetes
provider will error out.
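The dependency problem comes from the Kubernetes provider being configured from the GKE cluster's outputs, and Terraform evaluating provider configurations before it can plan the resources they manage. A minimal sketch of the pattern, assuming hypothetical cluster/node-pool names, region, and machine type; only the node pool resource addresses above and the `premerge-platform: linux` taint/label (which match the tolerations and nodeSelector in the values files below) are from this patch:

```
resource "google_container_cluster" "llvm_premerge" {
  name                     = "llvm-premerge-cluster" # hypothetical name
  location                 = "us-central1"           # hypothetical region
  remove_default_node_pool = true
  initial_node_count       = 1
}

# One of the node pools targeted during the first apply.
resource "google_container_node_pool" "llvm_premerge_linux" {
  name    = "llvm-premerge-linux" # hypothetical name
  cluster = google_container_cluster.llvm_premerge.id

  node_config {
    machine_type = "n2-standard-64" # hypothetical machine type
    labels = {
      "premerge-platform" = "linux"
    }
    taint {
      key    = "premerge-platform"
      value  = "linux"
      effect = "NO_SCHEDULE"
    }
  }
}

# The Kubernetes provider reads the cluster's endpoint and CA certificate,
# which is why the cluster and node pools must exist before any kubernetes_*
# resource can be planned.
data "google_client_config" "default" {}

provider "kubernetes" {
  host  = "https://${google_container_cluster.llvm_premerge.endpoint}"
  token = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(
    google_container_cluster.llvm_premerge.master_auth[0].cluster_ca_certificate
  )
}
```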
Lines changed: 41 additions & 0 deletions
```
metrics:
  enabled: true
  alloy:
    metricsTuning:
      useIntegrationAllowList: true
  cost:
    enabled: true
  kepler:
    enabled: true
  node-exporter:
    enabled: true
logs:
  enabled: true
  pod_logs:
    enabled: true
  cluster_events:
    enabled: true
traces:
  enabled: true
receivers:
  grpc:
    enabled: true
  http:
    enabled: true
  zipkin:
    enabled: true
  grafanaCloudMetrics:
    enabled: false
opencost:
  enabled: true
kube-state-metrics:
  enabled: true
prometheus-node-exporter:
  enabled: true
prometheus-operator-crds:
  enabled: true
kepler:
  enabled: true
alloy: {}
alloy-events: {}
alloy-logs: {}
```
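These values enable metrics, logs, and traces collection, and their layout matches Grafana's k8s-monitoring Helm chart (alloy, kepler, opencost, kube-state-metrics, and prometheus-node-exporter are its subcharts). If that chart is indeed what the terraform deploys, the Helm release could look roughly like this sketch; the release name, namespace, and values file name are assumptions:

```
resource "helm_release" "grafana_k8s_monitoring" {
  name       = "grafana-k8s-monitoring" # hypothetical release name
  repository = "https://grafana.github.io/helm-charts"
  chart      = "k8s-monitoring"
  namespace  = "grafana" # hypothetical namespace
  # Hypothetical file name for the values shown above.
  values = [file("metrics_values.yaml")]
}
```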
Lines changed: 26 additions & 0 deletions
```
spec:
  tolerations:
  - key: "premerge-platform"
    operator: "Equal"
    value: "linux"
    effect: "NoSchedule"
  nodeSelector:
    premerge-platform: linux
  containers:
  - name: $job
    resources:
      # The job container is always scheduled on the same node as the runner.
      # Since we use the runner's requests.cpu for scheduling/autoscaling,
      # the request here should be set to something small.
      #
      # The limit, however, should be the number of cores of the node. Any
      # limit lower than the number of cores could slow down the job.
      #
      # For memory, the request/limits should be accurate. Memory is not
      # used for scheduling, but can be used by Kubernetes for OOM kills.
      requests:
        cpu: "100m"
        memory: "50Gi"
      limits:
        cpu: 56
        memory: "100Gi"
```
Lines changed: 74 additions & 0 deletions
```
githubConfigUrl: "https://github.com/keenuts-test-org/llvm-ci-testing"
githubConfigSecret: "github-token"

minRunners: 0
maxRunners: 4

containerMode:
  type: "kubernetes"
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "standard-rwo"
    resources:
      requests:
        storage: "100Gi"
  kubernetesModeServiceAccount:
    annotations:

template:
  spec:
    tolerations:
    - key: "premerge-platform"
      operator: "Equal"
      value: "linux"
      effect: "NoSchedule"
    nodeSelector:
      premerge-platform: linux
    containers:
    - name: runner
      image: ghcr.io/actions/actions-runner:latest
      command: ["/home/runner/run.sh"]
      resources:
        # The job container will be scheduled on the same node as this
        # runner. This means that if we don't set the CPU request high
        # enough here, two runners can be scheduled on the same node,
        # meaning two jobs sharing the node's resources.
        #
        # This number should be:
        # - greater than number_of_cores / 2:
        #   A value lower than that could allow the scheduler to put two
        #   runners on the same node, meaning two jobs sharing the
        #   resources.
        # - lower than number_of_cores:
        #   Each node has some basic services running (metrics, for
        #   example). Those already require some amount of CPU (~0.5
        #   cores), so we don't have exactly N cores to allocate, but
        #   N - epsilon.
        #
        # Memory, however, is handled at the container level. The runner
        # itself doesn't need much; this is just enough not to get OOM
        # killed.
        requests:
          cpu: 50
          memory: "2Gi"
        limits:
          cpu: 56
          memory: "2Gi"
      env:
        - name: ACTIONS_RUNNER_CONTAINER_HOOKS
          value: /home/runner/k8s/index.js
        - name: ACTIONS_RUNNER_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
          value: "true"
        - name: ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE
          value: "/home/runner/pod-config/linux-container-pod-template.yaml"
      volumeMounts:
        - name: container-pod-config
          mountPath: /home/runner/pod-config
    securityContext:
      fsGroup: 123
    volumes:
      - name: container-pod-config
        configMap:
          name: linux-container-pod-template
```
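These are values for GitHub's gha-runner-scale-set Helm chart (Actions Runner Controller) in kubernetes container mode. If the terraform installs it via the Helm provider, the release could be a sketch like the following; the release name, namespace, and local values file name are assumptions:

```
resource "helm_release" "llvm_premerge_linux_runners" {
  name      = "llvm-premerge-linux-runners" # hypothetical release name
  namespace = "llvm-premerge-linux-runners" # hypothetical namespace
  chart     = "oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set"
  # Hypothetical file name for the values shown above.
  values = [file("linux_runners_values.yaml")]
}
```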
