Commit 151eda0

Add randomized suffix to runner names
BPF CI runners went down recently because of a "Runner not found" error returned by a GitHub broker service [1]. This error appears to be triggered when a runner name is reused for what GitHub considers a different runner. BPF CI uses ephemeral runners, which re-register every time the corresponding runner container is restarted.

Whatever the root cause, a working mitigation is to use unique runner names. Fortunately, entrypoint.sh [1] already has logic for generating a random suffix, so we only need to set the relevant environment variables.

Additionally, set RestartPreventExitStatus=199 in the runner systemd service to prevent error looping in case we run out of tokens [2].

[1] https://github.com/myoung34/docker-github-actions-runner/blob/2.323.0/entrypoint.sh
[2] kernel-patches/runner#75

Signed-off-by: Ihor Solodrai <[email protected]>
1 parent 6c34600 commit 151eda0

1 file changed: +3 -1 lines changed

ansible/roles/runner/tasks/main.yml

Lines changed: 3 additions & 1 deletion
@@ -105,7 +105,8 @@
 RUNNER_SCOPE=org
 ORG_NAME={{ item.0.name }}
 {% endif %}
-RUNNER_NAME={{ runner_name_prefix }}-{{ 'worker-%02d' | format(item.1) }}
+RUNNER_NAME_PREFIX={{ runner_name_prefix }}-{{ 'worker-%02d' | format(item.1) }}
+RANDOM_RUNNER_SUFFIX=true
 DISABLE_AUTO_UPDATE=true
 loop: "{{ runners | subelements('workers') }}"
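
To illustrate the effect of this hunk, here is a minimal Python sketch of how a fixed per-worker prefix combined with a random suffix gives a different runner name on every registration. The prefix formatting mirrors the Jinja expression in the template. The suffix scheme (13 lowercase alphanumeric characters), the function names, and the `bpf-ci` prefix are all assumptions made for illustration; the actual logic lives in entrypoint.sh and may differ.

```python
import secrets
import string


def render_prefix(runner_name_prefix: str, worker_index: int) -> str:
    """Mirror the template expression
    {{ runner_name_prefix }}-{{ 'worker-%02d' | format(item.1) }}."""
    return f"{runner_name_prefix}-{'worker-%02d' % worker_index}"


def unique_runner_name(prefix: str, suffix_len: int = 13) -> str:
    """Append a random suffix to the prefix.

    Hypothetical scheme: the real entrypoint.sh may use a different
    alphabet, length, or source of randomness.
    """
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_len))
    return f"{prefix}-{suffix}"


if __name__ == "__main__":
    # "bpf-ci" is a made-up value for runner_name_prefix.
    prefix = render_prefix("bpf-ci", 3)
    # Two registrations of the same worker get two different names.
    print(unique_runner_name(prefix))
    print(unique_runner_name(prefix))
```

Because the prefix stays stable, a runner can still be traced to its host and worker slot, while the suffix keeps each re-registration from colliding with a previously registered name.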

@@ -139,6 +140,7 @@
 [Service]
 TimeoutStartSec=0
 Restart=always
+RestartPreventExitStatus=199
 EnvironmentFile={{ runner_base_dir }}/runner_unit.env
 # Optionally loaded file. Use this to override per runner environment
 EnvironmentFile=-{{ runner_base_dir }}/runner_unit-%i.env
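
The interaction between Restart=always and RestartPreventExitStatus=199 can be sketched as a small decision function. This is a simplified model for illustration, not systemd's implementation: it covers only the `always`, `on-failure`, and `no` policies and ignores termination by signal. The function name and defaults are hypothetical.

```python
def should_restart(
    exit_status: int,
    restart: str = "always",
    prevent_statuses: frozenset = frozenset({199}),
) -> bool:
    """Decide whether the service is restarted after it exits.

    Simplified model: statuses listed in RestartPreventExitStatus=
    suppress the automatic restart regardless of the Restart= policy.
    """
    if exit_status in prevent_statuses:
        return False
    if restart == "always":
        return True
    if restart == "on-failure":
        return exit_status != 0
    return False  # restart == "no"


if __name__ == "__main__":
    for status in (0, 1, 199):
        print(status, should_restart(status))
```

With the unit as configured in this commit, an exit with status 199 leaves the service stopped instead of restarting it again, while any other exit still triggers a restart.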
