Commit c89765d

Create starting-phoenix-runners.md
1 parent 15d3f41 commit c89765d

misc/starting-phoenix-runners.md

Lines changed: 110 additions & 0 deletions
# Launching Phoenix Runners

The Phoenix runners were repeatedly failing due to a network error.
Spencer fixed it via [this PR](https://github.com/MFlowCode/MFC/pull/933) and by routing the runners' traffic through a SOCKS5 proxy on each login node that hosts a runner.
The procedure is documented here for Spencer or his next of kin.

__The runners are started via the following process__

1. Log in to login node `<x>` via `ssh login-phoenix-rh9-<x>.pace.gatech.edu`. `<x>` can be `1` through `6` on Phoenix.
   * Detour: Make sure no stray `ssh` daemons are sitting around: `pkill -9 sshd`.
   * You can probably keep your terminal alive by using `fuser -k -9 ~/nohup.out`, which kills (signal 9) whatever process is writing to that no-hangup file (the daemon we care about).
2. Log back in to the same login node, because you may have just nuked your session.
   * Detour: Make sure stray runners on that login node are dead (one-liner): `pkill -9 -f 'run.sh|Runner.listener|Runner.helper'` (`pkill` patterns are already extended regular expressions).
   * If cautious, check that no runner processes are left over: run `top`, press `u`, type your user name, and press return.
3. Execute from your home directory: `nohup ssh -N -D 1080 -vvv login-phoenix-rh9-<x>.pace.gatech.edu &`, replacing `<x>` with the login node number.
   * This starts a SOCKS5 proxy on `localhost:1080` that the runner's traffic is tunneled through. Its output goes to `~/nohup.out`.
4. Navigate to your runner's directory (or create a runner directory if you need one).
   * Right now they are in Spencer's `scratch/mfc-runners/action-runner-<runner#>`.
5. Run the alias `start_runner`, which dumps its output to `~/runner.out`.
   * If you don't have this alias yet, add it to your `.bashrc` or similar and source that file:
```bash
alias start_runner=' \
http_proxy="socks5://localhost:1080" \
https_proxy="socks5://localhost:1080" \
no_proxy="localhost,127.0.0.1,github.com,api.github.com,pipelines.actions.githubusercontent.com,alive.github.com,pypi.org,files.pythonhosted.org,fftw.org,www.fftw.org" \
NO_PROXY="localhost,127.0.0.1,github.com,api.github.com,pipelines.actions.githubusercontent.com,alive.github.com,pypi.org,files.pythonhosted.org,fftw.org,www.fftw.org" \
RUNNER_DEBUG=1 \
ACTIONS_STEP_DEBUG=1 \
GITHUB_ACTIONS_RUNNER_PREFER_IP_FAMILY=ipv4 \
DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_TIME=00:01:00 \
DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_INTERVAL=00:00:20 \
DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_RETRYCOUNT=5 \
nohup ./run.sh > ~/runner.out 2>&1 &'
```
6. You're done.

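
Steps 1 and 3 can be wrapped in a pair of small shell helpers. This is a minimal sketch, not part of the original setup: the function names `phoenix_login_host` and `start_proxy` are made up for illustration, and it assumes login nodes `1` through `6`, as stated in step 1.

```bash
# Hypothetical helper: print the Phoenix login hostname for node number $1.
phoenix_login_host() {
    case "$1" in
        [1-6]) printf 'login-phoenix-rh9-%s.pace.gatech.edu\n' "$1" ;;
        *)
            printf 'login node must be 1 through 6, got: %s\n' "$1" >&2
            return 1
            ;;
    esac
}

# Hypothetical helper: start the SOCKS5 proxy from step 3 for node $1.
# Run it from your home directory so nohup writes to ~/nohup.out.
start_proxy() {
    host=$(phoenix_login_host "$1") || return 1
    nohup ssh -N -D 1080 -vvv "$host" &
}
```

For example, `start_proxy 3`, run from your home directory on login node 3, starts the same proxy as the command in step 3.
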
### For inquisitive minds

__Why the `start_runner` alias?__

1. `alias start_runner='…'`
   Defines a shell alias named `start_runner`. Whenever you run `start_runner`, the shell executes everything between the single quotes as if you’d typed it at the prompt.

2. `http_proxy="socks5://localhost:1080"`
   Sets the `http_proxy` environment variable so that HTTP traffic from the runner is sent through the SOCKS5 proxy listening on `localhost:1080`.

3. `https_proxy="socks5://localhost:1080"`
   Tells HTTPS-aware tools to use that same local SOCKS5 proxy for HTTPS requests.

4. `no_proxy="localhost,127.0.0.1,github.com,api.github.com,pipelines.actions.githubusercontent.com,alive.github.com,pypi.org,files.pythonhosted.org,fftw.org,www.fftw.org"`
   Lists hosts and domains that should bypass the proxy entirely. Commonly used for internal or high-volume endpoints where you don’t want proxy overhead.

5. `NO_PROXY="localhost,127.0.0.1,github.com,api.github.com,pipelines.actions.githubusercontent.com,alive.github.com,pypi.org,files.pythonhosted.org,fftw.org,www.fftw.org"`
   Same list as `no_proxy`; some programs only check the uppercase `NO_PROXY` variable.

6. `RUNNER_DEBUG=1`
   Enables debug-level logging in the GitHub Actions runner itself, so you’ll see more verbose internal messages in its logs.

7. `ACTIONS_STEP_DEBUG=1`
   Turns on step-level debug logging for the actions you invoke, which is handy if you need to trace exactly what each action is doing under the hood.

8. `GITHUB_ACTIONS_RUNNER_PREFER_IP_FAMILY=ipv4`
   Forces the runner to resolve DNS names to IPv4 addresses only. Useful if your proxy or network has spotty IPv6 support.

9. `DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_TIME=00:01:00`
   For .NET-based tasks: sets the TCP keepalive idle time to 1 minute (after 1 minute of idle, the first keepalive probe is sent).

10. `DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_INTERVAL=00:00:20`
    If the first keepalive probe gets no response, wait 20 seconds between subsequent probes.

11. `DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_RETRYCOUNT=5`
    If probes continue to go unanswered, retry up to 5 times before declaring the connection dead. With these values, a silent connection is dropped after roughly 60 + 5 × 20 = 160 seconds.

12. `nohup ./run.sh > ~/runner.out 2>&1 &`
    - `nohup … &` runs `./run.sh` in the background and makes it immune to hangups (so it keeps running if you log out).
    - `> ~/runner.out` redirects **stdout** to the file `runner.out` in your home directory.
    - `2>&1` redirects **stderr** into the same file, so you get a combined log of everything the script prints.

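
The last line of the alias combines per-command environment variables with output redirection. The sketch below shows both mechanisms on a throwaway child process. `DEMO_PROXY` and `demo_prefix_and_redirect` are hypothetical names used only for this illustration, and `nohup` and `&` are left out so the result can be printed right away.

```bash
# Hypothetical demo: run a child command with a per-command variable
# and a combined stdout/stderr log, mirroring the shape of the alias.
demo_prefix_and_redirect() {
    log=$(mktemp)
    # DEMO_PROXY is set only in the environment of this one child process.
    # "> file" captures stdout; "2>&1" then points stderr at the same file.
    DEMO_PROXY="socks5://localhost:1080" \
    sh -c 'echo "stdout: $DEMO_PROXY"; echo "stderr: also captured" >&2' > "$log" 2>&1
    cat "$log"
    rm -f "$log"
}
```

Calling `demo_prefix_and_redirect` prints both lines, the one written to stdout and the one written to stderr, because `2>&1` sends stderr to the same file. In the alias, the same pattern is what makes `~/runner.out` a combined log.
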

__Why the extra ssh command?__

The `ssh -N -D 1080` command from step 3 opens the SOCKS5 proxy on `localhost:1080` that the alias settings point at: `-N` runs no remote command, `-D 1080` sets up dynamic port forwarding on local port 1080, and `-vvv` turns on verbose logging. The settings that rely on it:

1. `http_proxy="socks5://localhost:1080"`
   Routes all HTTP traffic through a local SOCKS5 proxy on port 1080.

2. `https_proxy="socks5://localhost:1080"`
   Routes all HTTPS traffic through the same proxy.

3. `no_proxy="localhost,127.0.0.1,github.com,api.github.com,pipelines.actions.githubusercontent.com,alive.github.com,pypi.org,files.pythonhosted.org,fftw.org,www.fftw.org"`
   Specifies hosts and domains that bypass the proxy entirely. The list includes hosts that MFC's CMake fetches from with `wget` (e.g., `fftw`) or another non-`git` command, and it allows `git clone` to work.

4. `NO_PROXY="localhost,127.0.0.1,github.com,api.github.com,pipelines.actions.githubusercontent.com,alive.github.com,pypi.org,files.pythonhosted.org,fftw.org,www.fftw.org"`
   Same bypass list for applications that only check the uppercase variable.

5. `RUNNER_DEBUG=1`
   Enables verbose internal logging in the GitHub Actions runner.

6. `GITHUB_ACTIONS_RUNNER_PREFER_IP_FAMILY=ipv4`
   Forces DNS resolution to IPv4 to avoid IPv6 issues.

7. `DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_TIME=00:01:00`
   (For .NET tasks) sends the first TCP keepalive probe after 1 minute of idle.

8. `DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_INTERVAL=00:00:20`
   Waits 20 seconds between subsequent TCP keepalive probes.

9. `DOTNET_SYSTEM_NET_SOCKETS_KEEPALIVE_RETRYCOUNT=5`
   Retries keepalive probes up to 5 times before closing the connection.

10. `nohup ./run.sh > ~/runner.out 2>&1 &`
    Runs `run.sh` in the background, immune to hangups, redirecting both stdout and stderr to `~/runner.out`.
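
To see which hosts the bypass list covers, the matching can be approximated in a few lines of shell. This is a sketch under an assumption: an entry matches the exact host name or any subdomain of it. Real tools (`curl`, `wget`, the runner itself) each implement their own matching rules and may differ, and `bypasses_proxy` is a hypothetical helper name.

```bash
# Hypothetical helper: return 0 if host $1 matches an entry in the
# comma-separated list $2, else 1.
# Assumption: an entry matches the exact host or any subdomain of it.
# The body runs in a subshell, so the IFS and globbing changes stay local.
bypasses_proxy() (
    host=$1
    list=$2
    IFS=,
    set -f  # disable globbing while the list is split on commas
    for entry in $list; do
        case "$host" in
            "$entry" | *."$entry") exit 0 ;;
        esac
    done
    exit 1
)
```

With the list from the alias stored in a variable, `bypasses_proxy www.fftw.org "$no_proxy"` succeeds, while a host that is not on the list, such as `example.org`, fails the check and would go through the SOCKS5 proxy.
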
