Commit 11767d9

fix: add docs and also tag non-JIT instances
1 parent 88cb58d commit 11767d9

2 files changed: +170 -0 lines changed
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "ec2:CreateTags",
            "Condition": {
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "ghr:github_runner_id"
                    ]
                },
                "StringEquals": {
                    "aws:ARN": "$${ec2:SourceInstanceARN}"
                }
            },
            "Effect": "Allow",
            "Resource": "arn:*:ec2:*:*:instance/*"
        }
    ]
}
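Read together, the two conditions appear intended to let a runner instance create tags only on itself, and only with the key `ghr:github_runner_id`. The sketch below is a simplified model of that evaluation, not the IAM engine. `create_tags_allowed` and its parameters are hypothetical names; `aws_arn` and `source_instance_arn` stand in for whatever IAM resolves for `aws:ARN` and `${ec2:SourceInstanceARN}`. The doubled `$$` in the file is presumably template escaping so that a literal `${...}` reaches IAM.

```python
from typing import Iterable

# The only tag key the statement permits.
ALLOWED_TAG_KEYS = frozenset({"ghr:github_runner_id"})


def create_tags_allowed(
    request_tag_keys: Iterable[str],
    aws_arn: str,
    source_instance_arn: str,
) -> bool:
    """Simplified model of the statement's Condition block (hypothetical helper)."""
    # ForAllValues:StringEquals on aws:TagKeys: every tag key in the request
    # must be in the allowed set. An empty set of keys also satisfies it.
    keys_ok = all(key in ALLOWED_TAG_KEYS for key in request_tag_keys)
    # StringEquals on aws:ARN: the value must equal the resolved
    # ${ec2:SourceInstanceARN} exactly (case-sensitive comparison).
    arn_ok = aws_arn == source_instance_arn
    # Multiple condition operators in one block are combined with AND.
    return keys_ok and arn_ok
```

Under this model, a request that adds any other tag key, or that comes from a different instance, is not matched by the `Allow` statement.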
Lines changed: 150 additions & 0 deletions
@@ -0,0 +1,150 @@
# GitHub Actions Runner Scale-Down State Diagram

<!-- --8<-- [start:mkdocs_scale_down_state_diagram] -->

The scale-down Lambda function runs on a schedule (every 5 minutes by default) to manage GitHub Actions runner instances. It cleans up in two phases: it first terminates confirmed orphaned instances, then evaluates active runners, keeping the desired idle capacity and removing instances that are no longer needed.

```mermaid
stateDiagram-v2
    [*] --> ScheduledExecution : Cron Trigger every 5 min

    ScheduledExecution --> Phase1_OrphanTermination : Start Phase 1

    state Phase1_OrphanTermination {
        [*] --> ListOrphanInstances : Query EC2 for ghr orphan true

        ListOrphanInstances --> CheckOrphanType : For each orphan

        state CheckOrphanType <<choice>>
        CheckOrphanType --> HasRunnerIdTag : Has ghr github runner id
        CheckOrphanType --> TerminateOrphan : No runner ID tag

        HasRunnerIdTag --> LastChanceCheck : Query GitHub API

        state LastChanceCheck <<choice>>
        LastChanceCheck --> ConfirmedOrphan : Offline and busy
        LastChanceCheck --> FalsePositive : Exists and not problematic

        ConfirmedOrphan --> TerminateOrphan
        FalsePositive --> RemoveOrphanTag

        TerminateOrphan --> NextOrphan : Continue processing
        RemoveOrphanTag --> NextOrphan

        NextOrphan --> CheckOrphanType : More orphans?
        NextOrphan --> Phase2_ActiveRunners : All processed
    }

    Phase1_OrphanTermination --> Phase2_ActiveRunners : Phase 1 Complete

    state Phase2_ActiveRunners {
        [*] --> ListActiveRunners : Query non-orphan EC2 instances

        ListActiveRunners --> GroupByOwner : Sort by owner and repo

        GroupByOwner --> ProcessOwnerGroup : For each owner

        state ProcessOwnerGroup {
            [*] --> SortByStrategy : Apply eviction strategy
            SortByStrategy --> ProcessRunner : Oldest first or newest first

            ProcessRunner --> QueryGitHub : Get GitHub runners for owner

            QueryGitHub --> MatchRunner : Find runner by instance ID suffix

            state MatchRunner <<choice>>
            MatchRunner --> FoundInGitHub : Runner exists in GitHub
            MatchRunner --> NotFoundInGitHub : Runner not in GitHub

            state FoundInGitHub {
                [*] --> CheckMinimumTime : Has minimum runtime passed?

                state CheckMinimumTime <<choice>>
                CheckMinimumTime --> TooYoung : Runtime less than minimum
                CheckMinimumTime --> CheckIdleQuota : Runtime greater than or equal to minimum

                TooYoung --> NextRunner

                state CheckIdleQuota <<choice>>
                CheckIdleQuota --> KeepIdle : Idle quota available
                CheckIdleQuota --> CheckBusyState : Quota full

                KeepIdle --> NextRunner

                state CheckBusyState <<choice>>
                CheckBusyState --> KeepBusy : Runner busy
                CheckBusyState --> TerminateIdle : Runner idle

                KeepBusy --> NextRunner
                TerminateIdle --> DeregisterFromGitHub
                DeregisterFromGitHub --> TerminateInstance
                TerminateInstance --> NextRunner
            }

            state NotFoundInGitHub {
                [*] --> CheckBootTime : Has boot time exceeded?

                state CheckBootTime <<choice>>
                CheckBootTime --> StillBooting : Boot time less than threshold
                CheckBootTime --> MarkOrphan : Boot time greater than or equal to threshold

                StillBooting --> NextRunner
                MarkOrphan --> TagAsOrphan : Set ghr orphan true
                TagAsOrphan --> NextRunner
            }

            NextRunner --> ProcessRunner : More runners in group?
            NextRunner --> NextOwnerGroup : Group complete
        }

        NextOwnerGroup --> ProcessOwnerGroup : More owner groups?
        NextOwnerGroup --> ExecutionComplete : All groups processed
    }

    Phase2_ActiveRunners --> ExecutionComplete : Phase 2 Complete

    ExecutionComplete --> [*] : Wait for next cron trigger

    note right of LastChanceCheck
        Uses ghr github runner id tag
        for precise GitHub API lookup
    end note

    note right of MatchRunner
        Matches GitHub runner name
        ending with EC2 instance ID
    end note

    note right of CheckMinimumTime
        Minimum running time in minutes
        (Linux: 5min, Windows: 15min)
    end note

    note right of CheckBootTime
        Runner boot time in minutes
        Default configuration value
    end note
```
<!-- --8<-- [end:mkdocs_scale_down_state_diagram] -->
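
Two steps inside `ProcessOwnerGroup` are easy to misread from the diagram alone: ordering instances by eviction strategy, and matching an EC2 instance to a GitHub runner by name suffix. The following is a minimal sketch of both, not the Lambda's actual code. The types `Ec2Instance` and `GitHubRunner`, the function names, and the strategy strings `"oldest_first"` and `"newest_first"` are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Ec2Instance:  # hypothetical shape of an EC2 runner instance
    instance_id: str
    launch_time: datetime


@dataclass(frozen=True)
class GitHubRunner:  # hypothetical shape of a runner returned by GitHub
    runner_id: int
    name: str
    busy: bool


def sort_by_strategy(instances: Sequence[Ec2Instance], strategy: str) -> List[Ec2Instance]:
    """Order instances for evaluation (strategy names are assumed)."""
    if strategy not in ("oldest_first", "newest_first"):
        raise ValueError(f"unknown eviction strategy: {strategy}")
    newest_first = strategy == "newest_first"
    return sorted(instances, key=lambda i: i.launch_time, reverse=newest_first)


def match_runner(instance: Ec2Instance, runners: Sequence[GitHubRunner]) -> Optional[GitHubRunner]:
    """Return the runner whose name ends with the instance ID, if any."""
    return next((r for r in runners if r.name.endswith(instance.instance_id)), None)
```

A `None` result from `match_runner` corresponds to the `NotFoundInGitHub` branch; any other result corresponds to `FoundInGitHub`.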

## Key Decision Points

| State | Condition | Action |
|-------|-----------|--------|
| **Orphan w/ Runner ID** | GitHub: offline + busy | Terminate (confirmed orphan) |
| **Orphan w/ Runner ID** | GitHub: exists + healthy | Remove orphan tag (false positive) |
| **Orphan w/o Runner ID** | Always | Terminate (no way to verify) |
| **Active Runner Found** | Runtime < minimum | Keep (too young) |
| **Active Runner Found** | Idle quota available | Keep as idle |
| **Active Runner Found** | Quota full + idle | Terminate + deregister |
| **Active Runner Found** | Quota full + busy | Keep running |
| **Active Runner Missing** | Boot time exceeded | Mark as orphan |
| **Active Runner Missing** | Still booting | Wait |
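
The table can be read as two small decision functions, one per phase. The sketch below follows the order of checks drawn in the diagram (minimum runtime, then idle quota, then busy state). The `Action` enum and both function names are hypothetical. It also assumes that any runner state other than offline and busy counts as healthy; the table does not say what happens when the runner ID no longer exists in GitHub, so that case is left out.

```python
from enum import Enum


class Action(Enum):  # hypothetical labels for the table's Action column
    TERMINATE = "terminate"
    REMOVE_ORPHAN_TAG = "remove orphan tag"
    KEEP = "keep"
    TERMINATE_AND_DEREGISTER = "terminate and deregister"
    MARK_ORPHAN = "mark as orphan"
    WAIT = "wait"


def decide_orphan(*, has_runner_id: bool, offline: bool = False, busy: bool = False) -> Action:
    """Phase 1: decide what to do with an instance tagged as orphan."""
    if not has_runner_id:
        return Action.TERMINATE  # no way to verify against GitHub
    if offline and busy:
        return Action.TERMINATE  # confirmed orphan
    # Assumption: every other state is treated as a false positive.
    return Action.REMOVE_ORPHAN_TAG


def decide_active(
    *,
    found_in_github: bool,
    runtime_minutes: float,
    minimum_minutes: float,
    idle_quota_available: bool,
    busy: bool,
    boot_time_exceeded: bool,
) -> Action:
    """Phase 2: decide what to do with a non-orphan instance."""
    if not found_in_github:
        return Action.MARK_ORPHAN if boot_time_exceeded else Action.WAIT
    if runtime_minutes < minimum_minutes:
        return Action.KEEP  # too young
    if idle_quota_available:
        return Action.KEEP  # kept as idle capacity
    if busy:
        return Action.KEEP  # quota full, but a job is running
    return Action.TERMINATE_AND_DEREGISTER  # quota full and idle
```

Keyword-only arguments keep call sites readable, since most inputs are booleans that are easy to swap by position.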

## Configuration Parameters

- **Cron Schedule**: `cron(*/5 * * * ? *)` (every 5 minutes)
- **Minimum Runtime**: Linux 5min, Windows 15min
- **Boot Timeout**: Configurable via `runner_boot_time_in_minutes`
- **Idle Config**: Per-environment configuration for desired idle runners
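
To show how the two time-based parameters map onto the diagram's comparisons, here is a minimal sketch that derives "too young" and "boot time exceeded" from an instance launch time. The helper names and the `MINIMUM_RUNNING_MINUTES` mapping are hypothetical. The boot timeout is passed in explicitly because this document does not state its default.

```python
from datetime import datetime, timedelta, timezone

# Minimum runtime per OS, as listed above (the dict itself is illustrative).
MINIMUM_RUNNING_MINUTES = {"linux": 5, "windows": 15}


def minutes_running(launch_time: datetime, now: datetime) -> float:
    """Elapsed minutes between launch and now."""
    return (now - launch_time).total_seconds() / 60


def is_too_young(launch_time: datetime, now: datetime, os_name: str) -> bool:
    """True while runtime is strictly below the minimum for the OS."""
    return minutes_running(launch_time, now) < MINIMUM_RUNNING_MINUTES[os_name]


def boot_time_exceeded(
    launch_time: datetime, now: datetime, runner_boot_time_in_minutes: int
) -> bool:
    """True once elapsed time is greater than or equal to the boot timeout."""
    return minutes_running(launch_time, now) >= runner_boot_time_in_minutes
```

The boundary cases follow the diagram's labels: a runner at exactly the minimum runtime is no longer too young, and an instance at exactly the boot timeout is marked as an orphan.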
