
Commit 04fa939

Merge pull request #108 from udx/supervisord-config-improvements
supervisord program improvement: Ensure child processes are terminated on stop/restart
2 parents 85bbe3f + 2053203 commit 04fa939

5 files changed: +53 -31 lines changed


.github/workflows/release.yml

Lines changed: 2 additions & 2 deletions
@@ -39,7 +39,7 @@ jobs:
  driver: docker-container

  - name: Install GitVersion
- uses: gittools/actions/gitversion/setup@v4.1.0
+ uses: gittools/actions/gitversion/setup@v4.2.0
  with:
  versionSpec: "6.1.0"

@@ -48,7 +48,7 @@ jobs:

  - name: Determine Version
  id: gitversion
- uses: gittools/actions/gitversion/execute@v4.1.0
+ uses: gittools/actions/gitversion/execute@v4.2.0
  with:
  useConfigFile: true
  configFilePath: ci/git-version.yml

Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -76,16 +76,16 @@ RUN echo $TZ > /etc/timezone && \
  # Install yq (architecture-aware)
  RUN ARCH=$(uname -m) && \
  if [ "$ARCH" = "x86_64" ]; then ARCH="amd64"; elif [ "$ARCH" = "aarch64" ]; then ARCH="arm64"; fi && \
- curl -sL https://github.com/mikefarah/yq/releases/download/v4.48.1/yq_linux_${ARCH}.tar.gz | tar xz && \
+ curl -sL https://github.com/mikefarah/yq/releases/download/v4.48.2/yq_linux_${ARCH}.tar.gz | tar xz && \
  mv yq_linux_${ARCH} /usr/bin/yq && \
  rm -rf /tmp/*

  # Install Google Cloud SDK (architecture-aware)
  RUN ARCH=$(uname -m) && \
  if [ "$ARCH" = "x86_64" ]; then \
- curl -sSL "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-545.0.0-linux-x86_64.tar.gz" -o google-cloud-sdk.tar.gz; \
+ curl -sSL "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-547.0.0-linux-x86_64.tar.gz" -o google-cloud-sdk.tar.gz; \
  elif [ "$ARCH" = "aarch64" ]; then \
- curl -sSL "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-545.0.0-linux-arm.tar.gz" -o google-cloud-sdk.tar.gz; \
+ curl -sSL "https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-547.0.0-linux-arm.tar.gz" -o google-cloud-sdk.tar.gz; \
  fi && \
  tar -xzf google-cloud-sdk.tar.gz && \
  ./google-cloud-sdk/install.sh -q && \

docs/services.md

Lines changed: 36 additions & 25 deletions
@@ -12,22 +12,24 @@ The UDX Worker uses `services.yaml` to define and manage multiple services. Each

  ## Configuration Structure

- | Field | Type | Required | Default | Description |
- |-------|------|----------|---------|-------------|
- | `kind` | string | Yes | - | Must be `workerService` |
- | `version` | string | Yes | - | Must be `udx.io/worker-v1/service` |
- | `services` | array | Yes | - | List of service definitions |
+ | Field | Type | Required | Default | Description |
+ | ---------- | ------ | -------- | ------- | ---------------------------------- |
+ | `kind` | string | Yes | - | Must be `workerService` |
+ | `version` | string | Yes | - | Must be `udx.io/worker-v1/service` |
+ | `services` | array | Yes | - | List of service definitions |

  ### Service Definition Fields

- | Field | Type | Required | Default | Description |
- |-------|------|----------|---------|-------------|
- | `name` | string | Yes | - | Unique service identifier |
- | `command` | string | Yes | - | Command to execute |
- | `ignore` | boolean | No | `false` | Skip service management |
- | `autostart` | boolean | No | `true` | Start on worker launch |
- | `autorestart` | boolean | No | `false` | Restart on failure |
- | `envs` | array | No | `[]` | Environment variables |
+ | Field | Type | Required | Default | Description |
+ | ------------- | ------- | -------- | ------- | ------------------------- |
+ | `name` | string | Yes | - | Unique service identifier |
+ | `command` | string | Yes | - | Command to execute |
+ | `ignore` | boolean | No | `false` | Skip service management |
+ | `autostart` | boolean | No | `true` | Start on worker launch |
+ | `autorestart` | boolean | No | `false` | Restart on failure |
+ | `envs` | array | No | `[]` | Environment variables |
+ | `stopasgroup` | boolean | No | `false` | Stop process group |
+ | `killasgroup` | boolean | No | `false` | Kill process group |

  ## Basic Example

@@ -53,9 +55,11 @@ kind: workerService
  version: udx.io/worker-v1/service
  services:
  - name: "api-server"
- command: "node api/server.js"
+ command: "npm start"
  autostart: true
  autorestart: true
+ stopasgroup: true
+ killasgroup: true
  envs:
  - "PORT=3000"
  - "NODE_ENV=production"
@@ -68,7 +72,7 @@ services:

  - name: "monitoring"
  command: "./monitor.sh"
- ignore: true # Temporarily disabled
+ ignore: true # Temporarily disabled
  ```

  ### Service with Complex Command
@@ -87,16 +91,19 @@ services:
  ## Best Practices

  1. **Service Naming**
+
  - Use descriptive, lowercase names
  - Separate words with hyphens
  - Keep names concise but meaningful

  2. **Command Definition**
+
  - Use absolute paths when possible
  - Quote commands with spaces or special characters
  - Consider using shell scripts for complex commands

  3. **Environment Variables**
+
  - Use uppercase for variable names
  - Group related variables together
  - Document required variables
@@ -110,39 +117,43 @@ services:

  When running `worker service list`, services show the following status indicators:

- | Symbol | Status | Description |
- |--------|---------|-------------|
- | ✅ | RUNNING | Service is running normally |
- | ⛔ | STOPPED | Service was stopped with `worker service stop` |
- | 💀 | FATAL | Service exited (any exit code) |
- | 🔄 | RETRY | Service is retrying (with autorestart: true) |
- | ⚠️ | STARTING | Service is starting up |
+ | Symbol | Status | Description |
+ | ------ | -------- | ---------------------------------------------- |
+ | ✅ | RUNNING | Service is running normally |
+ | ⛔ | STOPPED | Service was stopped with `worker service stop` |
+ | 💀 | FATAL | Service exited (any exit code) |
+ | 🔄 | RETRY | Service is retrying (with autorestart: true) |
+ | ⚠️ | STARTING | Service is starting up |

  ### Service Types

  1. **Long-running Services**
+
  ```yaml
  - name: "web-server"
  command: "python server.py"
- autorestart: true # Restarts on exit
+ autorestart: true # Restarts on exit
  ```
+
  - Shows as RUNNING (✅) while active
  - Shows as RETRY (🔄) then FATAL (💀) if it keeps failing

  2. **One-shot Tasks**
  ```yaml
  - name: "setup"
  command: "./setup.sh"
- autorestart: false # Runs once
+ autorestart: false # Runs once
  ```
  - Use `worker service stop` for clean completion (⛔)
  - Otherwise shows as FATAL (💀) after exit

  Note: Exit codes (0 or non-zero) don't affect the final status. What matters is:
+
  - Whether the service keeps running (RUNNING ✅)
  - How it stops (STOPPED ⛔ vs FATAL 💀)

  Example:
+
  ```bash
  # Long-running service and one-shot task
  $ worker service list
@@ -171,4 +182,4 @@ worker service logs web-server

  # Restart a service
  worker service restart web-server
- ```
+ ```

etc/configs/supervisor/program.conf

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,10 @@ autorestart=${autorestart}
  startretries=${startretries}
  startsecs=1 # Consider process started if it runs for at least 1 second

+ # Kill process group
+ stopasgroup=${stopasgroup}
+ killasgroup=${killasgroup}
+
  # Logging settings
  stdout_logfile=/var/log/supervisor/${process_name}.out.log
  stdout_logfile_maxbytes=50MB

lib/process_manager.sh

Lines changed: 8 additions & 1 deletion
@@ -61,6 +61,11 @@ parse_service_info() {
  autorestart=$(echo "$service_json" | jq -r '.autorestart // "false"')
  # Ensure 'envs' defaults to an empty array if not specified
  envs=$(echo "$service_json" | jq -r '.envs // [] | join(",")')
+
+ # Use 'false' as default for 'stopasgroup' if not specified
+ stopasgroup=$(echo "$service_json" | jq -r '.stopasgroup // "false"')
+ # Use 'false' as default for 'killasgroup' if not specified
+ killasgroup=$(echo "$service_json" | jq -r '.killasgroup // "false"')

  # Ignore the service if 'ignore' is set to "true"
  if [[ "$ignore" == "true" ]]; then
@@ -95,7 +100,9 @@ parse_service_info() {
  s|\${autostart}|$autostart|g; \
  s|\${autorestart}|$autorestart|g; \
  s|\${startretries}|$startretries|g; \
- s|\${envs}|$envs|g" "$PROGRAM_TEMPLATE_FILE" >> "$FINAL_CONFIG"
+ s|\${envs}|$envs|g; \
+ s|\${stopasgroup}|$stopasgroup|g; \
+ s|\${killasgroup}|$killasgroup|g" "$PROGRAM_TEMPLATE_FILE" >> "$FINAL_CONFIG"
  }

  # Function to check if services are active
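
The substitution step in this diff can be tried in isolation. The following is a minimal sketch, not the repository's code: `render_program` and the inline three-line template are hypothetical stand-ins for `parse_service_info` and `etc/configs/supervisor/program.conf`, and the jq-based defaults are approximated here with shell parameter defaults so the sketch has no dependency beyond `sed`.

```shell
#!/usr/bin/env bash
# Sketch only: render_program and the inline template are hypothetical
# stand-ins for parse_service_info and program.conf from the diff.

render_program() {
  local process_name="$1"
  # Approximate the jq defaults from the diff: both flags fall back to "false".
  local stopasgroup="${2:-false}"
  local killasgroup="${3:-false}"

  # \$ keeps each placeholder literal inside double quotes, so sed matches the
  # text ${name} in the template and replaces it with the variable's value.
  sed "s|\${process_name}|$process_name|g; \
       s|\${stopasgroup}|$stopasgroup|g; \
       s|\${killasgroup}|$killasgroup|g" <<'EOF'
[program:${process_name}]
stopasgroup=${stopasgroup}
killasgroup=${killasgroup}
EOF
}

# Flags set explicitly, as in the "api-server" example in docs/services.md.
render_program api-server true true

# Flags omitted: both settings render as "false".
render_program monitoring
```

With both flags set to `true`, the rendered fragment asks supervisord to signal the whole process group on stop, which is what lets a wrapper command such as `npm start` take its child processes down with it.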
