Conversation
Walkthrough

[Hushed, measured tone] Here we observe the introduction of an entirely new uptime-service microservice—a carefully constructed ecosystem comprising configuration management, Prometheus metrics collection, periodic health probing, and HTTP server endpoints, orchestrated through a specialized Docker build workflow and deployed via GitHub Actions.

Changes

Sequence Diagram(s)

sequenceDiagram
participant main as index.js
participant config as config.js
participant server as server.js
participant metrics as metrics.js
participant prober as prober.js
participant endpoint as External Endpoint
participant prometheus as Prometheus Client
main->>config: loadConfig()
config-->>main: config (validated)
main->>server: startMetricsServer(port)
server->>server: Create Express app
server-->>main: server listening
main->>prober: startProbing(config)
Note over prober: Initial probe + periodic interval
loop Every PROBE_INTERVAL
prober->>endpoint: HTTP request with timeout
alt Success
endpoint-->>prober: response (status, time)
prober->>metrics: recordSuccess(name, endpoint, time)
metrics->>prometheus: Update service_up, response_time, timestamp
else Timeout/Failure
prober->>metrics: recordFailure(name, endpoint)
metrics->>prometheus: Update service_up, timestamp
end
end
Note over prober: SIGINT/SIGTERM received
prober->>prober: Clear interval & exit gracefully
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
ts/uptime-service/bun.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
.github/workflows/build-and-push-uptime-service.yml (1 hunks)
ts/pnpm-workspace.yaml (1 hunks)
ts/uptime-service/.env.example (1 hunks)
ts/uptime-service/.gitignore (1 hunks)
ts/uptime-service/Dockerfile (1 hunks)
ts/uptime-service/README.md (1 hunks)
ts/uptime-service/package.json (1 hunks)
ts/uptime-service/src/config.js (1 hunks)
ts/uptime-service/src/index.js (1 hunks)
ts/uptime-service/src/metrics.js (1 hunks)
ts/uptime-service/src/prober.js (1 hunks)
ts/uptime-service/src/server.js (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-16T11:14:22.103Z
Learnt from: Marcin-Radecki
Repo: Cardinal-Cryptography/blanksquare-monorepo PR: 48
File: tee/docker/Dockerfile:9-10
Timestamp: 2025-09-16T11:14:22.103Z
Learning: The tee/docker/Dockerfile currently runs as root without tini init system - this is a security hardening opportunity that can be addressed in a future PR by installing tini, creating a non-root user, and updating the ENTRYPOINT.
Applied to files:
ts/uptime-service/Dockerfile
🧬 Code graph analysis (5)
ts/uptime-service/src/prober.js (2)
ts/uptime-service/src/metrics.js (2)
recordSuccess(38-44)
recordFailure(49-54)
ts/uptime-service/src/config.js (2)
endpoints(21-21)
config(7-12)
ts/uptime-service/src/server.js (1)
ts/uptime-service/src/metrics.js (2)
register(8-8)
register(8-8)
ts/uptime-service/src/metrics.js (1)
ts/uptime-service/src/prober.js (1)
responseTime(28-28)
ts/uptime-service/src/config.js (1)
ts/uptime-service/src/index.js (1)
config(16-16)
ts/uptime-service/src/index.js (3)
ts/uptime-service/src/config.js (2)
config(7-12)
loadConfig(5-84)
ts/uptime-service/src/server.js (1)
startMetricsServer(11-64)
ts/uptime-service/src/prober.js (1)
startProbing(78-110)
🪛 Hadolint (2.14.0)
ts/uptime-service/Dockerfile
[warning] 19-19: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
[info] 19-19: Avoid additional packages by specifying --no-install-recommends
(DL3015)
⏰ Context from checks skipped due to timeout of 360000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: Rust crates checks / Run tests for crates/shielder-scheduler-common
- GitHub Check: Rust crates checks / Run tests for crates/shielder-prover-common
- GitHub Check: Rust crates checks / Run tests for crates/shielder-scheduler-server
- GitHub Check: Rust crates checks / Run e2e tests
- GitHub Check: Rust crates checks / Run tests for crates/shielder-prover-server
- GitHub Check: Rust crates checks / Run tests for crates/shielder-account
- GitHub Check: Rust crates checks / Run tests for crates/powers-of-tau
- GitHub Check: Rust crates checks / Run tests for crates/shielder-scheduler-tee
- GitHub Check: Rust crates checks / Run tests for crates/shielder-cli
- GitHub Check: Rust crates checks / Run tests for crates/shielder-setup
- GitHub Check: Rust crates checks / Run tests for crates/shielder-circuits
- GitHub Check: Rust crates checks / Run tests for crates/shielder-contract
- GitHub Check: Rust crates checks / Run tests for crates/halo2-verifier
- GitHub Check: Rust crates checks / Run tests for crates/shielder_bindings
- GitHub Check: Rust crates checks / Run lints and format for Rust workspace
- GitHub Check: Typescript modules checks / Run lints and tests
- GitHub Check: Typescript modules checks / Run E2E tests on local anvil node (nonzero fee)
- GitHub Check: Typescript modules checks / Run E2E tests on local anvil node (zero fee)
- GitHub Check: Measure gas usage / Measure gas and verifier code size
EXPOSE 9615

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD bun run -e "fetch('http://localhost:9615/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"
Observe the port-shy service revealing a mismatch. From our hide, the container broadcasts and probes port 9615, yet loadConfig leaves the creature nesting on 9090 by default. Without setting PORT, every healthcheck will cry “unhealthy.” Please align the habitat by exporting the expected port inside the image or updating the runtime default.
+# Ensure the runtime listens on the probed port
+ENV PORT=9615
# Expose metrics port
EXPOSE 9615

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Ensure the runtime listens on the probed port
ENV PORT=9615

EXPOSE 9615

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD bun run -e "fetch('http://localhost:9615/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"
🤖 Prompt for AI Agents
In ts/uptime-service/Dockerfile around lines 35 to 39, the HEALTHCHECK and
EXPOSE use port 9615 while the app defaults to port 9090, causing healthchecks
to fail; fix by making the image and runtime agree — either export the expected
PORT (e.g., add ENV PORT=9615) so the service listens on 9615, or change the
EXPOSE and HEALTHCHECK to 9090 to match the app default; apply only one approach
and ensure the chosen port is documented in the container runtime config.
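To see the mismatch concretely, the fallback behavior described above can be sketched standalone (the `effectivePort` helper is illustrative, not part of the service):

```javascript
// Illustrative helper mirroring the loadConfig default described above:
// without PORT set, the app binds 9090 while the container probes 9615.
function effectivePort(env) {
  return Number.parseInt(env.PORT ?? "9090", 10);
}

console.log(effectivePort({}));               // 9090 -> healthcheck on 9615 fails
console.log(effectivePort({ PORT: "9615" })); // 9615 -> matches EXPOSE/HEALTHCHECK
```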
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD bun run -e "fetch('http://localhost:9615/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"
Behold the healthcheck’s misfire. Here the script calls bun run -e, but Bun’s evaluative call is the solitary bun -e, leaving this probe to stumble before it begins. Switching to bun -e lets the sentinel report health as intended.
- CMD bun run -e "fetch('http://localhost:9615/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"
+ CMD bun -e "fetch('http://localhost:9615/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"

📝 Committable suggestion
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD bun -e "fetch('http://localhost:9615/health').then(r => r.ok ? process.exit(0) : process.exit(1)).catch(() => process.exit(1))"
🤖 Prompt for AI Agents
In ts/uptime-service/Dockerfile around lines 38 to 39, the HEALTHCHECK uses "bun
run -e" which is incorrect for evaluating inline JS with Bun; change the command
to use "bun -e" (i.e., replace "bun run -e" with "bun -e") so the healthcheck
executes the inline fetch expression correctly and returns proper exit codes.
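For reference, the status-to-exit-code mapping the inline script performs can be isolated as a plain function and checked without a running server (the `healthExitCode` name is illustrative):

```javascript
// Maps a fetch-style response to the exit code the healthcheck expects:
// 0 when the response is OK (2xx), 1 on any failure or rejected fetch.
function healthExitCode(response) {
  return response && response.ok ? 0 : 1;
}

console.log(healthExitCode({ ok: true }));  // 0 -> container healthy
console.log(healthExitCode({ ok: false })); // 1 -> container unhealthy
console.log(healthExitCode(null));          // 1 (fetch rejected)
```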
  port: parseInt(process.env.PORT || "9090", 10),
  probeInterval: parseInt(process.env.PROBE_INTERVAL || "30000", 10),
  timeout: parseInt(process.env.TIMEOUT || "5000", 10),
  endpoints: []
};

// Parse and validate endpoints
if (!process.env.ENDPOINTS) {
  console.error("ERROR: ENDPOINTS environment variable is required");
  process.exit(1);
}

try {
  const endpoints = JSON.parse(process.env.ENDPOINTS);

  if (!Array.isArray(endpoints)) {
    throw new Error("ENDPOINTS must be a JSON array");
  }

  if (endpoints.length === 0) {
    throw new Error("ENDPOINTS array cannot be empty");
  }

  // Validate each endpoint
  config.endpoints = endpoints.map((endpoint, index) => {
    if (!endpoint.name || typeof endpoint.name !== "string") {
      throw new Error(
        `Endpoint at index ${index} missing required 'name' field`
      );
    }
    if (!endpoint.url || typeof endpoint.url !== "string") {
      throw new Error(
        `Endpoint at index ${index} missing required 'url' field`
      );
    }

    return {
      name: endpoint.name,
      url: endpoint.url,
      method: endpoint.method || "GET",
      expectedStatus: endpoint.expectedStatus || 200
    };
  });

  // Check for duplicate names
  const names = config.endpoints.map((e) => e.name);
  const duplicates = names.filter(
    (name, index) => names.indexOf(name) !== index
  );
  if (duplicates.length > 0) {
    throw new Error(
      `Duplicate endpoint names found: ${duplicates.join(", ")}`
    );
  }
} catch (error) {
  console.error("ERROR: Failed to parse ENDPOINTS:", error.message);
  process.exit(1);
}

// Validate config values
if (config.port < 1 || config.port > 65535) {
  console.error("ERROR: PORT must be between 1 and 65535");
  process.exit(1);
}

if (config.probeInterval < 1000) {
  console.error("ERROR: PROBE_INTERVAL must be at least 1000ms");
  process.exit(1);
}

if (config.timeout < 100) {
  console.error("ERROR: TIMEOUT must be at least 100ms");
  process.exit(1);
}
A NaN slinks through the underbrush, evading validation. When this habitat receives something like PORT=mystery, parseInt produces NaN, and the later bounds checks at Lines 68-80 never stir—comparisons with NaN are always false. The service then wanders onward clutching NaN into app.listen, which will crash, and into setInterval, which collapses to a zero-delay stampede. Similar ghosts can haunt PROBE_INTERVAL and TIMEOUT. Invite firm sentries with explicit Number.isNaN checks before these values migrate further.
- const config = {
- port: parseInt(process.env.PORT || "9090", 10),
- probeInterval: parseInt(process.env.PROBE_INTERVAL || "30000", 10),
- timeout: parseInt(process.env.TIMEOUT || "5000", 10),
- endpoints: []
- };
+ const config = {
+ port: Number.parseInt(process.env.PORT ?? "9090", 10),
+ probeInterval: Number.parseInt(process.env.PROBE_INTERVAL ?? "30000", 10),
+ timeout: Number.parseInt(process.env.TIMEOUT ?? "5000", 10),
+ endpoints: []
+ };
+
+ if (Number.isNaN(config.port)) {
+ console.error("ERROR: PORT must be a number");
+ process.exit(1);
+ }
+
+ if (Number.isNaN(config.probeInterval)) {
+ console.error("ERROR: PROBE_INTERVAL must be a number");
+ process.exit(1);
+ }
+
+ if (Number.isNaN(config.timeout)) {
+ console.error("ERROR: TIMEOUT must be a number");
+ process.exit(1);
+ }

🤖 Prompt for AI Agents
In ts/uptime-service/src/config.js around lines 8 to 81, the numeric env parses
for PORT, PROBE_INTERVAL and TIMEOUT use parseInt but never check for NaN, so
invalid values (e.g. PORT=mystery) bypass later bounds checks; after parsing
each value (port, probeInterval, timeout) validate with Number.isNaN and if any
is NaN log a clear error (including the env name and raw value) and call
process.exit(1); keep the existing bounds checks afterwards so valid numbers are
still range-validated.
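As a standalone illustration of why the explicit guard matters (the `parseEnvInt` helper is hypothetical, not from the service):

```javascript
// NaN compares false against everything, so a bounds check alone never
// catches it: (NaN < 1 || NaN > 65535) evaluates to false.
function parseEnvInt(name, raw, fallback) {
  const value = Number.parseInt(raw ?? fallback, 10);
  if (Number.isNaN(value)) {
    throw new Error(`${name} must be a number, got: ${raw}`);
  }
  return value;
}

console.log(Number.NaN < 1 || Number.NaN > 65535); // false -> range check stays silent
console.log(parseEnvInt("PORT", "9615", "9090"));   // 9615
console.log(parseEnvInt("PORT", undefined, "9090")); // 9090 (fallback)

try {
  parseEnvInt("PORT", "mystery", "9090");
} catch (e) {
  console.log(e.message); // PORT must be a number, got: mystery
}
```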
try {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeout);

  const response = await fetch(endpoint.url, {
    method: endpoint.method,
    signal: controller.signal,
    headers: {
      "User-Agent": "uptime-service/1.0"
    }
  });

  clearTimeout(timeoutId);

  const endTime = Date.now();
  const responseTime = (endTime - startTime) / 1000; // Convert to seconds

  // Check if status code matches expected
  if (response.status === endpoint.expectedStatus) {
    recordSuccess(endpoint.name, endpoint.url, responseTime);
    console.log(
      `[${new Date().toISOString()}] ✓ ${endpoint.name} - UP (${
        response.status
      }, ${responseTime.toFixed(3)}s)`
    );
    return true;
  } else {
    recordFailure(endpoint.name, endpoint.url);
    console.log(
      `[${new Date().toISOString()}] ✗ ${endpoint.name} - DOWN (status: ${
        response.status
      }, expected: ${endpoint.expectedStatus})`
    );
    return false;
  }
} catch (error) {
  recordFailure(endpoint.name, endpoint.url);

  let errorMessage = error.message;
  if (error.name === "AbortError") {
    errorMessage = "timeout";
  }

  console.log(
    `[${new Date().toISOString()}] ✗ ${
      endpoint.name
    } - DOWN (${errorMessage})`
  );
  return false;
🧹 Nitpick | 🔵 Trivial
A lingering timeout rustles the canopy. When a probe fails before the allotted interval, the timeout trap remains set, only to spring harmlessly later and abort an already-fallen fetch. Over time, these ghostly timers clutter the understory. Hoist the controller and timeout handle, then clear it in a finally block so every path tidies the terrain.
- try {
- const controller = new AbortController();
- const timeoutId = setTimeout(() => controller.abort(), timeout);
+ const controller = new AbortController();
+ let timeoutId;
+
+ try {
+ timeoutId = setTimeout(() => controller.abort(), timeout);
const response = await fetch(endpoint.url, {
method: endpoint.method,
signal: controller.signal,
headers: {
"User-Agent": "uptime-service/1.0"
}
});
-
- clearTimeout(timeoutId);
+ clearTimeout(timeoutId);
@@
- } catch (error) {
+ } catch (error) {
recordFailure(endpoint.name, endpoint.url);
@@
- }
+ } finally {
+ if (timeoutId) {
+ clearTimeout(timeoutId);
+ }
+ }

📝 Committable suggestion
const controller = new AbortController();
let timeoutId;

try {
  timeoutId = setTimeout(() => controller.abort(), timeout);
  const response = await fetch(endpoint.url, {
    method: endpoint.method,
    signal: controller.signal,
    headers: {
      "User-Agent": "uptime-service/1.0"
    }
  });
  clearTimeout(timeoutId);

  const endTime = Date.now();
  const responseTime = (endTime - startTime) / 1000; // Convert to seconds

  // Check if status code matches expected
  if (response.status === endpoint.expectedStatus) {
    recordSuccess(endpoint.name, endpoint.url, responseTime);
    console.log(
      `[${new Date().toISOString()}] ✓ ${endpoint.name} - UP (${
        response.status
      }, ${responseTime.toFixed(3)}s)`
    );
    return true;
  } else {
    recordFailure(endpoint.name, endpoint.url);
    console.log(
      `[${new Date().toISOString()}] ✗ ${endpoint.name} - DOWN (status: ${
        response.status
      }, expected: ${endpoint.expectedStatus})`
    );
    return false;
  }
} catch (error) {
  recordFailure(endpoint.name, endpoint.url);

  let errorMessage = error.message;
  if (error.name === "AbortError") {
    errorMessage = "timeout";
  }

  console.log(
    `[${new Date().toISOString()}] ✗ ${
      endpoint.name
    } - DOWN (${errorMessage})`
  );
  return false;
} finally {
  if (timeoutId) {
    clearTimeout(timeoutId);
  }
}
🤖 Prompt for AI Agents
In ts/uptime-service/src/prober.js around lines 13 to 61, the AbortController
and timeout handle are created inside the try so the timeout may still fire
after the fetch has completed; hoist the controller and timeoutId declarations
above the try, assign the timeout inside the try before calling fetch, and add a
finally block that checks if timeoutId is set and calls clearTimeout(timeoutId)
(and optionally controller.abort() only when needed) so every execution path
clears the timer and avoids lingering timeouts.
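The hoist-and-finally pattern can be demonstrated independently of fetch; a minimal sketch (the `withTimeout` wrapper is illustrative, not the service's code):

```javascript
// Hoist the controller and timer handle outside the try so a finally
// block can always clear the timer, even when the awaited work rejects
// before the timeout fires. No stray timer is left behind.
async function withTimeout(work, timeoutMs) {
  const controller = new AbortController();
  let timeoutId;
  try {
    timeoutId = setTimeout(() => controller.abort(), timeoutMs);
    return await work(controller.signal);
  } finally {
    if (timeoutId) {
      clearTimeout(timeoutId); // runs on success and early failure alike
    }
  }
}

// A probe that fails immediately still cleans up its timer.
withTimeout(() => Promise.reject(new Error("connection refused")), 5000)
  .catch((e) => console.log("probe failed:", e.message));
```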
Summary by CodeRabbit
Release Notes
New Features
Chores