Skip to content

Commit eb60006

Browse files
committed
docs: [#224] add issue specification for HTTP tracker health check fix
- Documents the problem: wrong endpoint URL and warnings instead of errors - Specifies correct endpoint: /health_check (not /api/health_check) - Includes implementation plan and acceptance criteria - Related to parent epic #216 and task #220
1 parent 1b540fd commit eb60006

File tree

5 files changed

+232
-48
lines changed

5 files changed

+232
-48
lines changed

docs/console-commands.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -632,9 +632,9 @@ torrust-tracker-deployer run <environment>
632632
2. **Tracker API Health** (required) - Tests external accessibility of HTTP API
633633
- Endpoint: `http://<vm-ip>:1212/api/health_check`
634634
- Validates service functionality AND firewall configuration
635-
3. **HTTP Tracker Health** (optional) - Tests external accessibility of HTTP tracker
636-
- Endpoint: `http://<vm-ip>:7070/api/health_check`
637-
- Warning only if check fails (not all versions have endpoint)
635+
3. **HTTP Tracker Health** (required) - Tests external accessibility of HTTP tracker
636+
- Endpoint: `http://<vm-ip>:7070/health_check`
637+
- Validates HTTP tracker is accessible externally
638638

639639
**Use Cases**:
640640

Lines changed: 181 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,181 @@
1+
# Fix HTTP Tracker Health Check Endpoint
2+
3+
**Issue**: #224
4+
**Parent Epic**: #216 - Release and Run Commands
5+
**Related**: #220 - Tracker Slice Release and Run Commands
6+
7+
## Overview
8+
9+
The HTTP tracker health check validation is using the wrong endpoint URL (`/api/health_check`) and producing warnings instead of errors when validation fails. The correct endpoint is `/health_check` (without the `/api` prefix), and validation failures should be treated as errors to ensure proper deployment validation.
10+
11+
## Goals
12+
13+
- [x] Fix HTTP tracker health check endpoint URL from `/api/health_check` to `/health_check`
14+
- [x] Convert HTTP tracker validation warnings to errors (make it a required check)
15+
- [x] Include the attempted URL in error messages for better debugging
16+
- [x] Update documentation to reflect the correct endpoint and validation behavior
17+
18+
## 🏗️ Architecture Requirements
19+
20+
**DDD Layer**: Infrastructure
21+
**Module Path**: `src/infrastructure/external_validators/`
22+
**Pattern**: External Validator (Remote Action)
23+
24+
### Module Structure Requirements
25+
26+
- [x] Follow DDD layer separation (see [docs/codebase-architecture.md](../docs/codebase-architecture.md))
27+
- [x] Respect dependency flow rules (infrastructure can depend on domain)
28+
- [x] Use appropriate module organization (see [docs/contributing/module-organization.md](../docs/contributing/module-organization.md))
29+
30+
### Architectural Constraints
31+
32+
- [x] External validators run from test runner/deployment machine (not via SSH)
33+
- [x] Error handling follows project conventions (see [docs/contributing/error-handling.md](../docs/contributing/error-handling.md))
34+
- [x] Validation failures must be actionable with clear troubleshooting steps
35+
36+
### Anti-Patterns to Avoid
37+
38+
- ❌ Using warnings for required validation checks
39+
- ❌ Generic error messages without context (e.g., missing attempted URL)
40+
- ❌ Incorrect endpoint URLs that don't match actual service endpoints
41+
42+
## Specifications
43+
44+
### Current Issue
45+
46+
The HTTP tracker health check in `src/infrastructure/external_validators/running_services.rs` has two problems:
47+
48+
1. **Wrong URL**: Uses `/api/health_check` but should use `/health_check`
49+
2. **Warning vs Error**: Logs warnings on failure instead of returning errors
50+
51+
```rust
52+
// Current (incorrect) implementation
53+
let url = format!("http://{server_ip}:{port}/api/health_check");
54+
// ... logs warning on failure, doesn't propagate error
55+
```
56+
57+
### Expected Behavior
58+
59+
```rust
60+
// Correct implementation
61+
let url = format!("http://{server_ip}:{port}/health_check");
62+
let response = reqwest::get(&url).await.map_err(|e| {
63+
RemoteActionError::ValidationFailed {
64+
action_name: self.name().to_string(),
65+
message: format!(
66+
"HTTP Tracker external health check failed for URL '{url}': {e}. \n\
67+
Check that HTTP tracker is running and firewall allows port {port}."
68+
),
69+
}
70+
})?;
71+
72+
if !response.status().is_success() {
73+
return Err(RemoteActionError::ValidationFailed {
74+
action_name: self.name().to_string(),
75+
message: format!(
76+
"HTTP Tracker returned HTTP {} for URL '{url}': {}. Service may not be healthy.",
77+
response.status(),
78+
response.status().canonical_reason().unwrap_or("Unknown")
79+
),
80+
});
81+
}
82+
```
83+
84+
### Affected Files
85+
86+
1. **`src/infrastructure/external_validators/running_services.rs`**
87+
88+
- Fix URL in `check_http_tracker_external` method
89+
- Change return type from `()` to `Result<(), RemoteActionError>`
90+
- Convert warning logs to error returns with URL context
91+
92+
2. **`docs/user-guide/commands/run.md`**
93+
94+
- Update HTTP tracker health check endpoint documentation
95+
- Change status from "optional" to "required"
96+
97+
3. **`docs/console-commands.md`**
98+
99+
- Update health check endpoint documentation
100+
101+
4. **`src/application/command_handlers/test/handler.rs`**
102+
- Update documentation comments to reflect correct endpoint
103+
104+
## Implementation Plan
105+
106+
### Phase 1: Fix HTTP Tracker Health Check Method (15 minutes)
107+
108+
- [x] Update `check_http_tracker_external` method URL to use `/health_check`
109+
- [x] Change return type from `()` to `Result<(), RemoteActionError>`
110+
- [x] Add URL to error messages for debugging
111+
- [x] Update `validate_external_accessibility` to propagate errors with `?` operator
112+
113+
### Phase 2: Update Documentation (10 minutes)
114+
115+
- [x] Update module documentation header to reflect correct endpoint
116+
- [x] Update `docs/user-guide/commands/run.md` endpoint URL
117+
- [x] Update `docs/console-commands.md` health check description
118+
- [x] Update `src/application/command_handlers/test/handler.rs` comments
119+
120+
### Phase 3: Testing and Verification (10 minutes)
121+
122+
- [x] Run pre-commit checks: `./scripts/pre-commit.sh`
123+
- [x] Verify linting passes (especially clippy and rustfmt)
124+
- [ ] Run manual E2E test to verify HTTP tracker health check works
125+
- [ ] Confirm error messages display the attempted URL
126+
127+
## Acceptance Criteria
128+
129+
> **Note for Contributors**: These criteria define what the PR reviewer will check. Use this as your pre-review checklist before submitting the PR to minimize back-and-forth iterations.
130+
131+
**Quality Checks**:
132+
133+
- [x] Pre-commit checks pass: `./scripts/pre-commit.sh`
134+
135+
**Task-Specific Criteria**:
136+
137+
- [x] HTTP tracker health check uses `/health_check` endpoint (not `/api/health_check`)
138+
- [x] Validation failures return errors instead of logging warnings
139+
- [x] Error messages include the attempted URL for debugging
140+
- [x] `check_http_tracker_external` returns `Result<(), RemoteActionError>`
141+
- [x] `validate_external_accessibility` properly propagates HTTP tracker errors
142+
- [x] Documentation accurately reflects the correct endpoint URL
143+
- [x] Documentation describes HTTP tracker check as "required" not "optional"
144+
- [ ] Manual E2E test confirms health check works with deployed tracker
145+
146+
## Related Documentation
147+
148+
- [docs/codebase-architecture.md](../codebase-architecture.md) - Project architecture
149+
- [docs/contributing/error-handling.md](../contributing/error-handling.md) - Error handling conventions
150+
- [docs/user-guide/commands/run.md](../user-guide/commands/run.md) - Run command documentation
151+
- [src/infrastructure/external_validators/running_services.rs](../../src/infrastructure/external_validators/running_services.rs) - Implementation file
152+
153+
## Notes
154+
155+
### Why This Fix Is Important
156+
157+
1. **Correct Endpoint**: The Torrust Tracker HTTP tracker exposes health checks at `/health_check`, not `/api/health_check`. Using the wrong endpoint causes all validations to fail with 404 errors.
158+
159+
2. **Error Visibility**: Converting warnings to errors ensures that deployment failures are properly reported and CI/CD pipelines can detect issues. Warnings can be easily missed in logs.
160+
161+
3. **Debugging Support**: Including the attempted URL in error messages helps developers quickly identify configuration issues or endpoint mismatches.
162+
163+
### Verification Steps
164+
165+
After implementation, verify with:
166+
167+
```bash
168+
# Run pre-commit checks
169+
./scripts/pre-commit.sh
170+
171+
# Deploy tracker and test health check
172+
cargo run -- create e2e-test --env-file envs/e2e-test.json
173+
cargo run -- provision e2e-test
174+
cargo run -- configure e2e-test
175+
cargo run -- release e2e-test
176+
cargo run -- run e2e-test # Should succeed without warnings
177+
178+
# Manual health check test
179+
INSTANCE_IP=$(cat data/e2e-test/environment.json | jq -r '.Running.context.runtime_outputs.instance_ip')
180+
curl http://$INSTANCE_IP:7070/health_check # Should return success
181+
```

docs/user-guide/commands/run.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -261,9 +261,9 @@ The `run` command performs external health checks to validate deployment:
261261
- Validates both service functionality AND firewall rules
262262

263263
3. **HTTP Tracker Health Checks** (external, direct HTTP)
264-
- Tests `http://<vm-ip>:<port>/api/health_check` for **all configured HTTP trackers**
265-
- **Optional checks** - logs warnings if not accessible, but doesn't fail deployment
266-
- Some tracker versions may not have health endpoints
264+
265+
- Tests `http://<vm-ip>:<port>/health_check` for **all configured HTTP trackers**
266+
- **Required checks** - deployment fails if not accessible
267267
- If you configure multiple HTTP trackers (e.g., ports 7070, 7071, 7072), all will be validated
268268

269269
If external checks fail but Docker shows services running, it indicates a firewall or network configuration issue.

src/application/command_handlers/test/handler.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@
1313
//! 1. **Docker Compose Service Status** - Verifies containers are running
1414
//! 2. **External Health Checks** - Tests service accessibility from outside the VM:
1515
//! - Tracker API health endpoint (required): `http://<vm-ip>:<api-port>/api/health_check`
16-
//! - HTTP Tracker health endpoint (optional): `http://<vm-ip>:<tracker-port>/api/health_check`
16+
//! - HTTP Tracker health endpoint (required): `http://<vm-ip>:<tracker-port>/health_check`
1717
//!
1818
//! ## Why External-Only Validation?
1919
//!

src/infrastructure/external_validators/running_services.rs

Lines changed: 44 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@
2525
//! This validator performs external validation only (from test runner to VM):
2626
//! - Verifies Docker Compose services are running (via SSH: `docker compose ps`)
2727
//! - Tests tracker API health endpoint from outside: `http://<vm-ip>:1212/api/health_check`
28-
//! - Tests HTTP tracker health endpoint from outside: `http://<vm-ip>:7070/api/health_check`
28+
//! - Tests HTTP tracker health endpoint from outside: `http://<vm-ip>:7070/health_check`
2929
//!
3030
//! **Validation Philosophy**: External checks are a superset of internal checks.
3131
//! If external validation passes, it proves:
@@ -88,7 +88,7 @@
8888
//! 2. Check that containers are in "running" status (not "exited" or "restarting")
8989
//! 3. Verify health check status if configured (e.g., "healthy")
9090
//! 4. Test tracker API from outside: HTTP GET to `http://<vm-ip>:1212/api/health_check`
91-
//! 5. Test HTTP tracker from outside: HTTP GET to `http://<vm-ip>:7070/api/health_check`
91+
//! 5. Test HTTP tracker from outside: HTTP GET to `http://<vm-ip>:7070/health_check`
9292
//!
9393
//! This ensures end-to-end validation:
9494
//! - Services are deployed and running
@@ -97,7 +97,7 @@
9797
9898
use std::net::IpAddr;
9999
use std::path::PathBuf;
100-
use tracing::{info, instrument, warn};
100+
use tracing::{info, instrument};
101101

102102
use crate::adapters::ssh::SshConfig;
103103
use crate::infrastructure::remote_actions::{RemoteAction, RemoteActionError};
@@ -172,9 +172,9 @@ impl RunningServicesValidator {
172172
self.check_tracker_api_external(server_ip, tracker_api_port)
173173
.await?;
174174

175-
// Check all HTTP trackers
175+
// Check all HTTP trackers (required)
176176
for port in http_tracker_ports {
177-
self.check_http_tracker_external(server_ip, *port).await;
177+
self.check_http_tracker_external(server_ip, *port).await?;
178178
}
179179

180180
Ok(())
@@ -233,12 +233,16 @@ impl RunningServicesValidator {
233233
Ok(())
234234
}
235235

236-
/// Check HTTP tracker accessibility from outside the VM (optional check)
236+
/// Check HTTP tracker accessibility from outside the VM
237237
///
238238
/// # Arguments
239239
/// * `server_ip` - IP address of the server
240240
/// * `port` - Port for the HTTP tracker health endpoint
241-
async fn check_http_tracker_external(&self, server_ip: &IpAddr, port: u16) {
241+
async fn check_http_tracker_external(
242+
&self,
243+
server_ip: &IpAddr,
244+
port: u16,
245+
) -> Result<(), RemoteActionError> {
242246
info!(
243247
action = "running_services_validation",
244248
check = "http_tracker_external",
@@ -247,41 +251,40 @@ impl RunningServicesValidator {
247251
"Checking HTTP tracker health endpoint (external from test runner)"
248252
);
249253

250-
let url = format!("http://{server_ip}:{port}/api/health_check"); // DevSkim: ignore DS137138
251-
match reqwest::get(&url).await {
252-
Ok(response) if response.status().is_success() => {
253-
info!(
254-
action = "running_services_validation",
255-
check = "http_tracker_external",
256-
port = port,
257-
status = "success",
258-
validation_type = "external",
259-
"HTTP Tracker is accessible from outside (external check passed)"
260-
);
261-
}
262-
Ok(response) => {
263-
warn!(
264-
action = "running_services_validation",
265-
check = "http_tracker_external",
266-
port = port,
267-
status = "warning",
268-
validation_type = "external",
269-
http_status = %response.status(),
270-
"HTTP Tracker returned non-success status - may not have health endpoint"
271-
);
272-
}
273-
Err(e) => {
274-
warn!(
275-
action = "running_services_validation",
276-
check = "http_tracker_external",
277-
port = port,
278-
status = "warning",
279-
validation_type = "external",
280-
error = %e,
281-
"HTTP Tracker health check failed - may not have health endpoint or still starting"
282-
);
283-
}
254+
let url = format!("http://{server_ip}:{port}/health_check"); // DevSkim: ignore DS137138
255+
let response =
256+
reqwest::get(&url)
257+
.await
258+
.map_err(|e| RemoteActionError::ValidationFailed {
259+
action_name: self.name().to_string(),
260+
message: format!(
261+
"HTTP Tracker external health check failed for URL '{url}': {e}. \n\
262+
Check that HTTP tracker is running and firewall allows port {port}."
263+
),
264+
})?;
265+
266+
if !response.status().is_success() {
267+
return Err(RemoteActionError::ValidationFailed {
268+
action_name: self.name().to_string(),
269+
message: format!(
270+
"HTTP Tracker returned HTTP {} for URL '{url}': {}. Service may not be healthy.",
271+
response.status(),
272+
response.status().canonical_reason().unwrap_or("Unknown")
273+
),
274+
});
284275
}
276+
277+
info!(
278+
action = "running_services_validation",
279+
check = "http_tracker_external",
280+
port = port,
281+
status = "success",
282+
validation_type = "external",
283+
url = %url,
284+
"HTTP Tracker is accessible from outside (external check passed)"
285+
);
286+
287+
Ok(())
285288
}
286289
}
287290

0 commit comments

Comments
 (0)