Skip to content

Commit 90152ec

Browse files
committed
docs: add comprehensive error handling guide with thiserror best practices
- Create new docs/contributing/error-handling.md with detailed principles: * Clarity: Clear, unambiguous error messages * Context & Traceability: Sufficient context + source error preservation * Actionability: Tell users exactly how to fix problems * Thiserror usage: Prefer #[error] attributes + #[source] for error chains * Enum vs anyhow guidance: Use enums by default, anyhow only when justified - Link error guide from docs/development-principles.md - Add new Essential Rule #5 to .github/copilot-instructions.md - Update project-words.txt with new terminology This establishes comprehensive error handling standards aligned with the project's observability and actionability principles.
1 parent c3416ed commit 90152ec

File tree

4 files changed

+341
-0
lines changed

4 files changed

+341
-0
lines changed

.github/copilot-instructions.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -79,6 +79,8 @@ These principles should guide all development decisions, code reviews, and featu
7979

8080
4. **Before working with Tera templates**: Read [`docs/contributing/templates.md`](../docs/contributing/templates.md) for correct variable syntax - use `{{ variable }}` not `{ { variable } }`. Tera template files have the `.tera` extension.
8181

82+
5. **When handling errors in code**: Read [`docs/contributing/error-handling.md`](../docs/contributing/error-handling.md) for error handling principles. Prefer explicit enum errors over anyhow for better pattern matching and user experience. Make errors clear, include sufficient context for traceability, and ensure they are actionable with specific fix instructions.
83+
8284
## 🧪 Build & Test
8385

8486
- **Lint**: `cargo run --bin linter all` (comprehensive - tests stable & nightly toolchains)
Lines changed: 335 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,335 @@
1+
# Error Handling Guide
2+
3+
This guide establishes principles and best practices for error handling in the Torrust Tracker Deploy application, aligning with our [development principles](../development-principles.md) of observability, traceability, and actionability.
4+
5+
## 🎯 Core Principles
6+
7+
### 1. Clarity - No Ambiguity
8+
9+
Errors must be clear and unambiguous. Users should immediately understand what went wrong without needing to guess or interpret vague messages.
10+
11+
#### ✅ Good Examples
12+
13+
```rust
14+
// Clear, specific error
15+
pub enum ConfigError {
16+
FileNotFound { path: PathBuf },
17+
InvalidFormat { line: usize, reason: String },
18+
MissingRequiredField { field: String },
19+
}
20+
```
21+
22+
#### ❌ Bad Examples
23+
24+
```rust
25+
// Vague, unclear error
26+
return Err("Something went wrong".into());
27+
return Err("Invalid input".into());
28+
return Err("Error".into());
29+
```
30+
31+
### 2. Context and Traceability
32+
33+
Errors should include sufficient context to make them easy to diagnose and fix. This aligns with our **Observability** and **Traceability** principles.
34+
35+
#### Context Requirements
36+
37+
- **What**: What operation was being performed?
38+
- **Where**: Which component, file, or resource was involved?
39+
- **When**: Under what conditions did this occur?
40+
- **Why**: What caused the error?
41+
42+
#### ✅ Good Examples
43+
44+
```rust
45+
pub enum ProvisioningError {
46+
InstanceAlreadyExists {
47+
instance_name: String,
48+
provider: String
49+
},
50+
InvalidConfiguration {
51+
config_path: PathBuf,
52+
validation_errors: Vec<String>
53+
},
54+
NetworkTimeout {
55+
operation: String,
56+
timeout_duration: Duration,
57+
endpoint: String
58+
},
59+
}
60+
61+
impl Display for ProvisioningError {
62+
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
63+
match self {
64+
Self::InstanceAlreadyExists { instance_name, provider } => {
65+
write!(f, "Instance '{}' already exists in {} provider. Use a different name or remove the existing instance.", instance_name, provider)
66+
},
67+
Self::InvalidConfiguration { config_path, validation_errors } => {
68+
write!(f, "Configuration file '{}' is invalid:\n{}",
69+
config_path.display(),
70+
validation_errors.join("\n"))
71+
},
72+
Self::NetworkTimeout { operation, timeout_duration, endpoint } => {
73+
write!(f, "Network timeout during '{}' operation to '{}' after {:?}. Check network connectivity and endpoint availability.",
74+
operation, endpoint, timeout_duration)
75+
},
76+
}
77+
}
78+
}
79+
```
80+
81+
### 3. Actionability
82+
83+
Errors should be actionable, telling users how to fix them when possible. This aligns with our **Actionability** principle.
84+
85+
#### Requirements
86+
87+
- **Clear Instructions**: Provide specific steps to resolve the issue
88+
- **Command Examples**: Include exact commands when applicable
89+
- **Alternative Solutions**: Offer multiple approaches when possible
90+
- **Next Steps**: Guide users on what to do next
91+
92+
#### ✅ Good Examples
93+
94+
```rust
95+
pub enum DeploymentError {
96+
SshKeyNotFound {
97+
expected_path: PathBuf,
98+
alternative_paths: Vec<PathBuf>
99+
},
100+
InsufficientPermissions {
101+
required_permissions: String,
102+
current_user: String
103+
},
104+
}
105+
106+
impl Display for DeploymentError {
107+
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
108+
match self {
109+
Self::SshKeyNotFound { expected_path, alternative_paths } => {
110+
write!(f, "SSH key not found at '{}'. \n\nTo fix this:\n1. Generate a new SSH key: ssh-keygen -t rsa -b 4096 -f '{}'\n2. Or specify an existing key path using --ssh-key-path\n3. Alternative locations checked: {}",
111+
expected_path.display(),
112+
expected_path.display(),
113+
alternative_paths.iter().map(|p| format!("'{}'", p.display())).collect::<Vec<_>>().join(", "))
114+
},
115+
Self::InsufficientPermissions { required_permissions, current_user } => {
116+
write!(f, "User '{}' lacks required permissions: {}\n\nTo fix this:\n1. Add your user to the required group: sudo usermod -aG lxd {}\n2. Log out and log back in to apply group changes\n3. Or run with sudo (not recommended for regular use)",
117+
current_user, required_permissions, current_user)
118+
},
119+
}
120+
}
121+
}
122+
```
123+
124+
## 🛠️ Implementation Guidelines
125+
126+
### Prefer Explicit Enum Errors with Thiserror
127+
128+
Use explicit, strongly-typed enum errors with the `thiserror` crate instead of generic string-based errors for better pattern matching, automatic `Display` implementation, and error handling.
129+
130+
#### Using Thiserror for Error Definitions
131+
132+
The `thiserror` crate provides powerful macros for defining structured errors:
133+
134+
```rust
135+
use thiserror::Error;
136+
137+
#[derive(Debug, Error)]
138+
pub enum TemplateManagerError {
139+
#[error("Failed to create templates directory: {path}")]
140+
DirectoryCreation {
141+
path: String,
142+
#[source]
143+
source: std::io::Error,
144+
},
145+
146+
#[error("Template file not found in embedded resources: {relative_path}")]
147+
TemplateNotFound { relative_path: String },
148+
149+
#[error("Invalid UTF-8 in embedded template: {relative_path}")]
150+
InvalidUtf8 {
151+
relative_path: String,
152+
#[source]
153+
source: std::str::Utf8Error,
154+
},
155+
156+
#[error("Failed to write template file: {path}")]
157+
TemplateWrite {
158+
path: String,
159+
#[source]
160+
source: std::io::Error,
161+
},
162+
}
163+
```
164+
165+
#### Key Benefits of Thiserror
166+
167+
- **Automatic `Display`**: The `#[error("...")]` attribute generates the `Display` implementation
168+
- **Source Error Chaining**: The `#[source]` attribute maintains error chains for traceability
169+
- **Structured Data**: Variant fields provide context and enable pattern matching
170+
- **Better Debugging**: Automatic `Error` trait implementation with proper source chaining
171+
172+
#### When to Use Enum Errors
173+
174+
- **Recoverable errors**: When callers can take specific actions based on error type
175+
- **Domain-specific errors**: When errors are specific to your application domain
176+
- **Pattern matching**: When you need different handling for different error cases
177+
- **API boundaries**: When errors cross module or crate boundaries
178+
179+
```rust
180+
// ✅ Preferred: Explicit enum with context using thiserror
181+
#[derive(Debug, Error)]
182+
pub enum ConfigValidationError {
183+
#[error("Missing required field '{field}' in section '{section}'")]
184+
MissingField { field: String, section: String },
185+
186+
#[error("Field '{field}' has invalid value '{value}', expected: {expected}")]
187+
InvalidValue { field: String, value: String, expected: String },
188+
189+
#[error("Cannot access configuration file: {path}")]
190+
FileAccessError {
191+
path: PathBuf,
192+
#[source]
193+
source: std::io::Error
194+
},
195+
}
196+
197+
// Allows for precise error handling
198+
match config_result {
199+
Err(ConfigValidationError::MissingField { field, section }) => {
200+
println!("Please add '{}' to the '{}' section", field, section);
201+
},
202+
Err(ConfigValidationError::InvalidValue { field, value, expected }) => {
203+
println!("Field '{}' has invalid value '{}', expected: {}", field, value, expected);
204+
},
205+
// ... handle other cases
206+
}
207+
```
208+
209+
### Source Error Preservation for Traceability
210+
211+
Always include source errors when wrapping underlying errors. This maintains the full error chain and enables complete traceability.
212+
213+
#### ✅ Good: Preserve source errors
214+
215+
```rust
216+
#[derive(Debug, Error)]
217+
pub enum DeploymentError {
218+
#[error("Failed to read SSH key from {path}")]
219+
SshKeyRead {
220+
path: PathBuf,
221+
#[source] // Preserves the original I/O error
222+
source: std::io::Error,
223+
},
224+
225+
#[error("Network operation '{operation}' failed")]
226+
NetworkError {
227+
operation: String,
228+
#[source] // Preserves the original network error
229+
source: reqwest::Error,
230+
},
231+
}
232+
233+
// Usage with source preservation
234+
fn read_ssh_key(path: &Path) -> Result<String, DeploymentError> {
235+
std::fs::read_to_string(path)
236+
.map_err(|source| DeploymentError::SshKeyRead {
237+
path: path.to_path_buf(),
238+
source, // Original error is preserved
239+
})
240+
}
241+
```
242+
243+
#### ❌ Bad: Losing source information
244+
245+
```rust
246+
// Don't do this - loses original error information
247+
fn read_ssh_key(path: &Path) -> Result<String, DeploymentError> {
248+
std::fs::read_to_string(path)
249+
.map_err(|e| DeploymentError::SshKeyRead {
250+
path: path.to_path_buf(),
251+
// Missing source - loses traceability!
252+
})
253+
}
254+
```
255+
256+
### When to Use Anyhow
257+
258+
Use `anyhow` only when the caller cannot do anything meaningful to handle different error types, even with pattern matching.
259+
260+
#### Appropriate Use Cases
261+
262+
- **Utility functions**: Internal helpers where specific error handling isn't needed
263+
- **One-way operations**: When all errors should bubble up unchanged
264+
- **Rapid prototyping**: Early development phases (but migrate to enums later)
265+
- **External library integration**: When wrapping third-party errors temporarily
266+
267+
```rust
268+
// ✅ Acceptable: Internal utility where caller can't handle specifics
269+
fn read_and_parse_internal_cache() -> anyhow::Result<CacheData> {
270+
let content = std::fs::read_to_string("cache.json")?;
271+
let data = serde_json::from_str(&content)?;
272+
Ok(data)
273+
}
274+
275+
// ✅ Public API should use enums
276+
pub fn load_user_config(path: &Path) -> Result<Config, ConfigError> {
277+
let content = std::fs::read_to_string(path)
278+
.map_err(|e| ConfigError::FileAccess { path: path.to_path_buf(), source: e })?;
279+
280+
serde_json::from_str(&content)
281+
.map_err(|e| ConfigError::InvalidJson { path: path.to_path_buf(), source: e })
282+
}
283+
```
284+
285+
### Error Conversion Patterns
286+
287+
```rust
288+
// Convert anyhow errors to domain errors at boundaries
289+
impl From<anyhow::Error> for DeploymentError {
290+
fn from(err: anyhow::Error) -> Self {
291+
DeploymentError::InternalError {
292+
message: err.to_string(),
293+
context: format!("{:?}", err.chain().collect::<Vec<_>>())
294+
}
295+
}
296+
}
297+
```
298+
299+
## 📋 Error Review Checklist
300+
301+
When reviewing error handling code, verify:
302+
303+
- [ ] **Clarity**: Is the error message clear and unambiguous?
304+
- [ ] **Context**: Does the error include sufficient context (what, where, when, why)?
305+
- [ ] **Actionability**: Does the error tell users how to fix it?
306+
- [ ] **Type Safety**: Are domain-specific errors using enums instead of strings?
307+
- [ ] **Thiserror Usage**: Are enum errors using `thiserror` with proper `#[error]` attributes?
308+
- [ ] **Source Preservation**: Are source errors preserved with `#[source]` for traceability?
309+
- [ ] **Pattern Matching**: Can callers handle different error cases appropriately?
310+
- [ ] **Consistency**: Does the error follow project conventions?
311+
312+
## 🔗 Related Documentation
313+
314+
- [Development Principles](../development-principles.md) - Core principles including observability and actionability
315+
- [Contributing Guidelines](./README.md) - General contribution guidelines
316+
- [Testing Conventions](./testing.md) - Testing error scenarios
317+
318+
## 📚 Examples in Codebase
319+
320+
Look for error handling examples in:
321+
322+
- `src/domain/` - Domain-specific error enums
323+
- `src/infrastructure/` - Infrastructure and I/O error handling
324+
- `src/application/commands/` - Command-level error aggregation
325+
326+
## 🚀 Best Practices Summary
327+
328+
1. **Design errors first**: Consider error cases during API design
329+
2. **Use enums by default**: Only use `anyhow` when justified
330+
3. **Include context**: Always provide enough information for diagnosis
331+
4. **Make errors actionable**: Tell users how to fix the problem
332+
5. **Test error paths**: Write tests for error scenarios
333+
6. **Document error types**: Document when and why specific errors occur
334+
335+
By following these guidelines, we ensure that errors in the Torrust Tracker Deploy application are not just informative, but truly helpful in guiding users toward solutions.

docs/development-principles.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,8 @@ When users encounter problems or edge cases, the system must provide clear guida
9191
3. **Error Message Review**: Have error messages reviewed by non-technical users
9292
4. **Documentation First**: Document edge cases and their handling
9393

94+
For detailed guidance on implementing these principles in error handling, see the [Error Handling Guide](./contributing/error-handling.md).
95+
9496
### For Code Reviews
9597

9698
- Verify that new code includes appropriate logging

project-words.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ cloudinit
88
connrefused
99
containerd
1010
cpus
11+
Crossplane
1112
dearmor
1213
debootstrap
1314
distutils
@@ -56,6 +57,7 @@ pipefail
5657
prereq
5758
println
5859
publickey
60+
Pulumi
5961
pytest
6062
RAII
6163
Repomix

0 commit comments

Comments
 (0)