Commit 734772e

Copilot and GrantBirki committed
Implement production readiness and performance improvements
Co-authored-by: GrantBirki <[email protected]>
1 parent 49601be commit 734772e

6 files changed: +520 -26 lines changed

docs/reliability.md

Lines changed: 262 additions & 0 deletions
@@ -0,0 +1,262 @@
# Production Reliability and Performance Guide

This document outlines the reliability, performance, and security considerations for running the Hooks webhook server framework in production environments.

## 🔍 Security Considerations

### Dynamic Plugin Loading Security

The framework applies several security checks when loading plugins dynamically:

- **Class Name Validation**: All plugin class names are validated against a safe pattern (`/\A[A-Z][a-zA-Z0-9_]*\z/`)
- **Dangerous Class Blacklist**: System classes like `File`, `Dir`, `Kernel`, `Object`, `Process`, etc. are blocked from being loaded as plugins
- **Path Traversal Protection**: Plugin file paths are normalized and validated to prevent loading files outside designated directories
- **Safe Constant Resolution**: `Object.const_get` is called only after the class name has passed validation
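
The checks listed above might be combined roughly as follows. This is a minimal sketch, not the framework's actual code: `SAFE_CLASS_NAME`, `BLOCKED_CLASSES`, `safe_plugin_class`, and `inside_plugin_dir?` are hypothetical names, and the blocklist contains only the classes named in this guide.

```ruby
# Hypothetical sketch of the plugin validation steps; names are illustrative.
SAFE_CLASS_NAME = /\A[A-Z][a-zA-Z0-9_]*\z/
BLOCKED_CLASSES = %w[File Dir Kernel Object Process].freeze

# Resolve +name+ to a constant only after it passes the pattern and
# blocklist checks. Raises ArgumentError otherwise.
def safe_plugin_class(name)
  unless name.is_a?(String) && SAFE_CLASS_NAME.match?(name)
    raise ArgumentError, "invalid plugin class name: #{name.inspect}"
  end
  raise ArgumentError, "blocked plugin class name: #{name}" if BLOCKED_CLASSES.include?(name)

  Object.const_get(name)
end

# True when +path+, resolved relative to +plugin_dir+, stays inside it.
def inside_plugin_dir?(path, plugin_dir)
  root = File.expand_path(plugin_dir)
  full = File.expand_path(path, root)
  full.start_with?(root + File::SEPARATOR)
end
```

Because the pattern rejects `::`, a name such as `Foo::Bar` never reaches `Object.const_get`, and a relative path such as `../etc/passwd` fails the directory check once it has been expanded.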

### Request Processing Security

- **Request Size Limits**: Request body size is capped by the configurable `request_limit` option
- **JSON Parsing Protection**: JSON parsing applies limits to prevent JSON bombs:
  - Maximum nesting depth (configurable via `JSON_MAX_NESTING`, default: 20)
  - Maximum payload size before parsing (configurable via `JSON_MAX_SIZE`, default: 10MB)
  - Disabled object creation from JSON (`create_additions: false`)
  - Uses plain Hash/Array classes to prevent object injection
- **Header Validation**: Multiple header formats are handled with safe fallbacks and an optimized lookup order
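
To illustrate the nesting limit, the sketch below shows how the `max_nesting` option of Ruby's `json` library rejects a deeply nested payload. `parse_with_limits` is a hypothetical helper written for this example, not the framework's method; the defaults mirror the values listed above.

```ruby
require "json"

# Hypothetical helper: enforce a size cap, then parse with a nesting cap
# and with object creation from JSON disabled.
def parse_with_limits(body, max_nesting: 20, max_bytes: 10 * 1024 * 1024)
  raise ArgumentError, "JSON payload too large for parsing" if body.bytesize > max_bytes

  JSON.parse(body, max_nesting: max_nesting, create_additions: false)
end

# A shallow document parses normally.
parse_with_limits('{"a": {"b": [1, 2, 3]}}')

# 25 nested arrays exceed the cap of 20 and raise JSON::NestingError,
# which is a subclass of JSON::ParserError.
deeply_nested = ("[" * 25) + ("]" * 25)
begin
  parse_with_limits(deeply_nested)
rescue JSON::NestingError => e
  warn "rejected: #{e.message}"
end
```

Since `JSON::NestingError` inherits from `JSON::ParserError`, a caller that already rescues `JSON::ParserError` also catches nesting violations.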

## ⚡ Performance Optimizations

### Startup Performance

The framework uses several strategies to optimize startup time:

- **Explicit Module Loading**: Core modules are loaded with explicit `require_relative` calls rather than `Dir.glob` patterns, for better performance and security
- **Boot-time Plugin Loading**: All plugins are loaded once at startup rather than per-request
- **Plugin Caching**: Loaded plugins are cached in class-level registries for fast access
- **Sorted Directory Loading**: Plugin directories are processed in sorted order for consistent behavior

### Runtime Performance

- **Per-request Optimizations**:
  - Plugin instances are reused across requests
  - Request contexts use thread-local storage for efficient access
  - Handler instances are created per-request but classes are cached
  - Optimized header processing with common cases checked first

- **Memory Management**:
  - Plugin registries use hash-based lookups for O(1) access
  - Thread-local contexts are properly cleaned up after requests
  - Clear plugin loading separates concerns efficiently

- **Security Limits**:
  - Retry configuration includes bounds checking to prevent resource exhaustion
  - JSON parsing has built-in limits to prevent JSON bombs and memory attacks
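
The thread-local context and its cleanup can be pictured with the sketch below. It is an assumption about the general pattern, not the framework's implementation: the `RequestContext` module and the `:hooks_request_context` key are hypothetical.

```ruby
# Hypothetical sketch of a per-request context with guaranteed cleanup.
module RequestContext
  KEY = :hooks_request_context # assumed key name

  # Make +context+ visible for the duration of the block, then restore
  # the previous value even if the block raises.
  def self.with(context)
    previous = Thread.current[KEY]
    Thread.current[KEY] = context
    yield
  ensure
    Thread.current[KEY] = previous
  end

  # The context set by the innermost enclosing +with+ call, or nil.
  def self.current
    Thread.current[KEY]
  end
end

RequestContext.with({ request_id: "abc123" }) do
  RequestContext.current[:request_id] # available anywhere on this thread
end
```

Note that `Thread.current[:key]` is fiber-local in Ruby; `Thread#thread_variable_set` and `Thread#thread_variable_get` are the alternatives when a value must be shared across fibers on the same thread.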

### Recommended Production Configuration

```yaml
# Example production configuration
log_level: "info"           # Reduces debug overhead
request_limit: 1048576      # 1MB limit (adjust based on needs)
request_timeout: 30         # 30 second timeout
environment: "production"   # Disables debug features like backtraces
normalize_headers: true     # Consistent header processing
symbolize_payload: false    # Reduced memory usage for large payloads
```

### Security Environment Variables

Additional security limits can be configured via environment variables:

```bash
# JSON Security Limits
JSON_MAX_NESTING=20       # Maximum JSON nesting depth (default: 20)
JSON_MAX_SIZE=10485760    # Maximum JSON size before parsing (default: 10MB)

# Retry Safety Limits
DEFAULT_RETRY_SLEEP=1     # Seconds to sleep between retries, 0-300 (default: 1)
DEFAULT_RETRY_TRIES=10    # Number of retry attempts, 1-50 (default: 10)
RETRY_LOG_RETRIES=false   # Disable retry logging in production (default: true)
```
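
One way to apply the documented ranges is to clamp each value as it is read. The sketch below assumes clamping with a fallback to the default; the framework may instead reject out-of-range values, and `bounded_env_int` is a hypothetical helper.

```ruby
# Hypothetical helper: read an integer setting from +env+ (ENV or any Hash)
# and clamp it to +range+. Non-numeric values fall back to +default+.
def bounded_env_int(env, name, default:, range:)
  value = Integer(env.fetch(name, default.to_s), 10)
  value.clamp(range.min, range.max)
rescue ArgumentError
  default
end

retry_sleep = bounded_env_int(ENV, "DEFAULT_RETRY_SLEEP", default: 1, range: 0..300)
retry_tries = bounded_env_int(ENV, "DEFAULT_RETRY_TRIES", default: 10, range: 1..50)
```

With this approach a misconfigured `DEFAULT_RETRY_TRIES=100000` is reduced to 50 rather than allowing an effectively unbounded retry loop.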

## 🔧 Monitoring and Observability

### Health Check Endpoint

The built-in health endpoint (`/health`) reports status, version, uptime, a configuration checksum, and the number of loaded endpoints and plugins:

```json
{
  "status": "healthy",
  "timestamp": "2025-01-01T12:00:00Z",
  "version": "1.0.0",
  "uptime_seconds": 3600,
  "config_checksum": "abc123",
  "endpoints_loaded": 5,
  "plugins_loaded": 3
}
```

### Lifecycle Hooks for Monitoring

Use lifecycle plugins to add monitoring:

- **Request Metrics**: Track request counts, timing, and error rates
- **Error Reporting**: Capture and report exceptions with full context
- **Resource Monitoring**: Track memory usage, plugin load times, etc.

### Recommended Instrumentation

```ruby
# Example monitoring lifecycle plugin
require "time" # Time.parse is provided by the "time" standard library

class MonitoringLifecycle < Hooks::Plugins::Lifecycle
  def on_request(env)
    stats.increment("webhook.requests", {
      handler: env["hooks.handler"],
      endpoint: env["PATH_INFO"]
    })
  end

  def on_response(env, response)
    # Elapsed time in seconds since the request started
    processing_time = Time.now - Time.parse(env["hooks.start_time"])
    stats.timing("webhook.processing_time", processing_time * 1000, {
      handler: env["hooks.handler"]
    })
  end

  def on_error(exception, env)
    stats.increment("webhook.errors", {
      error_type: exception.class.name,
      handler: env["hooks.handler"]
    })

    failbot.report(exception, {
      request_id: env["hooks.request_id"],
      handler: env["hooks.handler"],
      endpoint: env["PATH_INFO"]
    })
  end
end
```

## 🚀 Production Deployment Best Practices

### Server Configuration

1. **Use Puma in Cluster Mode** for production:
   ```ruby
   # config/puma.rb
   workers ENV.fetch("WEB_CONCURRENCY", 2)
   threads_count = ENV.fetch("MAX_THREADS", 5)
   threads threads_count, threads_count
   preload_app!
   ```

2. **Configure Resource Limits**:
   - Set appropriate worker memory limits
   - Configure worker restart thresholds
   - Set connection pool sizes appropriately

3. **Environment Variables**:
   ```bash
   # Retry configuration
   DEFAULT_RETRY_TRIES=3       # Reduced from default 10
   DEFAULT_RETRY_SLEEP=1       # 1 second between retries
   RETRY_LOG_RETRIES=false     # Reduce log noise in production

   # Logging
   LOG_LEVEL=info              # Reduce debug overhead
   ```

### Container Considerations

```dockerfile
# Optimized production Dockerfile
FROM ruby:3.2-alpine AS builder
WORKDIR /app
COPY Gemfile* ./
RUN bundle install --deployment --without development test

FROM ruby:3.2-alpine
WORKDIR /app
COPY --from=builder /app/vendor ./vendor
COPY . .

# Security: Run as non-root user
RUN addgroup -g 1001 -S appuser && \
    adduser -S appuser -u 1001 -G appuser
USER appuser

# Health check (BusyBox wget is used because Alpine images do not ship curl by default)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget -q --spider http://localhost:3000/health || exit 1

EXPOSE 3000
CMD ["bundle", "exec", "puma", "-C", "config/puma.rb"]
```

## 🛡️ Security Hardening

### Input Validation

- **Payload Size Limits**: Always configure a `request_limit` appropriate for your use case
- **Timeout Configuration**: Set reasonable `request_timeout` values
- **Content Type Validation**: Implement strict content type checking if needed

### Authentication

- **HMAC Validation**: Always enable authentication for production endpoints
- **Secret Management**: Store webhook secrets in environment variables or a secure secret management system
- **Signature Validation**: Use timestamp-based signature validation to prevent replay attacks
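
As an illustration of timestamp-based signature validation, the sketch below signs `"<timestamp>.<body>"` with HMAC-SHA256 and rejects requests outside a tolerance window. The signing scheme, the 300-second tolerance, and the names `sign`, `valid_webhook?`, and `constant_time_equal?` are assumptions made for this example; they do not describe the framework's built-in HMAC plugin.

```ruby
require "openssl"

# Compare two strings without returning early on the first differing byte.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Hypothetical scheme: hex HMAC-SHA256 over "<timestamp>.<body>".
def sign(secret, timestamp, body)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, "#{timestamp}.#{body}")
end

# Accept the request only if the timestamp is recent and the signature matches.
def valid_webhook?(secret:, body:, timestamp:, signature:, now: Time.now.to_i, tolerance: 300)
  return false if (now - timestamp.to_i).abs > tolerance

  constant_time_equal?(sign(secret, timestamp, body), signature)
end
```

Because the timestamp is part of the signed string, an attacker cannot replay an old body with a fresh timestamp without knowing the secret. In practice the secret would come from the environment or a secret manager rather than from source code.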

### Network Security

- **TLS Termination**: Always terminate TLS/SSL at the load balancer or reverse proxy
- **IP Whitelisting**: Implement IP restrictions at the network level when possible
- **Rate Limiting**: Implement rate limiting at the reverse proxy or load balancer level

## 📊 Performance Benchmarking

### Load Testing Recommendations

1. **Baseline Testing**: Test with minimal handlers and no lifecycle plugins
2. **Plugin Impact**: Measure performance impact of each lifecycle plugin
3. **Memory Profiling**: Monitor memory usage over extended periods
4. **Concurrency Testing**: Test with realistic concurrent webhook loads

### Key Metrics to Monitor

- **Request Processing Time**: P50, P95, P99 response times
- **Memory Usage**: RSS, heap size, GC frequency
- **Error Rates**: 4xx and 5xx response rates
- **Plugin Performance**: Individual plugin execution times
- **Resource Utilization**: CPU, memory, network I/O
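
For ad hoc analysis of load test output, P50/P95/P99 can be computed with the nearest-rank method. This is a generic sketch; `percentile` is a hypothetical helper, and a metrics backend would normally compute these values for you.

```ruby
# Nearest-rank percentile over a list of samples (for example, request
# processing times in milliseconds). +pct+ is a number from 0 to 100.
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.length / 100.0).ceil # 1-based rank of the sample
  sorted[[rank, 1].max - 1]
end

timings_ms = [12, 15, 11, 240, 14, 13, 18, 16, 17, 19]
p50 = percentile(timings_ms, 50)
p95 = percentile(timings_ms, 95)
p99 = percentile(timings_ms, 99)
```

The single slow request in `timings_ms` barely moves the median but dominates the tail percentiles, which is why tail latency is worth tracking separately from averages.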

## 🔧 Troubleshooting

### Common Performance Issues

1. **High Memory Usage**:
   - Check for plugin memory leaks
   - Monitor payload sizes
   - Review lifecycle plugin efficiency

2. **Slow Request Processing**:
   - Profile individual plugins
   - Check JSON parsing performance
   - Review handler implementation efficiency

3. **Plugin Loading Issues**:
   - Verify plugin directory permissions
   - Check plugin class name formatting
   - Review security validation errors

### Debug Configuration

For troubleshooting, temporarily enable debug logging:

```yaml
log_level: "debug"
environment: "development"  # Enables error backtraces
```

**Important**: Never run production with debug logging enabled long-term due to performance and security implications.

lib/hooks.rb

Lines changed: 27 additions & 12 deletions
@@ -3,20 +3,35 @@
 require_relative "hooks/version"
 require_relative "hooks/core/builder"
 
-# Load all core components
-Dir[File.join(__dir__, "hooks/core/**/*.rb")].sort.each do |file|
-  require file
-end
+# Load core components explicitly for better performance and security
+require_relative "hooks/core/config_loader"
+require_relative "hooks/core/config_validator"
+require_relative "hooks/core/logger_factory"
+require_relative "hooks/core/plugin_loader"
+require_relative "hooks/core/global_components"
+require_relative "hooks/core/log"
+require_relative "hooks/core/failbot"
+require_relative "hooks/core/stats"
 
-# Load all plugins (auth plugins, handler plugins, lifecycle hooks, etc.)
-Dir[File.join(__dir__, "hooks/plugins/**/*.rb")].sort.each do |file|
-  require file
-end
+# Load essential plugins explicitly
+require_relative "hooks/plugins/auth/base"
+require_relative "hooks/plugins/auth/hmac"
+require_relative "hooks/plugins/auth/shared_secret"
+require_relative "hooks/plugins/handlers/base"
+require_relative "hooks/plugins/handlers/default"
+require_relative "hooks/plugins/lifecycle"
+require_relative "hooks/plugins/instruments/stats_base"
+require_relative "hooks/plugins/instruments/failbot_base"
+require_relative "hooks/plugins/instruments/stats"
+require_relative "hooks/plugins/instruments/failbot"
 
-# Load all utils
-Dir[File.join(__dir__, "hooks/utils/**/*.rb")].sort.each do |file|
-  require file
-end
+# Load utils explicitly
+require_relative "hooks/utils/normalize"
+require_relative "hooks/utils/retry"
+
+# Load security module
+require_relative "hooks/security"
+require_relative "hooks/version"
 
 # Main module for the Hooks webhook server framework
 module Hooks

lib/hooks/app/helpers.rb

Lines changed: 43 additions & 11 deletions
@@ -21,13 +21,15 @@ def uuid
 # @return [void]
 # @note Timeout enforcement should be handled at the server level (e.g., Puma)
 def enforce_request_limits(config)
-  # Check content length (handle different header formats and sources)
-  content_length = headers["Content-Length"] || headers["CONTENT_LENGTH"] ||
-                   headers["content-length"] || headers["HTTP_CONTENT_LENGTH"] ||
-                   env["CONTENT_LENGTH"] || env["HTTP_CONTENT_LENGTH"]
+  # Optimized content length check - check most common sources first
+  content_length = request.content_length if respond_to?(:request) && request.respond_to?(:content_length)
 
-  # Also try to get from request object directly
-  content_length ||= request.content_length if respond_to?(:request) && request.respond_to?(:content_length)
+  content_length ||= headers["Content-Length"] ||
+                     headers["CONTENT_LENGTH"] ||
+                     headers["content-length"] ||
+                     headers["HTTP_CONTENT_LENGTH"] ||
+                     env["CONTENT_LENGTH"] ||
+                     env["HTTP_CONTENT_LENGTH"]
 
   content_length = content_length&.to_i
 
@@ -38,23 +40,29 @@ def enforce_request_limits(config)
   # Note: Timeout enforcement would typically be handled at the server level (Puma, etc.)
 end
 
-# Parse request payload
+# Parse request payload with security limits
 #
 # @param raw_body [String] The raw request body
 # @param headers [Hash] The request headers
 # @param symbolize [Boolean] Whether to symbolize keys in parsed JSON (default: true)
 # @return [Hash, String] Parsed JSON as Hash (optionally symbolized), or raw body if not JSON
 def parse_payload(raw_body, headers, symbolize: true)
-  content_type = headers["Content-Type"] || headers["CONTENT_TYPE"] || headers["content-type"] || headers["HTTP_CONTENT_TYPE"]
+  # Optimized content type check - check most common header first
+  content_type = headers["Content-Type"] || headers["CONTENT_TYPE"] || headers["content-type"]
 
   # Try to parse as JSON if content type suggests it or if it looks like JSON
   if content_type&.include?("application/json") || (raw_body.strip.start_with?("{", "[") rescue false)
     begin
-      parsed_payload = JSON.parse(raw_body)
+      # Security: Limit JSON parsing depth and complexity to prevent JSON bombs
+      parsed_payload = safe_json_parse(raw_body)
       parsed_payload = parsed_payload.transform_keys(&:to_sym) if symbolize && parsed_payload.is_a?(Hash)
       return parsed_payload
-    rescue JSON::ParserError
-      # If JSON parsing fails, return raw body
+    rescue JSON::ParserError, ArgumentError => e
+      # If JSON parsing fails or security limits exceeded, return raw body
+      # Log security violations at debug level to avoid log spam
+      if e.message.include?("nesting") || e.message.include?("depth")
+        log.debug("JSON parsing security limit exceeded: #{e.message}")
+      end
     end
   end
 
@@ -79,6 +87,30 @@ def load_handler(handler_class_name)
 
 private
 
+# Safely parse JSON with security limits to prevent JSON bombs
+#
+# @param json_string [String] The JSON string to parse
+# @return [Hash, Array] Parsed JSON object
+# @raise [JSON::ParserError] If JSON is invalid
+# @raise [ArgumentError] If security limits are exceeded
+def safe_json_parse(json_string)
+  # Security limits for JSON parsing
+  max_nesting = ENV.fetch("JSON_MAX_NESTING", "20").to_i
+  max_create_depth = ENV.fetch("JSON_MAX_CREATE_DEPTH", "15").to_i
+
+  # Additional size check before parsing
+  if json_string.length > ENV.fetch("JSON_MAX_SIZE", "10485760").to_i # 10MB default
+    raise ArgumentError, "JSON payload too large for parsing"
+  end
+
+  JSON.parse(json_string, {
+    max_nesting: max_nesting,
+    create_additions: false, # Security: Disable object creation from JSON
+    object_class: Hash, # Use plain Hash instead of custom classes
+    array_class: Array # Use plain Array instead of custom classes
+  })
+end
+
 # Determine HTTP error code from exception
 #
 # @param exception [Exception] The exception to map to an HTTP status code
