Core Service Architecture & Configuration Management
Overview
Set up the foundational architecture for the Hetzner Certificate Rotation Service, including the main service structure, configuration management system, and systemd integration. This issue establishes the core framework upon which all other components will be built.
Requirements
Project Structure
Create the base Go project structure:
services/certificates/refresher/
├── cmd/
│   └── rotator/
│       └── main.go                  # Service entry point
├── internal/
│   ├── config/
│   │   └── config.go                # Configuration management
│   └── monitoring/
│       └── logging.go               # Basic structured logging setup
├── pkg/
│   └── errors/
│       └── errors.go                # Custom error types
├── configs/
│   ├── config.yaml.example
│   └── bootstrap.yaml.example
├── deployments/
│   └── systemd/
│       └── hetzner-cert-rotation.service
├── go.mod
├── go.sum
└── README.md
Configuration Management
Implement configuration loading and validation for two configuration files:
Main Configuration (config.yaml)
service:
  name: "hetzner-cert-rotation"
  pid_file: "/var/run/hetzner-cert-rotation.pid"
  log_level: "info"
  log_format: "json"

machine:
  identity: "hetzner-build-01"   # Unique machine identifier
  role: "build-server"           # build-server, application-server, etc.
  environment: "prod"            # dev, preprod, prod

certificates:
  storage_path: "/etc/certs"
  permissions:
    cert_file: 0644
    key_file: 0600
    directory: 0755
  backup_retention: 3            # Keep 3 previous certificate versions

rotation:
  check_interval: "1h"           # How often to check certificate expiry
  renewal_threshold: 0.3         # Renew at 30% of lifetime remaining
  renewal_window: "72h"          # Start renewal attempts this far before expiry
  max_attempts: 5                # Maximum renewal attempts before alerting

certificate_api:
  base_url: "https://certificate-api.internal"
  endpoints:
    issue: "/api/v1/certificates/issue"
    renew: "/api/v1/certificates/renew"
    status: "/api/v1/certificates/{serial}"
    ca_chain: "/api/v1/certificates/ca"
  timeout: "30s"
  retry_attempts: 3
  retry_backoff_intervals: ["1m", "5m", "15m"]

keycloak:
  base_url: "https://keycloak.internal"
  realm: "main"
  client_id: "hetzner-machine"
  token_endpoint: "/realms/main/protocol/openid-connect/token"
  jwt_cache_duration: "50m"      # Refresh JWT 10 minutes before expiry

network:
  health_check_interval: "30s"
  connectivity_timeout: "10s"
  degraded_mode_threshold: "5m"  # Enter degraded mode after 5 minutes offline

monitoring:
  metrics_port: 9091
  health_port: 8081

services:
  reload_commands:
    nginx: "systemctl reload nginx"
    haproxy: "systemctl reload haproxy"
    docker: "systemctl restart docker"
Bootstrap Configuration (bootstrap.yaml)
bootstrap:
  certificate_path: "/etc/certs/bootstrap/cert.pem"
  private_key_path: "/etc/certs/bootstrap/key.pem"
  ca_chain_path: "/etc/certs/bootstrap/ca-chain.pem"

initial_renewal:
  certificate_profile: "machine"
  common_name: "${MACHINE_IDENTITY}"
  san_entries:
    - "${MACHINE_IDENTITY}.internal"
    - "${MACHINE_IDENTITY}.hetzner"
  validity_days: 7
Main Service Entry Point
Implement cmd/rotator/main.go with:
- Command-line flag parsing
- Configuration file loading with validation
- Environment variable substitution (for ${MACHINE_IDENTITY})
- Graceful shutdown handling (SIGTERM, SIGINT)
- Signal handling for manual operations (SIGUSR1 for manual renewal trigger)
- PID file management
- Basic service lifecycle logging
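The flag parsing and signal handling items above could be sketched roughly as follows, using only the standard library. The names `classify`, `run`, `shutdownSignals`, and `manualRenewSignal` are hypothetical and not part of this issue; the `-config` flag and its default path are taken from the systemd unit. To keep the sketch terminating when run directly, `main` injects two signals into the channel itself; a deployed service would simply block in `run` until signals are delivered.

```go
package main

import (
	"flag"
	"log"
	"os"
	"os/signal"
	"syscall"
)

// Signals the service reacts to (hypothetical variable names).
var shutdownSignals = []os.Signal{syscall.SIGTERM, syscall.SIGINT}
var manualRenewSignal os.Signal = syscall.SIGUSR1

type action int

const (
	actionIgnore action = iota
	actionManualRenew
	actionShutdown
)

// classify maps a received signal to the action the service should take.
func classify(s os.Signal) action {
	if s == manualRenewSignal {
		return actionManualRenew
	}
	for _, t := range shutdownSignals {
		if s == t {
			return actionShutdown
		}
	}
	return actionIgnore
}

// run consumes signals until a shutdown signal arrives and returns it.
// A manual-renew signal calls onRenew and keeps the loop running.
func run(sigs <-chan os.Signal, onRenew func()) os.Signal {
	for s := range sigs {
		switch classify(s) {
		case actionManualRenew:
			onRenew()
		case actionShutdown:
			return s
		}
	}
	return nil
}

func main() {
	configPath := flag.String("config", "/etc/hetzner-cert-rotation/config.yaml", "path to config.yaml")
	flag.Parse()
	log.Printf("starting, config=%s pid=%d", *configPath, os.Getpid())

	sigs := make(chan os.Signal, 4)
	signal.Notify(sigs, append(shutdownSignals, manualRenewSignal)...)
	defer signal.Stop(sigs)

	// Demonstration only: inject a manual trigger and a shutdown request
	// so this sketch exits on its own instead of waiting for real signals.
	sigs <- manualRenewSignal
	sigs <- shutdownSignals[0]

	s := run(sigs, func() { log.Print("SIGUSR1 received: manual renewal requested") })
	log.Printf("received %v, shutting down gracefully", s)
}
```

Keeping the signal-to-action mapping in a pure function makes the SIGTERM, SIGINT, and SIGUSR1 cases easy to unit test without sending real signals to the test process.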
Structured Logging
Set up structured logging with:
- JSON format support
- Log levels (debug, info, warn, error)
- Contextual fields (machine identity, environment, etc.)
- Log rotation integration with systemd/journald
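As a rough illustration of the logging setup listed above, the sketch below uses the standard library's `log/slog` (available from Go 1.21) as a stand-in for zerolog or zap, which this issue recommends. The function names `parseLevel` and `newLogger` and the field names `machine_identity` and `environment` are assumptions. Writing to stdout is intentional: with `StandardOutput=journal`, journald captures the lines and handles rotation.

```go
package main

import (
	"fmt"
	"io"
	"log/slog"
	"os"
	"strings"
)

// parseLevel accepts the four levels named in config.yaml.
func parseLevel(s string) (slog.Level, error) {
	switch strings.ToLower(s) {
	case "debug":
		return slog.LevelDebug, nil
	case "info":
		return slog.LevelInfo, nil
	case "warn":
		return slog.LevelWarn, nil
	case "error":
		return slog.LevelError, nil
	}
	return 0, fmt.Errorf("invalid log_level %q", s)
}

// newLogger builds a logger that attaches the machine context to every record.
func newLogger(w io.Writer, level, format, identity, environment string) (*slog.Logger, error) {
	lvl, err := parseLevel(level)
	if err != nil {
		return nil, err
	}
	opts := &slog.HandlerOptions{Level: lvl}
	var h slog.Handler
	switch format {
	case "json":
		h = slog.NewJSONHandler(w, opts)
	case "text":
		h = slog.NewTextHandler(w, opts)
	default:
		return nil, fmt.Errorf("unsupported log_format %q", format)
	}
	return slog.New(h).With("machine_identity", identity, "environment", environment), nil
}

func main() {
	logger, err := newLogger(os.Stdout, "info", "json", "hetzner-build-01", "prod")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	logger.Info("service starting", "pid", os.Getpid())
	logger.Debug("not emitted while log_level is info")
}
```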
Error Types
Define custom error types in pkg/errors/errors.go:
// ErrorCode is a stable, machine-readable identifier for a failure class.
type ErrorCode string

const (
	ErrCodeConfigInvalid       ErrorCode = "CONFIG_INVALID"
	ErrCodeCertificateExpired  ErrorCode = "CERT_EXPIRED"
	ErrCodeNetworkUnavailable  ErrorCode = "NETWORK_UNAVAILABLE"
	ErrCodeAuthFailed          ErrorCode = "AUTH_FAILED"
	ErrCodeRenewalFailed       ErrorCode = "RENEWAL_FAILED"
	ErrCodeServiceReloadFailed ErrorCode = "SERVICE_RELOAD_FAILED"
)

// ServiceError carries an error code, a human-readable message, and optional context.
type ServiceError struct {
	Code    ErrorCode
	Message string
	Details map[string]interface{} // Structured context for troubleshooting
	Cause   error                  // Underlying error, if any
}
Systemd Service Unit
Create deployments/systemd/hetzner-cert-rotation.service:
[Unit]
Description=Hetzner Certificate Rotation Service
After=network-online.target tailscaled.service
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/local/bin/hetzner-cert-rotation -config /etc/hetzner-cert-rotation/config.yaml
ExecReload=/bin/kill -SIGUSR1 $MAINPID
Restart=always
RestartSec=30
User=cert-rotation
Group=ssl-cert
StandardOutput=journal
StandardError=journal
SyslogIdentifier=hetzner-cert-rotation
# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/etc/certs /var/run
[Install]
WantedBy=multi-user.target
Acceptance Criteria
Configuration Loading
- Successfully loads and validates both config.yaml and bootstrap.yaml
- Supports environment variable substitution
- Validates all required fields and value ranges
- Provides clear error messages for configuration issues
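One possible shape for the range validation and error reporting described above, shown for the rotation section only. This is a sketch under assumptions: the struct and method names are hypothetical, YAML decoding (viper or similar) is left out, and durations are kept as strings as they appear in config.yaml. Collecting all problems with `errors.Join` lets the service report every invalid field in one pass instead of stopping at the first.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// RotationConfig mirrors the rotation section of config.yaml (hypothetical type).
type RotationConfig struct {
	CheckInterval    string
	RenewalThreshold float64
	RenewalWindow    string
	MaxAttempts      int
}

// positiveDuration checks that value parses as a duration greater than zero.
func positiveDuration(field, value string) error {
	d, err := time.ParseDuration(value)
	if err != nil {
		return fmt.Errorf("%s: %w", field, err)
	}
	if d <= 0 {
		return fmt.Errorf("%s: must be positive, got %s", field, value)
	}
	return nil
}

// Validate returns nil when every field is usable, otherwise one joined
// error that names each offending field.
func (c RotationConfig) Validate() error {
	var errs []error
	errs = append(errs, positiveDuration("rotation.check_interval", c.CheckInterval))
	errs = append(errs, positiveDuration("rotation.renewal_window", c.RenewalWindow))
	if c.RenewalThreshold <= 0 || c.RenewalThreshold >= 1 {
		errs = append(errs, fmt.Errorf("rotation.renewal_threshold: must be between 0 and 1 (exclusive), got %v", c.RenewalThreshold))
	}
	if c.MaxAttempts < 1 {
		errs = append(errs, fmt.Errorf("rotation.max_attempts: must be at least 1, got %d", c.MaxAttempts))
	}
	return errors.Join(errs...) // nil entries are dropped; all-nil yields nil
}

func main() {
	bad := RotationConfig{CheckInterval: "soon", RenewalThreshold: 1.5, RenewalWindow: "72h", MaxAttempts: 0}
	if err := bad.Validate(); err != nil {
		fmt.Println("configuration invalid:")
		fmt.Println(err)
	}
}
```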
Service Lifecycle
- Service starts and creates PID file
- Responds to SIGTERM/SIGINT with graceful shutdown
- Responds to SIGUSR1 for manual trigger (logs receipt for now)
- Integrates with systemd notify protocol
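For the notify protocol item, systemd passes a datagram socket path in the `NOTIFY_SOCKET` environment variable when the unit uses `Type=notify`, and the service reports readiness by writing `READY=1` to it. A minimal standard-library sketch follows; the function name `sdNotify` is hypothetical, and a maintained library such as coreos/go-systemd could be used instead if it is on the approved list.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"os"
)

// sdNotify sends state (for example "READY=1" or "STOPPING=1") to the
// systemd notification socket. It returns (false, nil) when socketPath is
// empty, which is the case when the process was not started by systemd
// with Type=notify.
func sdNotify(socketPath, state string) (bool, error) {
	if socketPath == "" {
		return false, nil
	}
	addr := &net.UnixAddr{Name: socketPath, Net: "unixgram"}
	conn, err := net.DialUnix("unixgram", nil, addr)
	if err != nil {
		return false, fmt.Errorf("dial notify socket: %w", err)
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(state)); err != nil {
		return false, fmt.Errorf("write notify state: %w", err)
	}
	return true, nil
}

func main() {
	sock := os.Getenv("NOTIFY_SOCKET")

	// Call once configuration is loaded and the service is ready to work.
	sent, err := sdNotify(sock, "READY=1")
	if err != nil {
		log.Printf("systemd notify failed: %v", err)
	}
	log.Printf("readiness notification sent: %v", sent)

	// Call at the start of graceful shutdown.
	if _, err := sdNotify(sock, "STOPPING=1"); err != nil {
		log.Printf("systemd notify failed: %v", err)
	}
}
```

Treating a missing socket as a no-op lets the same binary run under systemd and from a developer shell without extra flags.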
Logging
- Outputs structured JSON logs at configured level
- Includes contextual information (machine identity, etc.)
- Properly integrates with systemd journal
Error Handling
- Uses custom error types consistently
- Provides detailed error context for troubleshooting
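To use `ServiceError` consistently with the standard `errors` package, the type needs `Error` and `Unwrap` methods. The sketch below repeats a subset of the type definition so it compiles on its own; the helpers `Wrap` and `CodeOf` and the message format are assumptions, not part of this issue.

```go
package main

import (
	"errors"
	"fmt"
)

type ErrorCode string

const (
	ErrCodeConfigInvalid ErrorCode = "CONFIG_INVALID"
	ErrCodeRenewalFailed ErrorCode = "RENEWAL_FAILED"
)

type ServiceError struct {
	Code    ErrorCode
	Message string
	Details map[string]interface{}
	Cause   error
}

// Error formats the error as "CODE: message" and appends the cause if set.
func (e *ServiceError) Error() string {
	if e.Cause != nil {
		return fmt.Sprintf("%s: %s: %v", e.Code, e.Message, e.Cause)
	}
	return fmt.Sprintf("%s: %s", e.Code, e.Message)
}

// Unwrap exposes the cause so errors.Is and errors.As can walk the chain.
func (e *ServiceError) Unwrap() error { return e.Cause }

// Wrap builds a ServiceError around an optional cause (hypothetical helper).
func Wrap(code ErrorCode, msg string, cause error) *ServiceError {
	return &ServiceError{Code: code, Message: msg, Cause: cause}
}

// CodeOf returns the code of the first ServiceError found in err's chain.
func CodeOf(err error) (ErrorCode, bool) {
	var se *ServiceError
	if errors.As(err, &se) {
		return se.Code, true
	}
	return "", false
}

func main() {
	cause := errors.New("connection refused")
	err := fmt.Errorf("rotation cycle: %w",
		Wrap(ErrCodeRenewalFailed, "renew request failed", cause))

	code, ok := CodeOf(err)
	fmt.Println(code, ok, errors.Is(err, cause))
	fmt.Println(err)
}
```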
Implementation Notes
- Use viper or similar for configuration management
- Use zerolog or zap for structured logging
- Implement configuration validation using struct tags
- Ensure all file paths are validated for existence/permissions where appropriate
- Add configuration hot-reload capability if time permits (watch config file for changes)
Dependencies
- Go 1.21 or later
- No external service dependencies for this phase
- Standard library plus approved third-party packages (viper, zerolog/zap)
Testing Requirements
- Unit tests for configuration loading and validation
- Unit tests for error type creation and formatting
- Integration test for service startup and shutdown
- Test environment variable substitution
- Test signal handling (SIGTERM, SIGINT, SIGUSR1)
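The environment variable substitution test could target a helper like the one sketched here. It relies on `os.Expand` from the standard library; the names `expandStrict` and `fromMap` are hypothetical. Taking the lookup function as a parameter means tests can supply a fixed map rather than mutating the process environment. Note that `os.Expand` also rewrites bare `$VAR` references, so any literal `$` in a configuration value would need separate handling.

```go
package main

import (
	"fmt"
	"os"
	"sort"
	"strings"
)

// expandStrict replaces ${VAR} references in s using lookup and fails if
// any referenced variable is unset, listing all of them in the error.
func expandStrict(s string, lookup func(string) (string, bool)) (string, error) {
	missing := map[string]struct{}{}
	out := os.Expand(s, func(name string) string {
		if v, ok := lookup(name); ok {
			return v
		}
		missing[name] = struct{}{}
		return ""
	})
	if len(missing) > 0 {
		names := make([]string, 0, len(missing))
		for n := range missing {
			names = append(names, n)
		}
		sort.Strings(names)
		return "", fmt.Errorf("unset variables: %s", strings.Join(names, ", "))
	}
	return out, nil
}

// fromMap adapts a map to the lookup signature, which is handy in tests.
func fromMap(m map[string]string) func(string) (string, bool) {
	return func(name string) (string, bool) {
		v, ok := m[name]
		return v, ok
	}
}

func main() {
	// In the service, os.LookupEnv would be passed as the lookup function.
	for _, san := range []string{"${MACHINE_IDENTITY}.internal", "${MACHINE_IDENTITY}.hetzner"} {
		out, err := expandStrict(san, os.LookupEnv)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```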