System Design — Quick Reference
Client → Middleware Chain → Handler → Service → Repository → PostgreSQL
└→ Cache (Redis)
Prometheus scrapes /metrics every 15s → Grafana visualizes
Middleware order: Logger → Metrics → Error Handler → Recovery → Body Size Limiter → CORS → Rate Limiter
cmd/server/main.go Entry point, wires all dependencies
cmd/migrate/main.go Migration CLI (up/down/goto/version)
internal/
config/ Viper config (YAML + env vars)
db/ GORM + PostgreSQL connection
cache/ Redis interface (Get/Set/Increment)
errors/ Structured error types & middleware
health/ Liveness & readiness probes
middleware/ Logger, Metrics, Rate Limiter, Pagination
payment/ Full domain: handler → service → repository
server/ Router setup
migrations/ SQL files (timestamped up/down pairs)
1. Logger — assigns Request ID, logs method/path/status/duration
2. Metrics — records http_requests_total, http_request_duration_seconds
3. Error Handler — converts c.Error() into structured JSON response
4. Recovery — catches panics
5. Body Size Limiter — rejects requests > 64KB
6. CORS — sets cross-origin headers
7. Rate Limiter — IP-based, Redis (distributed) or LRU (in-memory)
8. Handler — validates input, calls Service
9. Service — business logic, idempotency check, calls Repository
10. Repository — GORM queries to PostgreSQL
Payment Flow (POST /api/v1/pay)
Client
│
│ POST /api/v1/pay { user_id, amount, transaction_id }
▼
┌──────────────────────────────────────────────────────────────────┐
│ Middleware Chain │
│ Logger → Metrics → Error Handler → Recovery → CORS → Rate Limit │
└──────────────────────────┬───────────────────────────────────────┘
│
▼
┌──────────────┐
│ Handler │ Validate JSON body (binding tags)
└──────┬───────┘
│
▼
┌──────────────┐ ┌───────────────────┐
│ Service │────►│ Redis Cache │
│ │ │ idempotency:{txID}│
│ 1. Check │ └───────────────────┘
│ idempotency │ │
│ │ cache hit?
│ │ ┌────┴────┐
│ │ │ YES │ NO
│ │ ▼ ▼
│ │ Return ┌──────────────────┐
│ │ cached TX │ Repository │
│ │ │ FindByTxID (DB) │
│ │ └────────┬─────────┘
│ │ │
│ │ found in DB?
│ │ ┌─────┴─────┐
│ │ │ YES │ NO
│ │ ▼ ▼
│ │ Cache it ┌─────────────────┐
│ │ & return │ 2. Validate User│
│ │ └────────┬────────┘
│ │ │
│ │ ┌─────────────────┼──────────────┐
│ │ ▼ ▼
│ │ ┌──────────────────┐ ┌───────────────────┐
│ │ │ Redis Cache │ │ Repository │
│ │ │ user_exists:{uid} │ │ UserExists (DB) │
│ │ └──────────────────┘ └───────────────────┘
│ │ │ │
│ │ ▼ ▼
│ │ found? → OK found? → cache "1"
│ │ not found → check DB not found → ErrUserNotFound
│ │ │
│ │ ▼
│ 3. Create │ ┌─────────────────┐
│ Transaction │─────────────►│ Repository │
│ │ │ Create (DB) │
└──────┬───────┘ └────────┬────────┘
│ │
│ unique constraint
│ violation? (race)
│ ┌───────┴───────┐
│ │ YES │ NO
│ ▼ ▼
│ Re-fetch & Cache new TX
│ return existing in Redis (24h)
│ │
▼ │
┌──────────────┐ │
│ Handler │◄───────────────────────────┘
│ Map result │
│ to HTTP │
└──────┬───────┘
│
▼
┌─────────────────┐
│ HTTP Response │
│ 200: success │
│ 400: validation │
│ 404: user !exist│
│ 429: rate limit │
│ 500: server err │
└─────────────────┘
Key properties:
Idempotent: Same transaction_id always returns the same result (two-tier: Redis → DB)
Race-safe: Concurrent duplicates handled via DB unique constraint + re-fetch
Cache-resilient: Redis unavailable → falls back to DB-only path
User cache: Validated users cached 5min to reduce DB lookups
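The two-tier idempotency check and the race-safe create can be sketched with plain interfaces and in-memory fakes (all type and method names here are illustrative; the real repository uses GORM and the real cache uses Redis):

```go
package main

import (
	"errors"
	"fmt"
)

// Tx is a minimal transaction record.
type Tx struct{ ID string }

// Cache and Repo are the two idempotency tiers: Redis first, DB second.
type Cache interface {
	Get(key string) (*Tx, bool)
	Set(key string, tx *Tx)
}
type Repo interface {
	FindByTxID(id string) (*Tx, error)
	Create(tx *Tx) error
}

var ErrNotFound = errors.New("not found")

// ProcessPayment returns the existing transaction when the same
// transaction_id is replayed: check the cache, then the DB, then create.
func ProcessPayment(c Cache, r Repo, txID string) (*Tx, error) {
	key := "idempotency:" + txID
	if tx, ok := c.Get(key); ok { // tier 1: cache hit
		return tx, nil
	}
	if tx, err := r.FindByTxID(txID); err == nil { // tier 2: DB hit
		c.Set(key, tx) // backfill the cache
		return tx, nil
	}
	tx := &Tx{ID: txID}
	if err := r.Create(tx); err != nil {
		// Race: a concurrent duplicate won the unique constraint,
		// so re-fetch and return the existing row.
		return r.FindByTxID(txID)
	}
	c.Set(key, tx)
	return tx, nil
}

// In-memory fakes standing in for Redis and PostgreSQL.
type memCache struct{ m map[string]*Tx }

func (c *memCache) Get(k string) (*Tx, bool) { t, ok := c.m[k]; return t, ok }
func (c *memCache) Set(k string, t *Tx)      { c.m[k] = t }

type memRepo struct{ m map[string]*Tx }

func (r *memRepo) FindByTxID(id string) (*Tx, error) {
	if t, ok := r.m[id]; ok {
		return t, nil
	}
	return nil, ErrNotFound
}
func (r *memRepo) Create(t *Tx) error {
	if _, ok := r.m[t.ID]; ok {
		return errors.New("unique constraint violation")
	}
	r.m[t.ID] = t
	return nil
}

func main() {
	c := &memCache{m: map[string]*Tx{}}
	r := &memRepo{m: map[string]*Tx{}}
	a, _ := ProcessPayment(c, r, "tx-1")
	b, _ := ProcessPayment(c, r, "tx-1") // replay hits the cache
	fmt.Println(a == b)                  // true: same transaction returned
}
```

With an empty cache but a populated repo, the same function falls through to the DB tier and backfills the cache, which is the "cache-resilient" path described above.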
Clean Architecture (Payment example)

| Layer      | Knows                       | Does NOT know  |
|------------|-----------------------------|----------------|
| Handler    | HTTP, JSON, status codes    | DB, cache      |
| Service    | Business rules, idempotency | HTTP, JSON     |
| Repository | GORM, SQL                   | Business rules |
| Endpoint          | Purpose                                          |
|-------------------|--------------------------------------------------|
| GET /health       | Always 200                                       |
| GET /health/live  | Liveness — is process alive?                     |
| GET /health/ready | Readiness — checks DB, returns 503 if unhealthy  |
| GET /metrics      | Prometheus scrape target                         |
| GET /swagger/*    | Interactive API docs (non-production only)       |
| POST /api/v1/pay  | Idempotent payment processing                    |
| Purpose              | Key                         | TTL    |
|----------------------|-----------------------------|--------|
| Idempotency          | idempotency:{transactionID} | 24h    |
| User existence cache | user_exists:{userID}        | 5min   |
| Rate limiting        | ratelimit:{IP}              | window |
{
  "success": false,
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Validation failed",
    "details": { "amount": "must be at least 1" },
    "request_id": "uuid",
    "path": "/api/v1/pay",
    "timestamp": "2026-03-06T12:00:00Z"
  }
}
Redis available → distributed sliding-window counter (shared across instances)
Redis down → in-memory token-bucket per instance (LRU, 5000 slots)
Redis fails mid-request → fail-open (request allowed)
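The in-memory fallback can be sketched as a per-IP token bucket (the real limiter also caps entries with a 5000-slot LRU, which is omitted here; rate and burst numbers are illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// bucket refills at `rate` tokens per second up to `burst`.
type bucket struct {
	tokens float64
	last   time.Time
}

// Limiter keeps one bucket per client IP. No eviction here — the
// real implementation bounds memory with an LRU.
type Limiter struct {
	buckets map[string]*bucket
	rate    float64
	burst   float64
}

func NewLimiter(rate, burst float64) *Limiter {
	return &Limiter{buckets: map[string]*bucket{}, rate: rate, burst: burst}
}

// Allow spends one token for ip, refilling based on elapsed time.
func (l *Limiter) Allow(ip string) bool {
	now := time.Now()
	b, ok := l.buckets[ip]
	if !ok {
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[ip] = b
	}
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	l := NewLimiter(1, 3) // 1 req/s steady, burst of 3
	for i := 0; i < 4; i++ {
		fmt.Println(l.Allow("10.0.0.1"))
	}
	// true true true false
}
```

Note this sketch is not safe for concurrent use; the real middleware would guard the map with a mutex.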
/health/live — no dependency checks, so a degraded DB or Redis never triggers a container restart
/health/ready — checks PostgreSQL response time; pass <100ms, warn 100–500ms, fail >500ms
Dev: full Go toolchain + Air hot-reload, source mounted as a volume
Prod: multi-stage build, Alpine + binary only (~20MB)
DB and Redis ports internal-only (not exposed to host)
Config priority (highest → lowest)
Environment variables → config.{env}.yaml → config.yaml → hardcoded defaults
Config → PostgreSQL → Migrations check → Redis (optional) → Repo → Service → Handler → Router → HTTP server
SIGINT/SIGTERM → stop accepting → drain in-flight requests (30s) → close DB → close Redis → exit 0
| Decision                     | Reason                                        |
|------------------------------|-----------------------------------------------|
| Clean Architecture           | Testable, swappable layers                    |
| Interfaces everywhere        | Mock in tests, swap implementations           |
| Redis optional               | Availability over strict caching              |
| Two-tier idempotency         | Cache for speed, DB for durability            |
| SQL migrations               | Explicit, reversible, reviewable              |
| Fail-open rate limiting      | Redis outage shouldn't block users            |
| Separate liveness/readiness  | Kubernetes: "restart" vs "drain traffic"      |
| Graceful shutdown            | Zero-downtime deploys, no broken transactions |
| Prometheus histograms        | Aggregatable across multiple instances        |
| Request ID on every error    | Clients can report an ID operators can trace in the logs |