Performance and Scaling

This guide covers performance tuning, scaling strategies, capacity planning, and optimization techniques for TMI deployments.

Overview

TMI performance optimization and scaling involves:

  • Application performance tuning
  • Database optimization and scaling
  • Cache performance optimization
  • Horizontal and vertical scaling strategies
  • Load balancing and high availability
  • Capacity planning and monitoring

Quick Performance Checks

Application Performance

# Check response times
curl -w "@curl-format.txt" -o /dev/null -s https://api.tmi.example.com/version

# curl-format.txt:
time_total:        %{time_total}\n
time_connect:      %{time_connect}\n
time_starttransfer:%{time_starttransfer}\n
size_download:     %{size_download}\n

# Load test with Apache Bench
ab -n 1000 -c 10 https://api.tmi.example.com/version

# WebSocket connection test
wscat -c "wss://api.tmi.example.com/ws/diagrams/{id}" \
  -H "Authorization: Bearer $TOKEN"

Database Performance

-- Check slow queries (requires the pg_stat_statements extension)
-- Note: on PostgreSQL 13+ these columns are named mean_exec_time and total_exec_time
SELECT
  query,
  mean_time,
  calls,
  total_time
FROM pg_stat_statements
WHERE mean_time > 100   -- queries averaging more than 100 ms
ORDER BY mean_time DESC
LIMIT 10;

-- Check database size and growth
SELECT
  pg_database.datname,
  pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database;

-- Check connection count
SELECT count(*) FROM pg_stat_activity;

Cache Performance

# Redis hit rate
redis-cli -h redis-host -a password info stats | \
  awk '/keyspace_hits|keyspace_misses/ {
    split($0,a,":");
    if ($1 ~ /hits/) hits=a[2];
    if ($1 ~ /misses/) misses=a[2]
  }
  END {
    total=hits+misses;
    rate=(hits/total)*100;
    printf "Hit Rate: %.2f%%\n", rate
  }'

# Memory usage
redis-cli -h redis-host -a password info memory | grep used_memory_human

Application Performance Tuning

Server Configuration

Timeout Settings

Optimize HTTP timeouts for your workload:

# config-production.yml
server:
  read_timeout: 5s       # Time to read request
  write_timeout: 10s     # Time to write response
  idle_timeout: 60s      # Idle connection timeout

For high-latency clients or large payloads:

server:
  read_timeout: 15s
  write_timeout: 30s
  idle_timeout: 120s

Via environment:

SERVER_READ_TIMEOUT=15s
SERVER_WRITE_TIMEOUT=30s
SERVER_IDLE_TIMEOUT=120s
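
Under the hood these settings correspond to Go's net/http server timeouts. A minimal sketch, assuming TMI uses the standard net/http server (the address and values are illustrative):

// Sketch: how the timeout settings above map onto Go's net/http server.
package main

import (
	"net/http"
	"time"
)

func main() {
	srv := &http.Server{
		Addr:         ":8080",
		ReadTimeout:  15 * time.Second,  // server.read_timeout
		WriteTimeout: 30 * time.Second,  // server.write_timeout
		IdleTimeout:  120 * time.Second, // server.idle_timeout
	}
	_ = srv.ListenAndServe()
}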

WebSocket Configuration

# WebSocket inactivity timeout
WEBSOCKET_INACTIVITY_TIMEOUT_SECONDS=300  # 5 minutes

# For high-activity collaboration
WEBSOCKET_INACTIVITY_TIMEOUT_SECONDS=600  # 10 minutes

Resource Limits

Go Runtime Tuning

# Set maximum Go processes (default: number of CPU cores)
GOMAXPROCS=8

# Garbage collection tuning
GOGC=100  # Default - adjust based on memory patterns

# For memory-constrained environments
GOGC=80   # More frequent GC, lower memory usage

# For CPU-constrained environments
GOGC=200  # Less frequent GC, higher memory usage

System Resource Limits

For systemd service:

# /etc/systemd/system/tmi.service
[Service]
# Maximum processes
LimitNPROC=512

# Maximum open files
LimitNOFILE=65536

# Memory limit (MemoryMax= is the current directive; MemoryLimit= is the legacy cgroup v1 name)
MemoryMax=1G

# CPU limit (100% of one core)
CPUQuota=100%

For Docker:

docker run -d \
  --name tmi-server \
  --memory="1g" \
  --cpus="2.0" \
  --ulimit nofile=65536:65536 \
  tmi/tmi-server:latest

For Kubernetes:

resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "1Gi"
    cpu: "2000m"

Logging Configuration

Optimize logging for performance:

logging:
  level: "info"                    # Use 'warn' or 'error' for production
  log_api_requests: false          # Disable in high-traffic production
  log_api_responses: false         # Disable to reduce I/O
  log_websocket_messages: false    # Disable for performance
  redact_auth_tokens: true         # Security
  suppress_unauth_logs: true       # Reduce noise

For high-performance production:

LOGGING_LEVEL=warn
LOGGING_LOG_API_REQUESTS=false
LOGGING_LOG_API_RESPONSES=false
LOGGING_LOG_WEBSOCKET_MESSAGES=false

Database Performance Tuning

PostgreSQL Configuration

Connection Pool Optimization

Configure connection pooling:

database:
  postgres:
    max_open_conns: 25      # Max concurrent connections
    max_idle_conns: 5       # Idle connections to maintain
    conn_max_lifetime: 5m   # Connection lifetime

Sizing guidelines:

  • Small deployment (< 100 users): max_open_conns: 10, max_idle_conns: 2
  • Medium deployment (100-1000 users): max_open_conns: 25, max_idle_conns: 5
  • Large deployment (1000+ users): max_open_conns: 50, max_idle_conns: 10
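
For reference, these settings map onto Go's standard database/sql pool. A minimal sketch, assuming TMI uses database/sql (the package name and driver import are illustrative, not TMI's actual code):

// Hypothetical sketch: applying the pool settings above via database/sql.
package dbpool

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // illustrative driver choice
)

// OpenPool opens a PostgreSQL connection pool sized for a medium deployment.
func OpenPool(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(25)                 // max_open_conns
	db.SetMaxIdleConns(5)                  // max_idle_conns
	db.SetConnMaxLifetime(5 * time.Minute) // conn_max_lifetime
	return db, nil
}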

PostgreSQL Server Settings

Edit /etc/postgresql/*/main/postgresql.conf:

# Memory Settings
shared_buffers = 256MB           # 25% of RAM (for dedicated server)
effective_cache_size = 1GB       # 50-75% of RAM
work_mem = 16MB                  # Per-operation memory
maintenance_work_mem = 64MB      # For VACUUM, CREATE INDEX

# Connection Settings
max_connections = 100            # Adjust based on connection pool

# Query Planner
random_page_cost = 1.1           # For SSD (default 4.0 for HDD)
effective_io_concurrency = 200   # For SSD (default 1)

# Write Performance
wal_buffers = 16MB
checkpoint_completion_target = 0.9

For production with 4GB RAM:

shared_buffers = 1GB
effective_cache_size = 3GB
work_mem = 32MB
maintenance_work_mem = 256MB

Restart PostgreSQL after changes:

sudo systemctl restart postgresql

Index Optimization

Check for missing indexes:

-- Tables with high sequential scan counts
SELECT
  schemaname,
  tablename,
  seq_scan,
  idx_scan,
  seq_tup_read,
  CASE
    WHEN seq_scan > 0 THEN seq_tup_read / seq_scan
    ELSE 0
  END AS avg_seq_tup_per_scan
FROM pg_stat_user_tables
WHERE seq_scan > 0
  AND schemaname = 'public'
ORDER BY seq_tup_read DESC
LIMIT 20;

TMI's key indexes (already created by migrations):

-- Primary key indexes (automatic)
-- Foreign key indexes
CREATE INDEX idx_threats_threat_model_id ON threats(threat_model_id);
CREATE INDEX idx_diagrams_threat_model_id ON diagrams(threat_model_id);

-- Query optimization indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_threats_threat_model_id_created_at ON threats(threat_model_id, created_at);

Check index usage:

SELECT
  schemaname,
  tablename,
  indexname,
  idx_scan,
  idx_tup_read
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan DESC;

-- Find unused indexes
SELECT
  schemaname,
  tablename,
  indexname
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexname NOT LIKE '%_pkey'
  AND schemaname = 'public';

Query Optimization

Analyze slow queries:

-- Enable query timing
\timing on

-- Example query analysis
EXPLAIN ANALYZE
SELECT * FROM threats
WHERE threat_model_id = 'uuid-here'
ORDER BY created_at DESC
LIMIT 50;

Optimize query patterns:

-- Use LIMIT for large result sets
SELECT * FROM threats LIMIT 50;

-- Use appropriate indexes
-- Good: Uses index
SELECT * FROM threats WHERE threat_model_id = 'uuid';

-- Bad: leading-wildcard search cannot use a B-tree index (full table scan)
SELECT * FROM threats WHERE lower(title) LIKE '%search%';

-- Better: a functional index covers equality and prefix matches on lower(title)
CREATE INDEX idx_threats_title_lower ON threats(lower(title));
-- For '%substring%' searches, consider a pg_trgm GIN index or full-text search

Vacuum and Analyze

Regular maintenance:

# Manual vacuum and analyze
psql -h postgres-host -U tmi_user -d tmi -c "VACUUM ANALYZE;"

# Check last vacuum/analyze
psql -h postgres-host -U tmi_user -d tmi -c "
  SELECT
    schemaname,
    tablename,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze,
    n_dead_tup
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC"

Configure autovacuum in postgresql.conf:

autovacuum = on
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50

PostgreSQL Scaling

Read Replicas

For read-heavy workloads, add read replicas:

# Configure read replica
database:
  postgres:
    primary:
      host: "postgres-primary"
      port: 5432
    replicas:
      - host: "postgres-replica-1"
        port: 5432
      - host: "postgres-replica-2"
        port: 5432

Replication setup:

# On primary server (postgresql.conf)
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB

# Create replication user
psql -U postgres -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'password';"

# On replica server
# Use pg_basebackup to initialize replica
pg_basebackup -h primary-host -D /var/lib/postgresql/data -U replicator -P -v

Connection Pooling (PgBouncer)

For high-connection environments:

# Install PgBouncer
sudo apt-get install pgbouncer

# Configure /etc/pgbouncer/pgbouncer.ini
[databases]
tmi = host=postgres-host port=5432 dbname=tmi

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25

# Start PgBouncer
systemctl start pgbouncer

# Update TMI to use PgBouncer
POSTGRES_HOST=localhost
POSTGRES_PORT=6432

Redis Performance Tuning

Memory Configuration

# Edit /etc/redis/redis.conf

# Set memory limit
maxmemory 1gb

# Eviction policy
maxmemory-policy allkeys-lru  # Evict least recently used keys
# Or: volatile-lru (only evict keys with TTL)

# Memory optimization
hash-max-ziplist-entries 512
hash-max-ziplist-value 64

Persistence Configuration

Balance performance vs durability:

# For performance (may lose data on crash)
appendonly no
save ""

# Balanced (recommended)
appendonly yes
appendfsync everysec
save 900 1
save 300 10

# For durability (slower writes)
appendonly yes
appendfsync always

Redis Optimization

# Disable slow commands
rename-command KEYS ""
rename-command FLUSHALL ""

# TCP backlog
tcp-backlog 511

# TCP keepalive
tcp-keepalive 300

# Lazy freeing
lazyfree-lazy-eviction yes
lazyfree-lazy-expire yes

Cache TTL Strategy

TMI's cache TTL configuration:

Cache Type      TTL          Justification
Threat Models   10 minutes   Core entities, moderate updates
Diagrams        2 minutes    High collaboration, real-time
Sub-resources   5 minutes    Threats, documents, sources
Authorization   15 minutes   Security-critical, infrequent changes
Metadata        7 minutes    Flexible data, moderate updates
Lists           5 minutes    Paginated results

Adjust based on your usage patterns:

// For high-collaboration environments (reduce TTL)
cache.Set("threat_model:"+id, data, 5*time.Minute)

// For read-heavy environments (increase TTL)
cache.Set("threat_model:"+id, data, 15*time.Minute)

Scaling Strategies

Horizontal Scaling

Load Balancing

Nginx load balancer:

# /etc/nginx/conf.d/tmi-upstream.conf
upstream tmi_backend {
    least_conn;  # Or: ip_hash for sticky sessions
    server tmi-server-1:8080 max_fails=3 fail_timeout=30s;
    server tmi-server-2:8080 max_fails=3 fail_timeout=30s;
    server tmi-server-3:8080 max_fails=3 fail_timeout=30s;
}

server {
    listen 443 ssl http2;
    server_name api.tmi.example.com;

    location / {
        proxy_pass http://tmi_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

HAProxy load balancer:

# /etc/haproxy/haproxy.cfg
frontend tmi_front
    bind *:443 ssl crt /etc/ssl/certs/tmi.pem
    default_backend tmi_back

backend tmi_back
    balance leastconn
    option httpchk GET /version
    http-check expect status 200
    server tmi1 tmi-server-1:8080 check
    server tmi2 tmi-server-2:8080 check
    server tmi3 tmi-server-3:8080 check

Docker Compose Scaling

# Scale to 3 instances
docker-compose up -d --scale tmi-server=3

# With explicit configuration
docker-compose -f docker-compose.yml -f docker-compose.scale.yml up -d

# docker-compose.scale.yml
version: "3.8"
services:
  tmi-server:
    deploy:
      replicas: 3

Kubernetes Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tmi-server-hpa
  namespace: tmi
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tmi-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

Vertical Scaling

Server Resources

Increase Docker container resources:

docker update tmi-server --memory="2g" --cpus="4.0"

Kubernetes resource increase:

resources:
  requests:
    memory: "1Gi"
    cpu: "1000m"
  limits:
    memory: "2Gi"
    cpu: "4000m"

Heroku dyno scaling:

# Scale to larger dyno type
heroku ps:resize web=standard-2x --app tmi-server

# Or Performance tier
heroku ps:resize web=performance-m --app tmi-server

Database Scaling

PostgreSQL vertical scaling:

-- Increase shared_buffers (requires a full restart; pg_reload_conf() is not sufficient)
ALTER SYSTEM SET shared_buffers = '2GB';
-- Then: sudo systemctl restart postgresql

-- Increase work_mem (a configuration reload is sufficient)
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();

Redis vertical scaling:

# Increase memory limit
redis-cli CONFIG SET maxmemory 2gb

# Make permanent in redis.conf
echo "maxmemory 2gb" >> /etc/redis/redis.conf

Geographic Distribution

For global deployments:

┌──────────────────────┐
│ Global Load Balancer │
└──────────┬───────────┘
           │
    ┌──────┴──────┐
    │             │
┌───▼────┐    ┌───▼────┐
│ US-East│    │ EU-West│
│ Region │    │ Region │
└────────┘    └────────┘
    │             │
 TMI+DB+Cache  TMI+DB+Cache

Consider:

  • Regional deployments
  • Database replication across regions
  • CDN for static assets
  • DNS-based routing

Capacity Planning

Resource Monitoring

Track key metrics for capacity planning:

-- Database growth rate
SELECT
  date_trunc('month', created_at) AS month,
  count(*) AS records
FROM threat_models
GROUP BY month
ORDER BY month;

-- User growth
SELECT
  date_trunc('week', created_at) AS week,
  count(*) AS new_users
FROM users
GROUP BY week
ORDER BY week;

Capacity Thresholds

Set alerts for capacity thresholds:

  • CPU: Alert at 70%, critical at 85%
  • Memory: Alert at 75%, critical at 90%
  • Disk: Alert at 75%, critical at 90%
  • Database connections: Alert at 70% of max
  • Redis memory: Alert at 80%, critical at 95%

Growth Projections

Calculate growth rates:

# Database size growth
# Current size: 5GB
# Growth: 100MB/month
# Projected size in 12 months: 5GB + (100MB * 12) = 6.2GB

# User growth
# Current: 100 users
# Growth: 20% month-over-month
# Projected in 12 months: 100 * (1.2^12) = ~900 users
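
The same arithmetic as a tiny Go program, handy for sanity-checking projections (the figures are the example values above, not measurements):

// Back-of-the-envelope growth projection matching the example above.
package main

import (
	"fmt"
	"math"
)

func main() {
	// Linear storage growth: current size plus monthly growth over 12 months.
	currentGB, monthlyGrowthGB, months := 5.0, 0.1, 12.0
	fmt.Printf("Projected DB size: %.1f GB\n", currentGB+monthlyGrowthGB*months) // 6.2 GB

	// Compound user growth: 20% month-over-month.
	currentUsers, monthlyRate := 100.0, 0.20
	fmt.Printf("Projected users: %.0f\n", currentUsers*math.Pow(1+monthlyRate, months)) // ~892
}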

Capacity Planning Checklist

  • Monitor resource utilization trends
  • Project growth rates (users, data, traffic)
  • Calculate resource needs for 6-12 months
  • Plan scaling activities before reaching thresholds
  • Budget for infrastructure growth
  • Test scaling procedures in staging
  • Document capacity baselines

Performance Benchmarking

Application Benchmarks

# HTTP endpoint benchmarking with Apache Bench
ab -n 10000 -c 100 -H "Authorization: Bearer $TOKEN" \
  https://api.tmi.example.com/api/threat-models

# WebSocket benchmarking
# Install: npm install -g websocket-bench
wsbench -c 100 -n 1000 wss://api.tmi.example.com/ws/diagrams/{id} \
  -H "Authorization: Bearer $TOKEN"

# Full load testing with k6
k6 run load-test.js

Example k6 script (load-test.js):

import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
  stages: [
    { duration: '2m', target: 100 },  // Ramp up
    { duration: '5m', target: 100 },  // Stay at 100 users
    { duration: '2m', target: 0 },    // Ramp down
  ],
};

export default function() {
  let response = http.get('https://api.tmi.example.com/api/threat-models', {
    headers: { 'Authorization': `Bearer ${__ENV.TOKEN}` },
  });

  check(response, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  sleep(1);
}

Run benchmark:

TOKEN=$YOUR_TOKEN k6 run load-test.js

Database Benchmarks

# PostgreSQL benchmarking with pgbench
createdb pgbench_test
pgbench -i -s 10 pgbench_test  # Initialize
pgbench -c 10 -j 2 -t 1000 pgbench_test  # Run benchmark

# Results show:
# - Transactions per second (TPS)
# - Average latency
# - Connection overhead

Performance Monitoring Dashboards

Key Performance Indicators (KPIs)

Application KPIs:

  • Request throughput (requests/second)
  • Response time percentiles (P50, P95, P99)
  • Error rate (percentage of 5xx responses)
  • WebSocket connection count
  • Active user sessions

Database KPIs:

  • Query response time
  • Connection count
  • Cache hit ratio
  • Replication lag
  • Table sizes

Infrastructure KPIs:

  • CPU utilization
  • Memory utilization
  • Disk I/O
  • Network throughput
  • Container restarts

Grafana Dashboard Examples

Create dashboards tracking:

System Overview:

  • Service uptime (%)
  • Request rate (req/s)
  • Error rate (%)
  • Active users
  • Response time (P95)

Database Performance:

  • Query duration (ms)
  • Connection count
  • Slow queries
  • Cache hit rate
  • Database size

Resource Utilization:

  • CPU usage (%)
  • Memory usage (%)
  • Disk usage (%)
  • Network I/O (MB/s)

Troubleshooting Performance Issues

High Response Times

Check:

  1. Database query performance
  2. Cache hit rates
  3. Network latency
  4. Application logs for errors
  5. Resource utilization (CPU, memory)

Solutions:

  • Optimize slow queries
  • Add missing indexes
  • Increase cache TTL
  • Scale horizontally
  • Optimize code

High CPU Usage

Check:

# Process CPU usage
top -p $(pgrep tmi-server)

# System CPU by process
ps aux --sort=-%cpu | head

Solutions:

  • Profile the application with Go pprof (see the sketch below)
  • Optimize hot code paths
  • Reduce logging
  • Scale horizontally
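
If the pprof endpoints are not already exposed (the heap example in the next subsection suggests they are, on port 8080), the standard way a Go service enables them is shown below. The separate internal port is an assumption for illustration, not TMI's actual configuration:

// Minimal sketch: expose Go pprof handlers on an internal-only port.
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on the default mux
)

func main() {
	// Bind to localhost so profiling data is never reachable publicly.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}

Once exposed, a 30-second CPU profile can be collected and inspected with: go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30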

Memory Leaks

Check:

# Memory usage over time
docker stats tmi-server --no-stream

# Go heap profile
curl http://localhost:8080/debug/pprof/heap > heap.prof
go tool pprof heap.prof

Solutions:

  • Analyze heap dump
  • Fix memory leaks in code
  • Increase garbage collection frequency
  • Restart services periodically

Database Connection Exhaustion

Check:

SELECT count(*) FROM pg_stat_activity;

Solutions:

  • Increase connection pool size
  • Use connection pooler (PgBouncer)
  • Fix connection leaks in the application (see the sketch below)
  • Optimize query execution time
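
On the application side, Go's database/sql exposes pool statistics that make connection leaks visible. A minimal sketch, assuming TMI uses database/sql (package and function names are illustrative):

// Hypothetical sketch: periodically log connection pool statistics.
package monitoring

import (
	"database/sql"
	"log"
	"time"
)

// LogPoolStats prints pool usage every 30 seconds; a steadily growing InUse
// count under a stable workload usually indicates a connection leak.
func LogPoolStats(db *sql.DB) {
	for range time.Tick(30 * time.Second) {
		s := db.Stats()
		log.Printf("pool: open=%d in_use=%d idle=%d wait_count=%d wait_total=%s",
			s.OpenConnections, s.InUse, s.Idle, s.WaitCount, s.WaitDuration)
	}
}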
