Security Operations

Eric Fitzgerald edited this page Nov 12, 2025 · 1 revision

This guide covers container security, TLS management, secrets management, security monitoring, and incident response for TMI operations.

Overview

TMI security operations cover:

  • Container security scanning and vulnerability management
  • TLS/SSL certificate management
  • Secrets and credentials management
  • Security monitoring and logging
  • Incident response procedures
  • Access control and authentication

Container Security

Security Scanning

TMI scans its container images for known vulnerabilities with Docker Scout.

Quick Security Scan

# Scan all containers for vulnerabilities
make containers-security-scan

# Generate security reports
make containers-security-report

# View security summary
cat security-reports/security-summary.md

Building Secure Containers

# Build containers with security patches
make containers-secure-build

# Build specific secure container
./scripts/build-secure-containers.sh postgresql

# Start development with secure containers
make containers-secure-dev

# Full security workflow
make containers-secure

Enhanced Dockerfiles

TMI provides security-hardened Dockerfiles:

  • Dockerfile.postgres.secure - PostgreSQL with patches
  • Dockerfile.redis.secure - Redis with security updates
  • Dockerfile.dev.secure - Application with hardening

Security improvements:

  • Automated vulnerability patching
  • Non-root user execution
  • Minimal attack surface
  • Security metadata labels
  • Enhanced logging

Vulnerability Scanning

Docker Scout Integration

# Scan specific image
docker scout cves tmi/tmi-postgresql:latest --only-severity critical,high

# Detailed scan with SARIF output
docker scout cves tmi/tmi-server:latest --format sarif --output security.sarif

# Get recommendations
docker scout recommendations tmi/tmi-postgresql:latest

CI/CD Security Scanning

Run automated security scans in CI/CD:

# Basic CI scan
./scripts/ci-security-scan.sh

# Custom thresholds
MAX_CRITICAL_CVES=0 MAX_HIGH_CVES=5 ./scripts/ci-security-scan.sh

# Scan specific images
IMAGES_TO_SCAN="tmi/tmi-server:latest redis:7" ./scripts/ci-security-scan.sh

Environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| MAX_CRITICAL_CVES | 0 | Maximum critical CVEs allowed |
| MAX_HIGH_CVES | 3 | Maximum high CVEs allowed |
| MAX_MEDIUM_CVES | 10 | Maximum medium CVEs allowed |
| FAIL_ON_CRITICAL | true | Fail build on critical CVEs |
| FAIL_ON_HIGH | false | Fail build on high CVEs |
| ARTIFACT_DIR | ./security-artifacts | Report output directory |

Security Reports

TMI generates multiple security report formats:

1. Summary Report (security-summary.md):

  • Vulnerability counts by severity
  • Pass/fail status by image
  • Remediation recommendations

2. Detailed Scan Results (security-scan-results.json):

  • Complete vulnerability details
  • CVSS scores and vectors
  • Affected packages and versions

3. SARIF Reports (security-results.sarif):

  • Standard security tool format
  • IDE and CI/CD integration
  • Machine-readable results
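
Because SARIF is plain JSON, a report can be summarized with a few lines of code. The sketch below counts results per SARIF level. It assumes the standard SARIF 2.1.0 layout (`runs[].results[].level`); the exact fields Docker Scout fills in are not confirmed here, and the function name `count_sarif_levels` is hypothetical.

```python
import json
from collections import Counter


def count_sarif_levels(sarif_text):
    """Count results per SARIF level ("error", "warning", "note", "none")."""
    document = json.loads(sarif_text)
    levels = Counter()
    for run in document.get("runs", []):
        for result in run.get("results", []):
            # A result without an explicit level is counted as "warning",
            # the SARIF default when no rule configuration overrides it.
            levels[result.get("level", "warning")] += 1
    return dict(levels)
```

To summarize a report on disk, read the file (for example `security-results.sarif`) and pass its text to the function.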

Security Thresholds

Configure acceptable vulnerability levels:

# Strict policy (production)
MAX_CRITICAL_CVES=0 MAX_HIGH_CVES=0 make containers-security-scan

# Lenient policy (development)
MAX_HIGH_CVES=10 FAIL_ON_HIGH=false make containers-security-scan

Default thresholds:

  • Critical CVEs: 0 (build fails)
  • High CVEs: 3 (warning)
  • Medium CVEs: 10 (informational)
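
How the count limits and the `FAIL_ON_*` flags interact can be modeled as follows. This is an illustrative sketch, not the logic of `ci-security-scan.sh`. It assumes that exceeding a limit fails the build only when the matching `FAIL_ON_*` flag is true, that an exceeded limit is otherwise a warning, and that medium findings never fail the build. The function name `evaluate_scan` is hypothetical.

```python
import os


def _flag(env, name, default):
    """Read a boolean environment variable such as FAIL_ON_HIGH."""
    return env.get(name, default).strip().lower() == "true"


def evaluate_scan(counts, env=None):
    """Return (passed, messages) for counts like {"critical": 0, "high": 4}.

    Assumed semantics (not taken from ci-security-scan.sh): a severity fails
    the build only if its count exceeds MAX_<SEV>_CVES and FAIL_ON_<SEV> is
    true; otherwise an exceeded limit is reported as a warning.
    """
    env = os.environ if env is None else env
    limits = {
        "critical": (int(env.get("MAX_CRITICAL_CVES", "0")), _flag(env, "FAIL_ON_CRITICAL", "true")),
        "high": (int(env.get("MAX_HIGH_CVES", "3")), _flag(env, "FAIL_ON_HIGH", "false")),
        "medium": (int(env.get("MAX_MEDIUM_CVES", "10")), False),
    }
    passed = True
    messages = []
    for severity, (limit, fatal) in limits.items():
        found = counts.get(severity, 0)
        if found > limit:
            kind = "FAIL" if fatal else "WARN"
            messages.append(f"{kind}: {found} {severity} CVEs (limit {limit})")
            passed = passed and not fatal
    return passed, messages
```

With the defaults, four high CVEs produce a warning but the build still passes, while a single critical CVE fails it.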

Runtime Container Security

Container Hardening

Run containers with security restrictions:

# Docker run with security options
docker run -d \
  --name tmi-server \
  --read-only \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --security-opt no-new-privileges \
  --user 1000:1000 \
  tmi/tmi-server:latest

Kubernetes Security Context

securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
    add:
      - NET_BIND_SERVICE

Security Best Practices

Regular updates:

  • Weekly: Update base images and rebuild
  • Monthly: Review security trends
  • Quarterly: Audit security policies

Layered security:

  • Image security: Patched base images
  • Runtime security: Resource limits, monitoring
  • Network security: Segmentation, firewalls
  • Access control: Least-privilege principles

TLS/SSL Management

Certificate Management

Generating Self-Signed Certificates

For development and testing:

# Generate self-signed certificate
openssl req -x509 -newkey rsa:4096 -nodes \
  -keyout server.key -out server.crt -days 365 \
  -subj "/CN=tmi.example.com"

# Generate with SAN (Subject Alternative Names); -addext requires OpenSSL 1.1.1 or later
openssl req -x509 -newkey rsa:4096 -nodes \
  -keyout server.key -out server.crt -days 365 \
  -subj "/CN=tmi.example.com" \
  -addext "subjectAltName=DNS:tmi.example.com,DNS:*.tmi.example.com"

# Set permissions
chmod 600 server.key
chmod 644 server.crt

Using Let's Encrypt

For production deployments:

# Install certbot
sudo apt-get update
sudo apt-get install certbot

# Obtain certificate (standalone mode)
sudo certbot certonly --standalone -d tmi.example.com -d api.tmi.example.com

# Certificates installed at:
# /etc/letsencrypt/live/tmi.example.com/fullchain.pem
# /etc/letsencrypt/live/tmi.example.com/privkey.pem

# Verify auto-renewal (the certbot package normally installs a systemd timer or cron job)
sudo certbot renew --dry-run

Certificate Installation

For TMI Server:

# Copy certificates to secure location
sudo mkdir -p /etc/tmi/certs
sudo cp server.crt /etc/tmi/certs/
sudo cp server.key /etc/tmi/certs/
sudo chown tmi:tmi /etc/tmi/certs/*
sudo chmod 600 /etc/tmi/certs/server.key
sudo chmod 644 /etc/tmi/certs/server.crt

Configure TMI:

# config-production.yml
server:
  tls_enabled: true
  tls_cert_file: "/etc/tmi/certs/server.crt"
  tls_key_file: "/etc/tmi/certs/server.key"
  tls_subject_name: "tmi.example.com"
  http_to_https_redirect: true

Or via environment:

SERVER_TLS_ENABLED=true
SERVER_TLS_CERT_FILE=/etc/tmi/certs/server.crt
SERVER_TLS_KEY_FILE=/etc/tmi/certs/server.key
SERVER_TLS_SUBJECT_NAME=tmi.example.com
SERVER_HTTP_TO_HTTPS_REDIRECT=true

Certificate Verification

# Verify certificate details
openssl x509 -in /etc/tmi/certs/server.crt -text -noout

# Check certificate expiration
openssl x509 -in /etc/tmi/certs/server.crt -noout -dates

# Test TLS connection
openssl s_client -connect tmi.example.com:443 -servername tmi.example.com

# Verify certificate chain
openssl s_client -connect tmi.example.com:443 -showcerts

# Check certificate expiration (days remaining; uses GNU date -d)
echo $(( ($(date -d "$(openssl x509 -enddate -noout -in /etc/tmi/certs/server.crt | cut -d= -f2)" +%s) - $(date +%s)) / 86400 ))

Certificate Renewal

Automated Let's Encrypt Renewal

Certbot renews Let's Encrypt certificates automatically through a systemd timer:

# Check renewal timer status
systemctl status certbot.timer

# Test renewal
sudo certbot renew --dry-run

# Force renewal
sudo certbot renew --force-renewal

# Post-renewal hook (restart TMI); writing under /etc/letsencrypt requires root
sudo tee /etc/letsencrypt/renewal-hooks/post/restart-tmi.sh > /dev/null <<'EOF'
#!/bin/bash
systemctl restart tmi
EOF
sudo chmod +x /etc/letsencrypt/renewal-hooks/post/restart-tmi.sh

Manual Certificate Renewal

For self-signed certificates (for a purchased certificate, skip the generation step and use the files issued by your CA):

# Run these commands as root (or prefix each with sudo)

# Generate new certificate
openssl req -x509 -newkey rsa:4096 -nodes \
  -keyout /etc/tmi/certs/server.key.new \
  -out /etc/tmi/certs/server.crt.new \
  -days 365 \
  -subj "/CN=tmi.example.com"

# Backup old certificates
cp -p /etc/tmi/certs/server.key /etc/tmi/certs/server.key.old
cp -p /etc/tmi/certs/server.crt /etc/tmi/certs/server.crt.old

# Install new certificates
mv /etc/tmi/certs/server.key.new /etc/tmi/certs/server.key
mv /etc/tmi/certs/server.crt.new /etc/tmi/certs/server.crt

# Set ownership and permissions (files created by root are otherwise unreadable by the tmi user)
chown tmi:tmi /etc/tmi/certs/server.key /etc/tmi/certs/server.crt
chmod 600 /etc/tmi/certs/server.key
chmod 644 /etc/tmi/certs/server.crt

# Restart TMI server
systemctl restart tmi

TLS Configuration Best Practices

Strong cipher suites:

# Nginx configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';
ssl_prefer_server_ciphers on;

HSTS (HTTP Strict Transport Security):

add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

Certificate monitoring:

  • Monitor expiration dates (alert 30 days before expiry)
  • Track certificate renewals
  • Validate certificate chain integrity
  • Test TLS configuration regularly
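
The 30-day expiry alert can be computed from the `notAfter` date that `openssl x509 -enddate -noout` prints. The sketch below uses only the Python standard library. The function names are hypothetical, and the input is assumed to be the part after `notAfter=`, for example `Nov 12 10:30:00 2026 GMT`.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Whole days (rounded down) until an OpenSSL-formatted expiry date."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


def needs_renewal_alert(not_after, now=None, threshold_days=30):
    """True when the certificate expires within threshold_days (or already has)."""
    return days_until_expiry(not_after, now) <= threshold_days
```

A monitoring job could call `needs_renewal_alert` once a day and notify the team when it returns true.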

Secrets Management

Environment Variables

Store secrets in environment variables, never in code.

Development (.env.dev):

JWT_SECRET=development-secret-do-not-use-in-production
POSTGRES_PASSWORD=dev_password
REDIS_PASSWORD=dev_redis_pass
OAUTH_PROVIDERS_GOOGLE_CLIENT_SECRET=your-dev-secret

Production (system environment):

# Set via systemd service
Environment=JWT_SECRET=<strong-production-secret>
Environment=POSTGRES_PASSWORD=<db-password>
Environment=REDIS_PASSWORD=<redis-password>
Environment=OAUTH_PROVIDERS_GOOGLE_CLIENT_SECRET=<oauth-secret>

Generating Strong Secrets

# Generate JWT secret (32 bytes)
openssl rand -base64 32

# Generate strong password
openssl rand -base64 24

# Generate UUID
uuidgen

# Generate hex secret
openssl rand -hex 32
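
Where OpenSSL is not available, the same kinds of secrets can be generated with Python's standard `secrets` module, which draws from the operating system's cryptographic random source. The helper names below are hypothetical and mirror the commands above.

```python
import base64
import secrets
import uuid


def new_base64_secret(num_bytes=32):
    """Random bytes as base64 text, comparable to `openssl rand -base64 <n>`."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def new_hex_secret(num_bytes=32):
    """Random bytes as hex text, comparable to `openssl rand -hex <n>`."""
    return secrets.token_hex(num_bytes)


def new_uuid():
    """Random (version 4) UUID as a string."""
    return str(uuid.uuid4())
```

Avoid the `random` module for this purpose; it is not designed for security-sensitive values.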

Secrets Storage Solutions

HashiCorp Vault

# Store secret in Vault
vault kv put secret/tmi/production \
  jwt_secret="<secret>" \
  postgres_password="<password>" \
  redis_password="<password>"

# Retrieve secret
vault kv get -field=jwt_secret secret/tmi/production

# Configure TMI to use Vault
export VAULT_ADDR='https://vault.example.com'
export VAULT_TOKEN='<token>'

Kubernetes Secrets

apiVersion: v1
kind: Secret
metadata:
  name: tmi-secrets
  namespace: tmi
type: Opaque
data:
  jwt-secret: <base64-encoded-secret>
  postgres-password: <base64-encoded-password>
  redis-password: <base64-encoded-password>
---
# Reference in deployment
env:
  - name: JWT_SECRET
    valueFrom:
      secretKeyRef:
        name: tmi-secrets
        key: jwt-secret
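
Values under `data` in a Secret manifest must be base64-encoded. A minimal sketch of the encoding step follows, with hypothetical helper names. Base64 is an encoding, not encryption, so the manifest itself still has to be protected.

```python
import base64


def encode_secret_value(value):
    """Encode a plain-text secret for the `data` field of a Kubernetes Secret."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Reverse of encode_secret_value, useful when inspecting an existing Secret."""
    return base64.b64decode(encoded).decode("utf-8")
```

Encode the exact secret string without a trailing newline; a stray newline becomes part of the value the application receives.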

AWS Secrets Manager

# Create secret
aws secretsmanager create-secret \
  --name tmi/production/jwt-secret \
  --secret-string "<your-secret>"

# Retrieve secret
aws secretsmanager get-secret-value \
  --secret-id tmi/production/jwt-secret \
  --query SecretString --output text

# Rotate secret (requires rotation to be configured for the secret, e.g. a rotation Lambda function)
aws secretsmanager rotate-secret \
  --secret-id tmi/production/jwt-secret

Secret Rotation

Rotate secrets regularly:

JWT Secret rotation:

# Generate new secret
NEW_SECRET=$(openssl rand -base64 32)

# Update in environment/secrets manager (quoted so the shell passes the value intact)
heroku config:set "JWT_SECRET=$NEW_SECRET" --app tmi-production

# Or for systemd service
sudo systemctl edit tmi
# Add: Environment=JWT_SECRET=<new-secret>
sudo systemctl restart tmi

Database password rotation:

-- Update user password
ALTER USER tmi_user WITH PASSWORD 'new_secure_password';

-- Update TMI configuration
-- Restart TMI with new password

Security Best Practices

  • Never commit secrets to git
  • Use different secrets per environment
  • Rotate secrets regularly (quarterly minimum)
  • Use secrets management systems (Vault, Secrets Manager)
  • Limit secret access (principle of least privilege)
  • Audit secret access (who accessed what, when)
  • Encrypt secrets at rest

Security Monitoring

Authentication Monitoring

Monitor authentication events:

# View authentication logs
tail -f /var/log/tmi/tmi.log | grep -E "authentication|authorization"

# Count failed login attempts
grep -c "authentication failed" /var/log/tmi/tmi.log

# Identify suspicious activity (multiple failures from same IP)
grep "authentication failed" /var/log/tmi/tmi.log | \
  grep -oP 'ip=\K[0-9.]+' | sort | uniq -c | sort -rn

# Last 10 successful logins
grep "authentication successful" /var/log/tmi/tmi.log | tail -10

Security Event Logging

TMI logs security-relevant events:

{
  "timestamp": "2025-11-12T10:30:00Z",
  "level": "warn",
  "event": "authentication_failed",
  "user_email": "user@example.com",
  "provider": "google",
  "ip_address": "192.168.1.100",
  "reason": "invalid_token"
}

Key security events:

  • Authentication failures
  • Authorization denials
  • Token validation failures
  • Suspicious API usage patterns
  • Rate limit violations
  • Database access errors
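
When events are written one JSON object per line, they can be aggregated without fragile text matching. The sketch below counts failed authentications per source IP. It assumes the field names from the sample event above (`event`, `ip_address`) and one event per line; the function name is hypothetical.

```python
import json
from collections import Counter


def failed_logins_by_ip(lines):
    """Count authentication_failed events per ip_address, most frequent first."""
    counts = Counter()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if isinstance(event, dict) and event.get("event") == "authentication_failed":
            counts[event.get("ip_address", "unknown")] += 1
    return counts.most_common()
```

An open log file can be passed directly, since iterating over a file yields its lines.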

Security Alerts

Configure alerts for security events:

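# Alert on multiple failed logins (a crontab entry must be a single line; cron does not support backslash continuation)
# Note: this counts failures among the last 100 matching log lines, not within a time window
*/5 * * * * [ "$(grep 'authentication failed' /var/log/tmi/tmi.log | tail -100 | wc -l)" -gt 10 ] && echo "Multiple authentication failures detected" | mail -s "Security Alert" security@example.com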
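
The cron check above counts matching lines regardless of their age, so it keeps firing once enough failures have accumulated in the log. A time-windowed count is closer to the intent. The sketch below assumes JSON log lines with the `timestamp` and `event` fields shown in the sample event; the function name is hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def recent_failures(lines, now, window=timedelta(minutes=5)):
    """Count authentication_failed events stamped within `window` before `now`."""
    count = 0
    for line in lines:
        try:
            event = json.loads(line)
            # Timestamps in the sample event look like 2025-11-12T10:30:00Z.
            stamp = datetime.strptime(event["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
            stamp = stamp.replace(tzinfo=timezone.utc)
        except (ValueError, KeyError, TypeError):
            continue  # not JSON, not an object, or no parseable timestamp
        if event.get("event") == "authentication_failed" and now - window <= stamp <= now:
            count += 1
    return count
```

A scheduled job could compare the result against a threshold such as 10 and send the alert only when it is exceeded.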

Prometheus alert example:

- alert: AuthenticationFailureSpike
  expr: rate(authentication_failures_total[5m]) > 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High authentication failure rate"
    description: "More than 10 auth failures per second"

Access Control Monitoring

Track who accesses what:

# API access by user
grep "api_request" /var/log/tmi/tmi.log | \
  jq -r '.user_email' | sort | uniq -c | sort -rn

# Sensitive operations (delete, update permissions)
grep -E "DELETE|permission.*update" /var/log/tmi/tmi.log

# Failed authorization attempts
grep "authorization denied" /var/log/tmi/tmi.log

Incident Response

Incident Response Plan

1. Detection: Security event detected via monitoring/alerts

2. Triage: Assess severity and impact

  • Critical: Data breach, service compromise
  • High: Authentication bypass, privilege escalation
  • Medium: Configuration issue, policy violation
  • Low: Informational event

3. Containment: Limit damage

  • Isolate affected systems
  • Revoke compromised credentials
  • Block suspicious IP addresses
  • Disable vulnerable features

4. Investigation: Determine root cause

  • Review logs and audit trails
  • Identify attack vectors
  • Assess data exposure
  • Document timeline

5. Remediation: Fix the issue

  • Apply security patches
  • Update configurations
  • Rotate secrets/credentials
  • Strengthen access controls

6. Recovery: Restore normal operations

  • Verify system integrity
  • Restore from clean backups if needed
  • Re-enable services
  • Monitor for recurrence

7. Post-Incident: Learn and improve

  • Conduct post-mortem
  • Update runbooks
  • Implement preventive measures
  • Train staff

Common Security Incidents

Compromised Credentials

Response:

# 1. Revoke affected credentials
# For JWT tokens - add to blacklist in Redis
redis-cli -h redis-host -a password SET "blacklist:token:$TOKEN_ID" "1" EX 86400

# 2. Force password reset
psql -h postgres-host -U tmi_user -d tmi -c "
  UPDATE users SET password_reset_required = true
  WHERE email = 'user@example.com'"

# 3. Generate new secrets
NEW_JWT_SECRET=$(openssl rand -base64 32)

# 4. Update configuration
heroku config:set "JWT_SECRET=$NEW_JWT_SECRET" --app tmi-production

# 5. Notify affected users
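
The Redis entry in step 1 works as a denylist with a time to live: a token counts as revoked while its key exists, and the key disappears once the TTL passes. The sketch below models that behavior in memory to show the check a server would perform. It is not TMI's implementation; the class name is hypothetical and only the key prefix `blacklist:token:` comes from the command above.

```python
import time


class TokenDenylist:
    """In-memory stand-in for Redis `SET key 1 EX ttl` and `EXISTS key`."""

    def __init__(self, clock=time.time):
        self._expiry = {}
        self._clock = clock

    def revoke(self, token_id, ttl_seconds=86400):
        """Mark a token as revoked for ttl_seconds."""
        self._expiry[f"blacklist:token:{token_id}"] = self._clock() + ttl_seconds

    def is_revoked(self, token_id):
        """True while the denylist entry exists and has not expired."""
        key = f"blacklist:token:{token_id}"
        expires_at = self._expiry.get(key)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            del self._expiry[key]  # entry has expired, as Redis would drop it
            return False
        return True
```

The TTL should be at least as long as the remaining lifetime of the token, otherwise a revoked token becomes valid again when the entry expires.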

Container Vulnerability Detected

Response:

# 1. Scan all images
make containers-security-scan

# 2. Build patched containers
make containers-secure-build

# 3. Test in staging
docker-compose -f docker-compose.staging.yml up -d

# 4. Deploy to production
docker-compose -f docker-compose.prod.yml up -d

# 5. Verify fix
docker scout cves tmi/tmi-server:latest

Unauthorized Access Attempt

Response:

# 1. Identify source
grep "authorization denied" /var/log/tmi/tmi.log | tail -20

# 2. Block IP if necessary
# Via firewall
sudo ufw deny from <ip-address>

# Via nginx: add the following line to /etc/nginx/conf.d/blacklist.conf
#   deny <ip-address>;
# then validate the configuration and reload
sudo nginx -t && sudo nginx -s reload

# 3. Review access logs
grep "<ip-address>" /var/log/nginx/access.log

# 4. Check for compromised accounts
grep "user_email.*<suspect-email>" /var/log/tmi/tmi.log

Security Contacts

Establish clear communication channels:

Security Auditing

Regular Security Audits

Weekly:

  • Review authentication logs
  • Check failed login attempts
  • Monitor container vulnerability reports
  • Review access patterns

Monthly:

  • Full security scan (containers, dependencies)
  • Certificate expiration check
  • Access control review
  • Security event trends analysis

Quarterly:

  • Security policy review
  • Penetration testing
  • Disaster recovery testing
  • Security training updates

Compliance and Logging

Maintain audit logs for compliance:

# Enable comprehensive logging
LOGGING_LOG_API_REQUESTS=true
# Keep response logging off; response bodies may expose sensitive data
LOGGING_LOG_API_RESPONSES=false
# Always redact auth tokens
LOGGING_REDACT_AUTH_TOKENS=true

# Archive logs for retention
tar -czf /archive/tmi-logs-$(date +%Y%m).tar.gz /var/log/tmi/

Retention requirements:

  • Authentication logs: 90 days minimum
  • Access logs: 90 days minimum
  • Security events: 1 year minimum
  • Audit trails: Per compliance requirements
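
Retention periods only help if old archives are actually pruned. The sketch below identifies monthly archives, named as in the `tar` command above (`tmi-logs-YYYYMM.tar.gz`), whose entire month lies outside a retention period. The function name and the pruning policy are assumptions for illustration; apply the longest retention period that covers the archived logs.

```python
import re
from datetime import date, timedelta

ARCHIVE_NAME = re.compile(r"^tmi-logs-(\d{4})(\d{2})\.tar\.gz$")


def expired_archives(names, today, retention_days):
    """Return archive names whose newest possible entry is past retention."""
    expired = []
    for name in names:
        match = ARCHIVE_NAME.match(name)
        if not match:
            continue  # not a monthly TMI log archive
        year, month = int(match.group(1)), int(match.group(2))
        try:
            # The first day of the following month ends the archive's coverage.
            next_month = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        except ValueError:
            continue  # impossible year or month in the file name
        if next_month + timedelta(days=retention_days) <= today:
            expired.append(name)
    return expired
```

The function only reports candidates; deleting them is left to the caller so the list can be reviewed first.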
