From 5fe0f5c54bcb6d5fd5986c776fee15cfdd810fc4 Mon Sep 17 00:00:00 2001
From: Akinyemi Arabambi
Date: Wed, 12 Nov 2025 14:57:00 +0000
Subject: [PATCH 1/3] Add comprehensive documentation for alerts, database schema, deployment, detection rules, and user management

- Introduced ALERTS.md detailing alert types, notification channels, alert rules, management, tuning, integration patterns, troubleshooting, and API reference.
- Added DATABASE.md outlining the database schema, core tables, security tables, indexes, data relationships, retention policies, security features, backup strategies, monitoring, maintenance, and troubleshooting.
- Created DEPLOYMENT.md covering production architecture, Docker setup, Kubernetes deployment, security hardening, monitoring, backup strategies, performance optimization, maintenance procedures, troubleshooting, and a security checklist.
- Expanded DETECTION_RULES.md with detailed rule types, configuration, management, testing, risk scoring, best practices, troubleshooting, and API reference.
- Updated USER_MANAGEMENT.md to include user roles, management procedures, security best practices, API access, troubleshooting, and audit compliance.
---
 docs/ALERTS.md          | 313 +++++++++++++++++++++++
 docs/DATABASE.md        | 391 ++++++++++++++++++++++++++++
 docs/DEPLOYMENT.md      | 553 ++++++++++++++++++++++++++++++++++++++++
 docs/DETECTION_RULES.md | 232 +++++++++++++++++
 docs/USER_MANAGEMENT.md | 134 ++++++++++
 5 files changed, 1623 insertions(+)
 create mode 100644 docs/ALERTS.md
 create mode 100644 docs/DATABASE.md
 create mode 100644 docs/DEPLOYMENT.md
 create mode 100644 docs/DETECTION_RULES.md
 create mode 100644 docs/USER_MANAGEMENT.md

diff --git a/docs/ALERTS.md b/docs/ALERTS.md
new file mode 100644
index 0000000..013656f
--- /dev/null
+++ b/docs/ALERTS.md
@@ -0,0 +1,313 @@
+# Alert Setup
+
+FlagWise provides real-time alerting capabilities to notify security teams when risky LLM activity is detected. Configure alerts to integrate with your existing incident response workflows.
+
+## Alert Types
+
+### Detection Rule Alerts
+Triggered when specific detection rules are matched:
+- High-risk prompts containing sensitive data
+- Unauthorized model usage
+- Suspicious activity patterns
+- Policy violations
+
+### Threshold Alerts
+Triggered when metrics exceed defined limits:
+- Request volume spikes
+- High average risk scores
+- Unusual geographic activity
+- Token consumption limits
+
+### System Alerts
+Triggered by system events:
+- Service health issues
+- Database connectivity problems
+- Processing errors
+- Configuration changes
+
+## Notification Channels
+
+### Slack Integration
+Send alerts to Slack channels for team collaboration.
+
+#### Setup
+1. Create a Slack webhook URL in your workspace
+2. Navigate to **Settings → Alerts**
+3. Click **Add Notification Channel**
+4. Select **Slack** and enter webhook URL
+5. Test the connection
+
+#### Configuration
+```json
+{
+  "channel_type": "slack",
+  "webhook_url": "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
+  "channel": "#security-alerts",
+  "username": "FlagWise",
+  "icon_emoji": ":warning:"
+}
+```
+
+### Email Notifications
+*Coming Soon* - Email integration for alert delivery.
+
+### Webhook Integration
+Send alerts to custom endpoints for integration with SIEM systems.
+ +```json +{ + "channel_type": "webhook", + "url": "https://your-siem.company.com/api/alerts", + "method": "POST", + "headers": { + "Authorization": "Bearer YOUR_API_TOKEN", + "Content-Type": "application/json" + } +} +``` + +## Alert Rules + +### Creating Alert Rules + +1. Navigate to **Settings → Alerts** +2. Click **Add Alert Rule** +3. Configure rule settings: + - **Name**: Descriptive identifier + - **Type**: Detection rule or threshold-based + - **Severity**: Critical, high, medium, or low + - **Conditions**: Trigger criteria + - **Notifications**: Which channels to notify + +### Detection Rule Alerts + +Link alerts to specific detection rules: + +```json +{ + "name": "Critical Data Exposure", + "rule_type": "detection_rule", + "severity": "critical", + "detection_rule_ids": ["rule-uuid-1", "rule-uuid-2"], + "notifications": { + "slack": { + "enabled": true, + "channel": "#security-critical" + } + } +} +``` + +### Threshold Alerts + +Monitor system metrics and activity levels: + +```json +{ + "name": "High Risk Activity", + "rule_type": "threshold", + "severity": "high", + "threshold_config": { + "metric": "avg_risk_score", + "operator": "greater_than", + "value": 70, + "time_window": "5m", + "min_requests": 10 + } +} +``` + +## Alert Management + +### Alert Dashboard +View and manage active alerts: +- **New**: Recently triggered alerts requiring attention +- **Acknowledged**: Alerts being investigated +- **Resolved**: Completed investigations + +### Alert Actions +- **Acknowledge**: Mark alert as being investigated +- **Resolve**: Close alert after addressing the issue +- **Archive**: Remove from active view (keeps history) +- **Bulk Operations**: Handle multiple alerts simultaneously + +### Alert Details +Each alert includes: +- Trigger timestamp and conditions +- Related LLM requests and sessions +- Risk assessment and context +- Investigation notes and actions taken +- Resolution status and timeline + +## Configuration Examples + +### High-Risk Prompt Alert +```json +{ + "name": "Sensitive Data Detected", + "rule_type": "detection_rule", + "severity": "critical", + "detection_rule_ids": ["credit-card-rule", "ssn-rule"], + "notifications": { + "slack": { + "enabled": true, + "channel": "#security-alerts", + "mention_users": ["@security-team"] + } + } +} +``` + +### Volume Spike Alert +```json +{ + "name": "Request Volume Spike", + "rule_type": "threshold", + "severity": "medium", + "threshold_config": { + "metric": "request_count", + "operator": "greater_than", + "value": 1000, + "time_window": "1h", + "comparison": "previous_hour" + } +} +``` + +### Model Restriction Alert +```json +{ + "name": "Unauthorized Model Usage", + "rule_type": "detection_rule", + "severity": "high", + "detection_rule_ids": ["model-restriction-rule"], + "notifications": { + "webhook": { + "enabled": true, + "url": "https://siem.company.com/api/alerts" + } + } +} +``` + +## Alert Tuning + +### Reducing False Positives +- Adjust detection rule sensitivity +- Implement alert suppression rules +- Use time-based conditions +- Add context-aware filtering + +### Alert Fatigue Prevention +- Prioritize critical alerts +- Group related alerts together +- Implement escalation policies +- Regular review and tuning + +### Performance Optimization +- Limit alert frequency per rule +- Use efficient threshold calculations +- Batch notifications when possible +- Monitor alert processing latency + +## Integration Patterns + +### SIEM Integration +Forward alerts to Security Information and Event Management systems: + 
+```bash +# Example webhook payload +{ + "timestamp": "2024-01-15T10:30:00Z", + "severity": "critical", + "source": "flagwise", + "event_type": "llm_security_alert", + "details": { + "rule_name": "Credit Card Detection", + "risk_score": 75, + "src_ip": "192.168.1.100", + "model": "gpt-4", + "request_id": "req-uuid-123" + } +} +``` + +### Incident Response Integration +Connect with ticketing systems: +- Automatically create tickets for critical alerts +- Include relevant context and investigation links +- Update ticket status based on alert resolution + +### Monitoring Integration +Send metrics to monitoring platforms: +- Alert frequency and response times +- False positive rates +- System health indicators +- User activity patterns + +## Troubleshooting + +### Alerts Not Triggering +- Verify alert rule is active +- Check detection rule configuration +- Confirm notification channels are working +- Review alert processing logs + +### Missing Notifications +- Test webhook/Slack connectivity +- Verify authentication credentials +- Check rate limiting and quotas +- Review notification channel settings + +### Performance Issues +- Monitor alert processing latency +- Optimize threshold calculations +- Reduce notification frequency +- Check system resource usage + +## API Reference + +### List Alerts +```bash +GET /api/alerts?status=new&severity=critical +``` + +### Create Alert Rule +```bash +POST /api/alert-rules +Content-Type: application/json + +{ + "name": "New Alert Rule", + "rule_type": "threshold", + "severity": "high", + "threshold_config": { + "metric": "risk_score", + "operator": "greater_than", + "value": 80 + } +} +``` + +### Acknowledge Alert +```bash +PUT /api/alerts/{alert_id} +Content-Type: application/json + +{ + "status": "acknowledged", + "acknowledged_by": "security_analyst" +} +``` + +### Test Notification Channel +```bash +POST /api/notifications/test +Content-Type: application/json + +{ + "channel_type": "slack", + "webhook_url": "https://hooks.slack.com/...", + "test_message": "FlagWise alert test" +} +``` \ No newline at end of file diff --git a/docs/DATABASE.md b/docs/DATABASE.md new file mode 100644 index 0000000..c9b42eb --- /dev/null +++ b/docs/DATABASE.md @@ -0,0 +1,391 @@ +# Database Schema + +FlagWise uses PostgreSQL 15+ for data persistence and analytics. This document describes the database structure, relationships, and optimization strategies. + +## Core Tables + +### llm_requests +Primary table storing all intercepted LLM API calls. + +```sql +CREATE TABLE llm_requests ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(), + src_ip TEXT NOT NULL, + provider TEXT NOT NULL, + model TEXT NOT NULL, + endpoint TEXT, + method TEXT DEFAULT 'POST', + prompt TEXT NOT NULL, + response TEXT, + headers JSONB, + tokens_prompt INTEGER, + tokens_response INTEGER, + tokens_total INTEGER, + duration_ms INTEGER, + status_code INTEGER, + risk_score INTEGER DEFAULT 0, + is_flagged BOOLEAN DEFAULT FALSE, + flag_reason TEXT, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); +``` + +**Key Fields:** +- `id`: UUID primary key for security +- `timestamp`: When the request occurred +- `src_ip`: Internal IP address of requestor +- `provider`: openai, anthropic, etc. +- `model`: gpt-4, claude-3, etc. 
+- `prompt`: User prompt (encrypted in production) +- `response`: LLM response (encrypted in production) +- `risk_score`: 0-100 calculated risk score +- `is_flagged`: Whether request triggered detection rules +- `flag_reason`: Comma-separated list of triggered rule names + +### detection_rules +Configurable rules for flagging risky requests. + +```sql +CREATE TABLE detection_rules ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + name TEXT NOT NULL UNIQUE, + description TEXT, + category TEXT NOT NULL CHECK (category IN ('data_privacy', 'security', 'compliance')), + rule_type TEXT NOT NULL CHECK (rule_type IN ('keyword', 'regex', 'model_restriction', 'custom_scoring')), + pattern TEXT NOT NULL, + severity TEXT NOT NULL CHECK (severity IN ('critical', 'high', 'medium', 'low')), + points INTEGER NOT NULL CHECK (points >= 0 AND points <= 100), + priority INTEGER DEFAULT 0, + stop_on_match BOOLEAN DEFAULT FALSE, + combination_logic TEXT DEFAULT 'AND' CHECK (combination_logic IN ('AND', 'OR')), + is_active BOOLEAN DEFAULT TRUE, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); +``` + +### alerts +Tracks notifications sent for flagged requests. + +```sql +CREATE TABLE alerts ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + title TEXT NOT NULL, + description TEXT, + severity TEXT NOT NULL CHECK (severity IN ('critical', 'high', 'medium', 'low')), + alert_type TEXT NOT NULL, + status TEXT DEFAULT 'new' CHECK (status IN ('new', 'acknowledged', 'resolved')), + source_type TEXT NOT NULL CHECK (source_type IN ('detection_rule', 'threshold', 'system')), + source_id UUID, + related_request_id UUID REFERENCES llm_requests(id), + metadata JSONB, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW(), + acknowledged_at TIMESTAMPTZ, + resolved_at TIMESTAMPTZ, + acknowledged_by TEXT, + resolved_by TEXT +); +``` + +### user_sessions +Groups requests by IP address and time windows for session analysis. + +```sql +CREATE TABLE user_sessions ( + id TEXT PRIMARY KEY, -- Generated from src_ip + start_time + src_ip TEXT NOT NULL, + start_time TIMESTAMPTZ NOT NULL, + end_time TIMESTAMPTZ NOT NULL, + duration_minutes INTEGER NOT NULL, + request_count INTEGER NOT NULL, + avg_risk_score FLOAT NOT NULL, + flagged_count INTEGER DEFAULT 0, + geographic_info TEXT, + user_agent TEXT, + top_providers TEXT[], + top_models TEXT[], + risk_level TEXT CHECK (risk_level IN ('critical', 'high', 'medium', 'low')), + unusual_patterns TEXT[], + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); +``` + +## Security Tables + +### users +User accounts and authentication. + +```sql +CREATE TABLE users ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + username TEXT NOT NULL UNIQUE, + hashed_password TEXT NOT NULL, + first_name TEXT DEFAULT '', + last_name TEXT DEFAULT '', + role TEXT NOT NULL CHECK (role IN ('admin', 'read_only')), + is_active BOOLEAN DEFAULT TRUE, + last_login TIMESTAMPTZ, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); +``` + +### alert_rules +Configuration for automated alert generation. 
+ +```sql +CREATE TABLE alert_rules ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + name TEXT NOT NULL UNIQUE, + description TEXT, + rule_type TEXT NOT NULL CHECK (rule_type IN ('threshold', 'detection_rule')), + is_active BOOLEAN DEFAULT TRUE, + severity TEXT NOT NULL CHECK (severity IN ('critical', 'high', 'medium', 'low')), + threshold_config JSONB, + detection_rule_ids UUID[], + notifications JSONB, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); +``` + +### system_settings +Application configuration and feature flags. + +```sql +CREATE TABLE system_settings ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + key TEXT NOT NULL UNIQUE, + value TEXT NOT NULL, + description TEXT, + category TEXT NOT NULL, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW() +); +``` + +## Indexes and Performance + +### Primary Indexes +```sql +-- Time-based queries (dashboard views) +CREATE INDEX idx_llm_requests_timestamp ON llm_requests(timestamp DESC); +CREATE INDEX idx_llm_requests_created_at ON llm_requests(created_at DESC); + +-- IP-based filtering (user tracking) +CREATE INDEX idx_llm_requests_src_ip ON llm_requests(src_ip); +CREATE INDEX idx_user_sessions_src_ip ON user_sessions(src_ip); + +-- Provider/model filtering (analytics) +CREATE INDEX idx_llm_requests_provider ON llm_requests(provider); +CREATE INDEX idx_llm_requests_model ON llm_requests(model); +CREATE INDEX idx_llm_requests_provider_model ON llm_requests(provider, model); + +-- Flagged requests (security review) +CREATE INDEX idx_llm_requests_flagged ON llm_requests(is_flagged) WHERE is_flagged = TRUE; +CREATE INDEX idx_llm_requests_risk_score ON llm_requests(risk_score DESC); + +-- Alert management +CREATE INDEX idx_alerts_status ON alerts(status); +CREATE INDEX idx_alerts_severity ON alerts(severity); +CREATE INDEX idx_alerts_created_at ON alerts(created_at DESC); +``` + +### Composite Indexes +```sql +-- Dashboard queries +CREATE INDEX idx_llm_requests_timestamp_flagged ON llm_requests(timestamp DESC, is_flagged); +CREATE INDEX idx_llm_requests_timestamp_provider ON llm_requests(timestamp DESC, provider); + +-- Session analysis +CREATE INDEX idx_user_sessions_time_range ON user_sessions(start_time, end_time); +CREATE INDEX idx_user_sessions_risk_level ON user_sessions(risk_level); +``` + +### Partial Indexes +```sql +-- High-risk requests only +CREATE INDEX idx_llm_requests_high_risk ON llm_requests(timestamp DESC) +WHERE risk_score >= 70; + +-- Active rules only +CREATE INDEX idx_detection_rules_active ON detection_rules(priority DESC) +WHERE is_active = TRUE; +``` + +## Data Relationships + +### Entity Relationships +``` +llm_requests (1) ←→ (0..n) alerts +detection_rules (1) ←→ (0..n) alert_rules +users (1) ←→ (0..n) alerts (acknowledged_by, resolved_by) +user_sessions (1) ←→ (0..n) llm_requests (via src_ip + time range) +``` + +### Foreign Key Constraints +```sql +ALTER TABLE alerts ADD CONSTRAINT fk_alerts_request +FOREIGN KEY (related_request_id) REFERENCES llm_requests(id); + +ALTER TABLE alert_rules ADD CONSTRAINT fk_alert_rules_detection_rules +FOREIGN KEY (detection_rule_ids) REFERENCES detection_rules(id); +``` + +## Data Retention and Cleanup + +### Automatic Cleanup Function +```sql +CREATE OR REPLACE FUNCTION cleanup_old_data() +RETURNS void AS $$ +BEGIN + -- Remove requests older than 6 months + DELETE FROM llm_requests + WHERE created_at < NOW() - INTERVAL '6 months'; + + -- Remove resolved alerts older than 3 months + DELETE FROM alerts + WHERE 
status = 'resolved' + AND resolved_at < NOW() - INTERVAL '3 months'; + + -- Remove old sessions + DELETE FROM user_sessions + WHERE end_time < NOW() - INTERVAL '6 months'; +END; +$$ LANGUAGE plpgsql; +``` + +### Scheduled Cleanup +```sql +-- Run cleanup monthly +SELECT cron.schedule('cleanup-old-data', '0 2 1 * *', 'SELECT cleanup_old_data();'); +``` + +## Security Features + +### Row-Level Security (Future) +```sql +-- Enable RLS for sensitive tables +ALTER TABLE llm_requests ENABLE ROW LEVEL SECURITY; + +-- Admin users see all data +CREATE POLICY admin_all_access ON llm_requests +FOR ALL TO admin_role USING (true); + +-- Read-only users see limited data +CREATE POLICY readonly_limited_access ON llm_requests +FOR SELECT TO readonly_role +USING (created_at > NOW() - INTERVAL '30 days'); +``` + +### Column-Level Encryption (Future) +```sql +-- Encrypt sensitive columns +ALTER TABLE llm_requests +ADD COLUMN prompt_encrypted BYTEA, +ADD COLUMN response_encrypted BYTEA; +``` + +## Backup and Recovery + +### Backup Strategy +```bash +# Daily full backup +pg_dump -h localhost -U flagwise_user -d flagwise > backup_$(date +%Y%m%d).sql + +# Continuous WAL archiving +archive_mode = on +archive_command = 'cp %p /backup/wal/%f' +``` + +### Point-in-Time Recovery +```bash +# Restore to specific timestamp +pg_restore -h localhost -U flagwise_user -d flagwise_restored backup.sql +``` + +## Monitoring and Maintenance + +### Database Statistics +```sql +-- Table sizes and row counts +SELECT + schemaname, + tablename, + pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size, + pg_stat_get_tuples_returned(c.oid) as rows +FROM pg_tables t +JOIN pg_class c ON c.relname = t.tablename +WHERE schemaname = 'public' +ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC; +``` + +### Performance Monitoring +```sql +-- Slow queries +SELECT query, mean_time, calls, total_time +FROM pg_stat_statements +WHERE mean_time > 1000 +ORDER BY mean_time DESC; + +-- Index usage +SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch +FROM pg_stat_user_indexes +ORDER BY idx_scan DESC; +``` + +### Maintenance Tasks +```sql +-- Update table statistics +ANALYZE llm_requests; + +-- Rebuild indexes if needed +REINDEX INDEX idx_llm_requests_timestamp; + +-- Vacuum to reclaim space +VACUUM ANALYZE llm_requests; +``` + +## Migration Scripts + +### Schema Updates +```sql +-- Add new column with default value +ALTER TABLE llm_requests +ADD COLUMN new_field TEXT DEFAULT 'default_value'; + +-- Create new index concurrently +CREATE INDEX CONCURRENTLY idx_new_field ON llm_requests(new_field); +``` + +### Data Migration +```sql +-- Migrate existing data +UPDATE llm_requests +SET new_field = 'migrated_value' +WHERE new_field = 'default_value'; +``` + +## Troubleshooting + +### Common Issues +- **Slow queries**: Check index usage and query plans +- **Lock contention**: Monitor pg_locks table +- **Storage growth**: Implement data retention policies +- **Connection limits**: Tune max_connections setting + +### Diagnostic Queries +```sql +-- Current connections +SELECT * FROM pg_stat_activity WHERE state = 'active'; + +-- Lock information +SELECT * FROM pg_locks WHERE NOT granted; + +-- Database size +SELECT pg_size_pretty(pg_database_size('flagwise')); +``` \ No newline at end of file diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md new file mode 100644 index 0000000..eb55560 --- /dev/null +++ b/docs/DEPLOYMENT.md @@ -0,0 +1,553 @@ +# Deployment Guide + +This guide covers production 
deployment strategies for FlagWise, including security hardening, scaling considerations, and operational best practices. + +## Production Architecture + +### Recommended Architecture +``` +┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ +│ Load Balancer │ │ Web Frontend │ │ API Backend │ +│ (nginx/ALB) │────│ (React App) │────│ (FastAPI) │ +└─────────────────┘ └─────────────────┘ └─────────────────┘ + │ +┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ +│ Kafka Cluster │ │ PostgreSQL │ │ Redis Cache │ +│ (Data Source) │ │ (Primary DB) │ │ (Sessions) │ +└─────────────────┘ └─────────────────┘ └─────────────────┘ +``` + +### Component Scaling +- **Web Frontend**: Stateless, can run multiple instances +- **API Backend**: Horizontally scalable with load balancing +- **Database**: Single primary with read replicas +- **Kafka Consumer**: Can run multiple instances for throughput + +## Docker Production Setup + +### Production Docker Compose +Create `docker-compose.prod.yml`: + +```yaml +version: '3.8' + +services: + web: + build: + context: ./services/web + dockerfile: Dockerfile.prod + ports: + - "80:80" + - "443:443" + environment: + - NODE_ENV=production + - REACT_APP_API_URL=https://api.yourdomain.com + volumes: + - ./ssl:/etc/ssl/certs + depends_on: + - api + + api: + build: + context: ./services/api + dockerfile: Dockerfile.prod + ports: + - "8000:8000" + environment: + - ENVIRONMENT=production + - DATABASE_URL=postgresql://user:pass@db:5432/flagwise + - JWT_SECRET_KEY=${JWT_SECRET_KEY} + - ENCRYPTION_KEY=${ENCRYPTION_KEY} + depends_on: + - db + - redis + volumes: + - ./logs:/app/logs + + consumer: + build: + context: ./services/consumer + dockerfile: Dockerfile.prod + environment: + - ENVIRONMENT=production + - DATABASE_URL=postgresql://user:pass@db:5432/flagwise + - KAFKA_BOOTSTRAP_SERVERS=kafka:9092 + depends_on: + - db + - kafka + + db: + image: postgres:15-alpine + environment: + - POSTGRES_DB=flagwise + - POSTGRES_USER=${DB_USER} + - POSTGRES_PASSWORD=${DB_PASSWORD} + volumes: + - postgres_data:/var/lib/postgresql/data + - ./database/init.sql:/docker-entrypoint-initdb.d/init.sql + ports: + - "5432:5432" + + redis: + image: redis:7-alpine + command: redis-server --requirepass ${REDIS_PASSWORD} + volumes: + - redis_data:/data + + kafka: + image: confluentinc/cp-kafka:latest + environment: + KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181 + KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092 + KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 + depends_on: + - zookeeper + + zookeeper: + image: confluentinc/cp-zookeeper:latest + environment: + ZOOKEEPER_CLIENT_PORT: 2181 + ZOOKEEPER_TICK_TIME: 2000 + +volumes: + postgres_data: + redis_data: +``` + +### Environment Configuration +Create `.env.prod`: + +```bash +# Database +DB_USER=flagwise_prod +DB_PASSWORD=secure_random_password_here +DATABASE_URL=postgresql://flagwise_prod:secure_password@db:5432/flagwise + +# Security +JWT_SECRET_KEY=your_jwt_secret_key_here +ENCRYPTION_KEY=your_32_byte_encryption_key_here +REDIS_PASSWORD=secure_redis_password + +# Application +ENVIRONMENT=production +LOG_LEVEL=INFO +CORS_ORIGINS=https://yourdomain.com + +# Kafka +KAFKA_BOOTSTRAP_SERVERS=kafka:9092 +KAFKA_TOPIC=llm_requests + +# Monitoring +SENTRY_DSN=your_sentry_dsn_here +``` + +## Kubernetes Deployment + +### Namespace and ConfigMap +```yaml +apiVersion: v1 +kind: Namespace +metadata: + name: flagwise + +--- +apiVersion: v1 +kind: ConfigMap +metadata: + name: flagwise-config + namespace: flagwise +data: + DATABASE_URL: 
"postgresql://user:pass@postgres:5432/flagwise" + KAFKA_BOOTSTRAP_SERVERS: "kafka:9092" + ENVIRONMENT: "production" +``` + +### API Deployment +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: flagwise-api + namespace: flagwise +spec: + replicas: 3 + selector: + matchLabels: + app: flagwise-api + template: + metadata: + labels: + app: flagwise-api + spec: + containers: + - name: api + image: flagwise/api:latest + ports: + - containerPort: 8000 + envFrom: + - configMapRef: + name: flagwise-config + - secretRef: + name: flagwise-secrets + resources: + requests: + memory: "512Mi" + cpu: "250m" + limits: + memory: "1Gi" + cpu: "500m" + livenessProbe: + httpGet: + path: /health + port: 8000 + initialDelaySeconds: 30 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /health + port: 8000 + initialDelaySeconds: 5 + periodSeconds: 5 + +--- +apiVersion: v1 +kind: Service +metadata: + name: flagwise-api-service + namespace: flagwise +spec: + selector: + app: flagwise-api + ports: + - port: 8000 + targetPort: 8000 + type: ClusterIP +``` + +### Web Frontend Deployment +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: flagwise-web + namespace: flagwise +spec: + replicas: 2 + selector: + matchLabels: + app: flagwise-web + template: + metadata: + labels: + app: flagwise-web + spec: + containers: + - name: web + image: flagwise/web:latest + ports: + - containerPort: 80 + resources: + requests: + memory: "256Mi" + cpu: "100m" + limits: + memory: "512Mi" + cpu: "200m" + +--- +apiVersion: v1 +kind: Service +metadata: + name: flagwise-web-service + namespace: flagwise +spec: + selector: + app: flagwise-web + ports: + - port: 80 + targetPort: 80 + type: LoadBalancer +``` + +## Security Hardening + +### SSL/TLS Configuration +```nginx +server { + listen 443 ssl http2; + server_name yourdomain.com; + + ssl_certificate /etc/ssl/certs/yourdomain.crt; + ssl_certificate_key /etc/ssl/private/yourdomain.key; + ssl_protocols TLSv1.2 TLSv1.3; + ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512; + ssl_prefer_server_ciphers off; + + location / { + proxy_pass http://flagwise-web:80; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + } + + location /api/ { + proxy_pass http://flagwise-api:8000/; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + } +} +``` + +### Database Security +```sql +-- Create dedicated database user +CREATE USER flagwise_prod WITH PASSWORD 'secure_password'; +GRANT CONNECT ON DATABASE flagwise TO flagwise_prod; +GRANT USAGE ON SCHEMA public TO flagwise_prod; +GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO flagwise_prod; + +-- Enable SSL connections only +ALTER SYSTEM SET ssl = on; +ALTER SYSTEM SET ssl_cert_file = 'server.crt'; +ALTER SYSTEM SET ssl_key_file = 'server.key'; +``` + +### Network Security +```yaml +# Network policies for Kubernetes +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: flagwise-network-policy + namespace: flagwise +spec: + podSelector: {} + policyTypes: + - Ingress + - Egress + ingress: + - from: + - namespaceSelector: + matchLabels: + name: ingress-nginx + ports: + - protocol: TCP + port: 80 + - protocol: TCP + port: 8000 + egress: + - to: + - namespaceSelector: + matchLabels: + name: kube-system + ports: + - 
protocol: TCP
+      port: 53
+    - protocol: UDP
+      port: 53
+```
+
+## Monitoring and Observability
+
+### Health Checks
+```python
+# API health endpoint
+@app.get("/health")
+async def health_check():
+    return {
+        "status": "healthy",
+        "timestamp": datetime.utcnow(),
+        "version": "1.0.0",
+        "database_connected": await check_database_connection(),
+        "kafka_connected": await check_kafka_connection()
+    }
+```
+
+### Logging Configuration
+```yaml
+# Fluentd configuration for log aggregation
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: fluentd-config
+data:
+  fluent.conf: |
+    <source>
+      @type tail
+      path /var/log/containers/flagwise-*.log
+      pos_file /var/log/fluentd-containers.log.pos
+      tag kubernetes.*
+      format json
+    </source>
+
+    <match kubernetes.*>
+      @type elasticsearch
+      host elasticsearch.logging.svc.cluster.local
+      port 9200
+      index_name flagwise-logs
+    </match>
+```
+
+### Metrics Collection
+```python
+# Prometheus metrics
+from prometheus_client import Counter, Histogram, Gauge
+
+request_count = Counter('flagwise_requests_total', 'Total requests processed')
+request_duration = Histogram('flagwise_request_duration_seconds', 'Request duration')
+active_sessions = Gauge('flagwise_active_sessions', 'Number of active sessions')
+```
+
+## Backup and Disaster Recovery
+
+### Database Backup Strategy
+```bash
+#!/bin/bash
+# Automated backup script
+
+BACKUP_DIR="/backups/flagwise"
+DATE=$(date +%Y%m%d_%H%M%S)
+DB_NAME="flagwise"
+
+# Create backup directory
+mkdir -p $BACKUP_DIR
+
+# Full database backup
+pg_dump -h $DB_HOST -U $DB_USER -d $DB_NAME > $BACKUP_DIR/flagwise_$DATE.sql
+
+# Compress backup
+gzip $BACKUP_DIR/flagwise_$DATE.sql
+
+# Upload to S3
+aws s3 cp $BACKUP_DIR/flagwise_$DATE.sql.gz s3://flagwise-backups/
+
+# Cleanup old backups (keep 30 days)
+find $BACKUP_DIR -name "*.sql.gz" -mtime +30 -delete
+```
+
+### Disaster Recovery Plan
+1. **Database Recovery**: Restore from latest backup
+2. **Application Recovery**: Redeploy from container registry
+3. **Data Validation**: Verify data integrity post-recovery
+4.
**Service Verification**: Run health checks and smoke tests + +## Performance Optimization + +### Database Tuning +```sql +-- PostgreSQL configuration for production +shared_buffers = 256MB +effective_cache_size = 1GB +maintenance_work_mem = 64MB +checkpoint_completion_target = 0.9 +wal_buffers = 16MB +default_statistics_target = 100 +random_page_cost = 1.1 +effective_io_concurrency = 200 +``` + +### Application Scaling +```yaml +# Horizontal Pod Autoscaler +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: flagwise-api-hpa + namespace: flagwise +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: flagwise-api + minReplicas: 3 + maxReplicas: 10 + metrics: + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: 70 + - type: Resource + resource: + name: memory + target: + type: Utilization + averageUtilization: 80 +``` + +## Maintenance Procedures + +### Rolling Updates +```bash +# Update API deployment +kubectl set image deployment/flagwise-api api=flagwise/api:v1.1.0 -n flagwise + +# Monitor rollout +kubectl rollout status deployment/flagwise-api -n flagwise + +# Rollback if needed +kubectl rollout undo deployment/flagwise-api -n flagwise +``` + +### Database Maintenance +```sql +-- Weekly maintenance tasks +VACUUM ANALYZE; +REINDEX DATABASE flagwise; +UPDATE pg_stat_statements_reset(); +``` + +### Log Rotation +```bash +# Logrotate configuration +/var/log/flagwise/*.log { + daily + rotate 30 + compress + delaycompress + missingok + notifempty + create 644 flagwise flagwise + postrotate + systemctl reload flagwise-api + endscript +} +``` + +## Troubleshooting + +### Common Issues +- **High CPU usage**: Check for inefficient queries, scale horizontally +- **Memory leaks**: Monitor application metrics, restart if needed +- **Database locks**: Identify and terminate long-running queries +- **Network connectivity**: Verify service discovery and DNS resolution + +### Diagnostic Commands +```bash +# Check pod status +kubectl get pods -n flagwise + +# View logs +kubectl logs -f deployment/flagwise-api -n flagwise + +# Check resource usage +kubectl top pods -n flagwise + +# Database connections +kubectl exec -it postgres-pod -- psql -c "SELECT * FROM pg_stat_activity;" +``` + +## Security Checklist + +- [ ] SSL/TLS certificates configured and valid +- [ ] Database connections encrypted +- [ ] Secrets stored in secure secret management +- [ ] Network policies implemented +- [ ] Regular security updates applied +- [ ] Access logs monitored +- [ ] Backup encryption enabled +- [ ] Vulnerability scanning automated \ No newline at end of file diff --git a/docs/DETECTION_RULES.md b/docs/DETECTION_RULES.md new file mode 100644 index 0000000..907ad0f --- /dev/null +++ b/docs/DETECTION_RULES.md @@ -0,0 +1,232 @@ +# Detection Rules + +Detection rules are the core of FlagWise's security monitoring. They analyze LLM traffic in real-time to identify risky patterns, unauthorized usage, and potential security threats. + +## Rule Types + +### Keyword Rules +Exact string matching for sensitive terms. + +```json +{ + "name": "Sensitive Keywords", + "rule_type": "keyword", + "pattern": "password|secret|api_key|token", + "severity": "high", + "points": 50 +} +``` + +### Regex Rules +Pattern matching using regular expressions. 
+ +```json +{ + "name": "Credit Card Detection", + "rule_type": "regex", + "pattern": "\\b(?:\\d{4}[-\\s]?){3}\\d{4}\\b", + "severity": "critical", + "points": 75 +} +``` + +### Model Restriction Rules +Block specific AI models or providers. + +```json +{ + "name": "Unauthorized Models", + "rule_type": "model_restriction", + "pattern": "gpt-4|claude-3", + "severity": "medium", + "points": 30 +} +``` + +### Custom Scoring Rules +Advanced logic for complex threat detection. + +```json +{ + "name": "High Token Usage", + "rule_type": "custom_scoring", + "pattern": "tokens_total > 2000", + "severity": "low", + "points": 25 +} +``` + +## Rule Configuration + +### Basic Settings +- **Name**: Descriptive rule identifier +- **Description**: Detailed explanation of the rule's purpose +- **Category**: `data_privacy`, `security`, or `compliance` +- **Pattern**: The detection pattern (keyword, regex, etc.) +- **Severity**: `critical`, `high`, `medium`, or `low` +- **Points**: Risk score contribution (0-100) + +### Advanced Settings +- **Priority**: Execution order (0-1000, higher = earlier) +- **Stop on Match**: Halt processing after this rule triggers +- **Combination Logic**: `AND` or `OR` for multi-pattern rules +- **Active Status**: Enable/disable rule without deletion + +## Default Rules + +FlagWise includes pre-configured rules for common security patterns: + +| Rule Name | Type | Pattern | Points | Description | +|-----------|------|---------|--------|-------------| +| Critical Keywords | keyword | password\|secret\|api_key | 50 | Sensitive authentication data | +| Email Addresses | regex | `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z\|a-z]{2,}\b` | 30 | Personal identifiable information | +| Credit Cards | regex | `\b(?:\d{4}[-\s]?){3}\d{4}\b` | 60 | Financial data patterns | +| IP Addresses | regex | `\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b` | 20 | Network infrastructure data | +| Phone Numbers | regex | `\b\d{3}-\d{3}-\d{4}\b` | 15 | Personal contact information | +| High Token Usage | custom_scoring | tokens_total > 2000 | 25 | Resource consumption limits | +| Restricted Models | model_restriction | unauthorized_model | 20 | Policy enforcement | + +## Managing Rules + +### Creating Rules + +1. Navigate to **Detection Rules** in the sidebar +2. Click **Add Rule** +3. Configure rule settings: + - Enter name and description + - Select category and rule type + - Define the detection pattern + - Set severity and point values +4. Test the rule with sample data +5. Save and activate + +### Editing Rules + +1. Find the rule in the Detection Rules list +2. Click the edit icon +3. Modify settings as needed +4. Test changes before saving +5. Update the rule + +### Bulk Operations + +Select multiple rules to: +- **Enable/Disable**: Toggle active status +- **Delete**: Remove rules permanently +- **Export**: Download rule configurations +- **Import**: Upload rule templates + +## Rule Testing + +### Pattern Validation +Test your patterns before deployment: + +```bash +# Test regex pattern +curl -X POST http://localhost:8000/api/rules/test \ + -H "Content-Type: application/json" \ + -d '{ + "pattern": "\\b(?:\\d{4}[-\\s]?){3}\\d{4}\\b", + "test_string": "My card number is 1234-5678-9012-3456" + }' +``` + +### Sample Data Testing +Use the built-in test interface: +1. Create or edit a rule +2. Click **Test Rule** +3. Enter sample prompt text +4. 
View matching results and score calculation + +## Risk Scoring + +### Point System +- Each rule contributes 0-100 points +- Total risk score is sum of triggered rules +- Scores above 70 typically indicate high risk +- Flagged requests require manual review + +### Severity Mapping +- **Critical**: 60-100 points (immediate attention) +- **High**: 40-59 points (priority review) +- **Medium**: 20-39 points (routine monitoring) +- **Low**: 1-19 points (informational) + +### Combination Logic +Rules can be combined using AND/OR logic: +- **AND**: All patterns must match +- **OR**: Any pattern triggers the rule +- **Priority**: Higher priority rules execute first +- **Stop on Match**: Prevents further rule processing + +## Best Practices + +### Rule Design +- Start with broad patterns, refine based on results +- Use descriptive names and detailed descriptions +- Test thoroughly before activating +- Monitor false positive rates + +### Performance Optimization +- Place high-priority rules first +- Use "Stop on Match" for definitive violations +- Avoid overly complex regex patterns +- Regular cleanup of unused rules + +### Maintenance +- Review rule effectiveness monthly +- Update patterns based on new threats +- Archive outdated rules instead of deleting +- Document rule changes and rationale + +## Troubleshooting + +### High False Positives +- Review and refine rule patterns +- Adjust point values to reduce sensitivity +- Add exclusion patterns for legitimate use cases +- Consider rule priority and combination logic + +### Missing Detections +- Verify rule is active and properly configured +- Check pattern syntax and test with known examples +- Ensure rule priority allows execution +- Review logs for processing errors + +### Performance Issues +- Optimize complex regex patterns +- Reduce number of active rules +- Use priority settings effectively +- Monitor system resource usage + +## API Reference + +### List Rules +```bash +GET /api/detection-rules +``` + +### Create Rule +```bash +POST /api/detection-rules +Content-Type: application/json + +{ + "name": "New Rule", + "category": "security", + "rule_type": "keyword", + "pattern": "sensitive_term", + "severity": "high", + "points": 50 +} +``` + +### Update Rule +```bash +PUT /api/detection-rules/{rule_id} +``` + +### Delete Rule +```bash +DELETE /api/detection-rules/{rule_id} +``` \ No newline at end of file diff --git a/docs/USER_MANAGEMENT.md b/docs/USER_MANAGEMENT.md new file mode 100644 index 0000000..0128db2 --- /dev/null +++ b/docs/USER_MANAGEMENT.md @@ -0,0 +1,134 @@ +# User Management + +FlagWise supports role-based access control with two user types: **Admin** and **Read-only**. This guide covers user account management, permissions, and security best practices. + +## User Roles + +### Admin Users +- Full system access and configuration +- Create, edit, and delete detection rules +- Manage user accounts and permissions +- Access sensitive data (full prompts and responses) +- Configure system settings and alerts +- Export data and generate reports + +### Read-only Users +- View dashboards and analytics +- Browse LLM requests (prompt previews only) +- View flagged prompts and alerts +- Access live feed and session data +- Cannot modify system configuration + +## Managing Users + +### Default Account +- **Username**: `admin` +- **Password**: `admin123` +- **⚠️ Change this password immediately after first login** + +### Creating New Users + +1. Navigate to **Settings → User Management** +2. Click **Add User** +3. 
Fill in user details:
+   - Username (3-50 characters, alphanumeric + hyphens/underscores)
+   - Password (minimum 6 characters)
+   - Role (Admin or Read-only)
+   - First/Last name (optional)
+4. Click **Create User**
+
+### Editing Users
+
+1. Go to **Settings → User Management**
+2. Click the edit icon next to a user
+3. Modify role, active status, or personal information
+4. Save changes
+
+### Password Management
+
+#### Self-Service Password Change
+1. Click your profile icon in the top-right
+2. Select **Change Password**
+3. Enter current and new password
+4. Confirm changes
+
+#### Admin Password Reset
+Admins can reset any user's password:
+1. Go to **Settings → User Management**
+2. Click the reset icon next to a user
+3. Enter new password
+4. User will be notified to change it on next login
+
+### Deactivating Users
+
+Instead of deleting users (which removes audit trails):
+1. Edit the user account
+2. Set **Active Status** to **Disabled**
+3. User can no longer log in but historical data remains
+
+## Security Best Practices
+
+### Password Requirements
+- Minimum 6 characters (recommend 12+)
+- Use unique passwords for each account
+- Consider implementing password complexity rules
+- Regular password rotation for admin accounts
+
+### Access Control
+- Follow principle of least privilege
+- Regularly review user permissions
+- Remove access for departed team members
+- Monitor user activity in audit logs
+
+### Session Management
+- Sessions expire after inactivity
+- Users are logged out when role changes
+- Multiple concurrent sessions are allowed
+- JWT tokens are used for authentication
+
+## API Access
+
+Users can access the FlagWise API using their credentials:
+
+```bash
+# Login to get access token
+curl -X POST http://localhost:8000/auth/login \
+  -H "Content-Type: application/json" \
+  -d '{"username": "your_username", "password": "your_password"}'
+
+# Use token in subsequent requests
+curl -X GET http://localhost:8000/api/requests \
+  -H "Authorization: Bearer YOUR_ACCESS_TOKEN"
+```
+
+## Troubleshooting
+
+### Cannot Login
+- Verify username and password
+- Check if account is active
+- Ensure FlagWise services are running
+- Check browser console for errors
+
+### Permission Denied
+- Verify user role has required permissions
+- Admin actions require Admin role
+- Some data requires specific access levels
+
+### Forgot Password
+- Contact an Admin user for password reset
+- No self-service password recovery currently available
+- Consider implementing email-based recovery
+
+## Audit and Compliance
+
+### User Activity Tracking
+- All user actions are logged
+- Login/logout events recorded
+- Data access and modifications tracked
+- Export capabilities for compliance reporting
+
+### Data Access Controls
+- Read-only users see truncated prompts only
+- Full prompt/response data requires Admin role
+- IP-based access restrictions can be configured
+- Session timeouts enforce security policies
\ No newline at end of file

From 080fef86f3ef3092cac61dbe6dabd930518c5622 Mon Sep 17 00:00:00 2001
From: Akinyemi Arabambi
Date: Wed, 12 Nov 2025 15:07:58 +0000
Subject: [PATCH 2/3] Enhance documentation across multiple files:

- Update alert user mention format in ALERTS.md
- Clarify encryption status for prompt and response fields in DATABASE.md
- Add security warnings and best practices in DEPLOYMENT.md
- Correct regex formatting in DETECTION_RULES.md
- Revise password requirements and recovery limitations in USER_MANAGEMENT.md
---
 docs/ALERTS.md          |  7 ++++++-
 docs/DATABASE.md
| 10 ++++++---- docs/DEPLOYMENT.md | 12 +++++++++++- docs/DETECTION_RULES.md | 2 +- docs/USER_MANAGEMENT.md | 13 ++++++++----- 5 files changed, 32 insertions(+), 12 deletions(-) diff --git a/docs/ALERTS.md b/docs/ALERTS.md index 013656f..8906255 100644 --- a/docs/ALERTS.md +++ b/docs/ALERTS.md @@ -48,6 +48,8 @@ Send alerts to Slack channels for team collaboration. } ``` +**Note**: For user mentions, use Slack user IDs in the format `<@USER_ID>` (e.g., `<@U1234567890>`) rather than @username. + ### Email Notifications *Coming Soon* - Email integration for alert delivery. @@ -152,7 +154,7 @@ Each alert includes: "slack": { "enabled": true, "channel": "#security-alerts", - "mention_users": ["@security-team"] + "mention_users": ["<@U1234567890>", "<@U0987654321>"] } } } @@ -301,9 +303,12 @@ Content-Type: application/json ``` ### Test Notification Channel +**Note**: Requires Admin authentication to prevent unauthorized testing of arbitrary endpoints. + ```bash POST /api/notifications/test Content-Type: application/json +Authorization: Bearer ADMIN_TOKEN { "channel_type": "slack", diff --git a/docs/DATABASE.md b/docs/DATABASE.md index c9b42eb..a8d0f13 100644 --- a/docs/DATABASE.md +++ b/docs/DATABASE.md @@ -38,8 +38,8 @@ CREATE TABLE llm_requests ( - `src_ip`: Internal IP address of requestor - `provider`: openai, anthropic, etc. - `model`: gpt-4, claude-3, etc. -- `prompt`: User prompt (encrypted in production) -- `response`: LLM response (encrypted in production) +- `prompt`: User prompt (currently plaintext, encryption planned) +- `response`: LLM response (currently plaintext, encryption planned) - `risk_score`: 0-100 calculated risk score - `is_flagged`: Whether request triggered detection rules - `flag_reason`: Comma-separated list of triggered rule names @@ -238,16 +238,18 @@ FOREIGN KEY (detection_rule_ids) REFERENCES detection_rules(id); ## Data Retention and Cleanup +**Note**: Default retention periods are 6 months for requests and 3 months for resolved alerts. Adjust these values based on your compliance requirements (GDPR, SOC2, etc.) and organizational data retention policies. + ### Automatic Cleanup Function ```sql CREATE OR REPLACE FUNCTION cleanup_old_data() RETURNS void AS $$ BEGIN - -- Remove requests older than 6 months + -- Remove requests older than 6 months (adjust as needed for compliance) DELETE FROM llm_requests WHERE created_at < NOW() - INTERVAL '6 months'; - -- Remove resolved alerts older than 3 months + -- Remove resolved alerts older than 3 months (adjust as needed for compliance) DELETE FROM alerts WHERE status = 'resolved' AND resolved_at < NOW() - INTERVAL '3 months'; diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md index eb55560..5d12fb1 100644 --- a/docs/DEPLOYMENT.md +++ b/docs/DEPLOYMENT.md @@ -115,6 +115,8 @@ volumes: ``` ### Environment Configuration +⚠️ **Security Warning**: Do NOT commit `.env.prod` to version control. For production, use Kubernetes Secrets, environment variable injection from secure vaults (AWS Secrets Manager, HashiCorp Vault), or container orchestration platform secrets management. 
+ Create `.env.prod`: ```bash @@ -550,4 +552,12 @@ kubectl exec -it postgres-pod -- psql -c "SELECT * FROM pg_stat_activity;" - [ ] Regular security updates applied - [ ] Access logs monitored - [ ] Backup encryption enabled -- [ ] Vulnerability scanning automated \ No newline at end of file +- [ ] Vulnerability scanning automated +- [ ] Rate limiting configured +- [ ] CORS properly restricted +- [ ] Database connection pooling configured +- [ ] SQL injection prevention verified +- [ ] Security headers configured (CSP, X-Frame-Options, etc.) +- [ ] Regular security scanning/vulnerability assessments scheduled +- [ ] Incident response plan documented +- [ ] Access control audits scheduled \ No newline at end of file diff --git a/docs/DETECTION_RULES.md b/docs/DETECTION_RULES.md index 907ad0f..a6468a3 100644 --- a/docs/DETECTION_RULES.md +++ b/docs/DETECTION_RULES.md @@ -79,7 +79,7 @@ FlagWise includes pre-configured rules for common security patterns: | Rule Name | Type | Pattern | Points | Description | |-----------|------|---------|--------|-------------| | Critical Keywords | keyword | password\|secret\|api_key | 50 | Sensitive authentication data | -| Email Addresses | regex | `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z\|a-z]{2,}\b` | 30 | Personal identifiable information | +| Email Addresses | regex | `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b` | 30 | Personal identifiable information | | Credit Cards | regex | `\b(?:\d{4}[-\s]?){3}\d{4}\b` | 60 | Financial data patterns | | IP Addresses | regex | `\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b` | 20 | Network infrastructure data | | Phone Numbers | regex | `\b\d{3}-\d{3}-\d{4}\b` | 15 | Personal contact information | diff --git a/docs/USER_MANAGEMENT.md b/docs/USER_MANAGEMENT.md index 0128db2..2540223 100644 --- a/docs/USER_MANAGEMENT.md +++ b/docs/USER_MANAGEMENT.md @@ -32,7 +32,7 @@ FlagWise supports role-based access control with two user types: **Admin** and * 2. Click **Add User** 3. Fill in user details: - Username (3-50 characters, alphanumeric + hyphens/underscores) - - Password (minimum 6 characters) + - Password (minimum 8 characters recommended for production) - Role (Admin or Read-only) - First/Last name (optional) 4. 
Click **Create User**
@@ -69,9 +69,10 @@ Instead of deleting users (which removes audit trails):
 ## Security Best Practices
 
 ### Password Requirements
-- Minimum 6 characters (recommend 12+)
+- **Production**: Minimum 8 characters (strongly recommend 12+)
+- **Development**: Minimum 6 characters (for testing only)
 - Use unique passwords for each account
-- Consider implementing password complexity rules
+- Implement password complexity rules in production
 - Regular password rotation for admin accounts
 
 ### Access Control
@@ -96,6 +97,7 @@ curl -X POST http://localhost:8000/auth/login \
   -H "Content-Type: application/json" \
   -d '{"username": "your_username", "password": "your_password"}'
 
+# Note: Replace YOUR_ACCESS_TOKEN with the actual token from login response
 # Use token in subsequent requests
 curl -X GET http://localhost:8000/api/requests \
   -H "Authorization: Bearer YOUR_ACCESS_TOKEN"
@@ -116,8 +118,9 @@ curl -X GET http://localhost:8000/api/requests \
 
 ### Forgot Password
 - Contact an Admin user for password reset
-- No self-service password recovery currently available
-- Consider implementing email-based recovery
+- **Known Limitation**: No self-service password recovery currently available
+- **Enhancement Needed**: Email-based or security-question recovery system
+- This is a significant usability gap that should be prioritized for future releases
 
 ## Audit and Compliance
 

From 9e3b47f8af22d5d6f6f436e5ab3872e5a0bcce15 Mon Sep 17 00:00:00 2001
From: Akinyemi Arabambi
Date: Wed, 12 Nov 2025 15:34:55 +0000
Subject: [PATCH 3/3] docs: Fix critical issues in documentation

---
 docs/ALERTS.md          | 14 ++++++++++----
 docs/DATABASE.md        | 13 ++++++++++---
 docs/DEPLOYMENT.md      |  6 +++---
 docs/USER_MANAGEMENT.md |  9 +++++----
 4 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/docs/ALERTS.md b/docs/ALERTS.md
index 8906255..b857f00 100644
--- a/docs/ALERTS.md
+++ b/docs/ALERTS.md
@@ -141,6 +141,12 @@ Each alert includes:
 - Investigation notes and actions taken
 - Resolution status and timeline
 
+### Alert Lifecycle and Retention
+- **New alerts**: Retained indefinitely until acknowledged/resolved
+- **Resolved alerts**: Automatically cleaned up after 3 months
+- **Archive function**: Removes from active view but preserves history
+- **Manual cleanup**: Bulk delete resolved alerts older than specified date
+
 ## Configuration Examples
 
 ### High-Risk Prompt Alert
@@ -201,10 +207,10 @@ Each alert includes:
 - Add context-aware filtering
 
 ### Alert Fatigue Prevention
-- Prioritize critical alerts
-- Group related alerts together
-- Implement escalation policies
-- Regular review and tuning
+- **Priority thresholds**: Critical (>80 risk), High (60-79), Medium (40-59)
+- **Grouping rules**: Combine alerts from same IP within 5-minute window
+- **Suppression**: Silence duplicate alerts for 1 hour after first occurrence
+- **Escalation**: Auto-escalate unacknowledged critical alerts after 15 minutes
 
 ### Performance Optimization
 - Limit alert frequency per rule
diff --git a/docs/DATABASE.md b/docs/DATABASE.md
index a8d0f13..66e3fd5 100644
--- a/docs/DATABASE.md
+++ b/docs/DATABASE.md
@@ -220,7 +220,7 @@ WHERE is_active = TRUE;
 ## Data Relationships
 
 ### Entity Relationships
-```
+```sql
 llm_requests (1) ←→ (0..n) alerts
 detection_rules (1) ←→ (0..n) alert_rules
 users (1) ←→ (0..n) alerts (acknowledged_by, resolved_by)
@@ -232,8 +232,12 @@ user_sessions (1) ←→ (0..n) llm_requests (via src_ip + time range)
 ALTER TABLE alerts ADD CONSTRAINT fk_alerts_request
 FOREIGN KEY (related_request_id) REFERENCES llm_requests(id);
 
-ALTER TABLE alert_rules ADD CONSTRAINT fk_alert_rules_detection_rules
-FOREIGN KEY (detection_rule_ids) REFERENCES detection_rules(id);
+-- Note: detection_rule_ids is a UUID[] array - FK constraints managed at application layer
+-- For proper referential integrity, consider using a junction table:
+-- CREATE TABLE alert_rules_detection_rules (
+--   alert_rule_id UUID REFERENCES alert_rules(id),
+--   detection_rule_id UUID REFERENCES detection_rules(id)
+-- );
 ```
 
 ## Data Retention and Cleanup
@@ -262,7 +266,10 @@ $$ LANGUAGE plpgsql;
 ```
 
 ### Scheduled Cleanup
+**Prerequisites**: Requires `pg_cron` extension:
 ```sql
+CREATE EXTENSION IF NOT EXISTS pg_cron;
+
 -- Run cleanup monthly
 SELECT cron.schedule('cleanup-old-data', '0 2 1 * *', 'SELECT cleanup_old_data();');
 ```
diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md
index 5d12fb1..aa0661c 100644
--- a/docs/DEPLOYMENT.md
+++ b/docs/DEPLOYMENT.md
@@ -183,7 +183,7 @@ spec:
     spec:
       containers:
      - name: api
-        image: flagwise/api:latest
+        image: flagwise/api:v1.0.0
         ports:
        - containerPort: 8000
         envFrom:
@@ -245,7 +245,7 @@ spec:
     spec:
       containers:
      - name: web
-        image: flagwise/web:latest
+        image: flagwise/web:v1.0.0
         ports:
        - containerPort: 80
         resources:
@@ -500,7 +500,7 @@ kubectl rollout undo deployment/flagwise-api -n flagwise
 -- Weekly maintenance tasks
 VACUUM ANALYZE;
 REINDEX DATABASE flagwise;
-UPDATE pg_stat_statements_reset();
+SELECT pg_stat_statements_reset();
 ```
 
 ### Log Rotation
diff --git a/docs/USER_MANAGEMENT.md b/docs/USER_MANAGEMENT.md
index 2540223..c61f4c3 100644
--- a/docs/USER_MANAGEMENT.md
+++ b/docs/USER_MANAGEMENT.md
@@ -70,9 +70,9 @@ Instead of deleting users (which removes audit trails):
 
 ### Password Requirements
 - **Production**: Minimum 8 characters (strongly recommend 12+)
+- **Complexity**: Mix of uppercase, lowercase, digits, and special characters
 - **Development**: Minimum 6 characters (for testing only)
 - Use unique passwords for each account
-- Implement password complexity rules in production
 - Regular password rotation for admin accounts
 
 ### Access Control
@@ -84,8 +84,9 @@ Instead of deleting users (which removes audit trails):
 ### Session Management
 - Sessions expire after inactivity
 - Users are logged out when role changes
-- Multiple concurrent sessions are allowed
+- **Multiple concurrent sessions allowed** (monitor for suspicious activity)
 - JWT tokens are used for authentication
+- **Security Note**: Monitor concurrent sessions from different IPs/locations
 
 ## API Access
 
@@ -119,8 +120,8 @@ curl -X GET http://localhost:8000/api/requests \
 
 ### Forgot Password
 - Contact an Admin user for password reset
 - **Known Limitation**: No self-service password recovery currently available
-- **Enhancement Needed**: Email-based or security-question recovery system
-- This is a significant usability gap that should be prioritized for future releases
+- **Enhancement Planned**: Email-based recovery system (target: v1.2.0)
+- This is a significant usability gap prioritized for Q2 2024
 
 ## Audit and Compliance