Redis-backed bandit state for horizontal scaling #27

@evanvolgas

Description


Overview

Implement Redis-backed bandit state synchronization to enable multi-instance horizontal scaling.

Background

From STAFF_REVIEW.md: "Single-node, in-memory state breaks at scale"

Current Architecture (Single Instance)

Single instance          Problem
─────────────────────────────────────
Bandit state in memory   → Lost on restart
Local A_inv matrices     → No shared learning
Per-process middleware   → Can't scale horizontally
Semaphore concurrency    → Local-only coordination

Target Architecture (Multi-Instance)

Multiple instances       Solution
─────────────────────────────────────
Redis bandit state       → Shared across instances
Distributed A_inv        → Consistent UCB calculations
Redis rate limiting      → Global coordination
Distributed locks        → Safe concurrent updates

Implementation Tasks

1. Redis State Serialization

# Store and load bandit state in Redis
import json

import redis


class RedisBackedBandit:
    def __init__(self, redis_client: redis.Redis):
        self.redis = redis_client
        self.key_prefix = "conduit:bandit"

    def save_state(self, algorithm_name: str, state: dict) -> None:
        # One key per algorithm, e.g. "conduit:bandit:linucb"
        key = f"{self.key_prefix}:{algorithm_name}"
        self.redis.set(key, json.dumps(state))

    def load_state(self, algorithm_name: str) -> dict:
        key = f"{self.key_prefix}:{algorithm_name}"
        data = self.redis.get(key)
        return json.loads(data) if data else {}
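One wrinkle with `json.dumps` above: LinUCB state contains numpy matrices (`A_inv`), which JSON cannot serialize directly. A possible approach is to tag and round-trip arrays through nested lists; `encode_state`/`decode_state` below are hypothetical helper names, not part of the codebase:

```python
import json

import numpy as np


def encode_state(state: dict) -> str:
    """Serialize a bandit state dict, converting numpy arrays to nested lists."""
    def default(obj):
        if isinstance(obj, np.ndarray):
            return {"__ndarray__": obj.tolist()}
        raise TypeError(f"Cannot serialize {type(obj).__name__}")
    return json.dumps(state, default=default)


def decode_state(payload: str) -> dict:
    """Restore numpy arrays that were tagged by encode_state."""
    def object_hook(obj):
        if "__ndarray__" in obj:
            return np.asarray(obj["__ndarray__"])
        return obj
    return json.loads(payload, object_hook=object_hook)


# Round-trip example
state = {"A_inv": np.eye(3), "pulls": 42}
restored = decode_state(encode_state(state))
```

Anything JSON already handles (counts, floats, strings) passes through untouched, so the same helpers can wrap every algorithm's state dict.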

2. Atomic Matrix Updates (LinUCB)

Challenge: Multiple instances updating A_inv matrices concurrently

Solution Options:

Option A: Distributed Locks (Strong Consistency)

def update_linucb(self, arm: str, reward: float, context):
    # Serialize concurrent updates to this arm's matrix across instances
    lock = self.redis.lock(f"lock:linucb:{arm}", timeout=10)
    with lock:
        # Load current A_inv from Redis
        A_inv = self.load_matrix(arm)
        # Rank-1 update via the Sherman-Morrison/Woodbury identity
        A_inv_updated = woodbury_update(A_inv, context, reward)
        # Save back to Redis before the lock is released
        self.save_matrix(arm, A_inv_updated)

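The helper `woodbury_update` called above is not defined in this issue. A minimal Sherman-Morrison sketch follows; note that in standard LinUCB the reward updates the separate b vector, not A, so only the context enters the matrix update here:

```python
import numpy as np


def woodbury_update(A_inv: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Sherman-Morrison rank-1 update: compute (A + x x^T)^-1 from A^-1.

    O(d^2) instead of the O(d^3) cost of re-inverting A from scratch.
    """
    x = x.reshape(-1, 1)                      # column vector
    Ax = A_inv @ x
    denom = 1.0 + (x.T @ Ax).item()
    return A_inv - (Ax @ Ax.T) / denom


# Sanity check against a direct inverse
d = 4
rng = np.random.default_rng(0)
A = np.eye(d) + rng.random((d, d))
A = A @ A.T                                   # symmetric positive definite
x = rng.random(d)
direct = np.linalg.inv(A + np.outer(x, x))
via_sm = woodbury_update(np.linalg.inv(A), x)
assert np.allclose(direct, via_sm)
```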
Option B: Eventual Consistency (Accept Divergence)

# Each instance maintains local A_inv
# Periodic sync with Redis (every N updates)
# Accept that instances may have slightly different UCB scores
# Document tradeoffs in ARCHITECTURE.md

Recommendation: Start with Option B (eventual consistency) for better performance, and upgrade to Option A if consistency issues are observed.
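Option B's periodic sync could be sketched as below. All names are illustrative rather than the actual Conduit API, the store only needs get/set (a `redis.Redis` qualifies), and the averaging reconciliation is deliberately naive:

```python
import json

import numpy as np


class EventualLinUCBState:
    """Sketch: local A_inv with periodic push/pull to a shared store."""

    def __init__(self, store, arm: str, dim: int, sync_every: int = 50):
        self.store = store
        self.key = f"conduit:linucb:{arm}:A_inv"
        self.A_inv = np.eye(dim)
        self.sync_every = sync_every
        self._updates = 0

    def update(self, x: np.ndarray) -> None:
        # Sherman-Morrison rank-1 update on the local copy
        x = x.reshape(-1, 1)
        Ax = self.A_inv @ x
        self.A_inv -= (Ax @ Ax.T) / (1.0 + (x.T @ Ax).item())
        self._updates += 1
        if self._updates % self.sync_every == 0:
            self._sync()

    def _sync(self) -> None:
        # Last-writer-wins: instances may briefly diverge (Option B tradeoff)
        remote = self.store.get(self.key)
        if remote is not None:
            # Naive reconciliation: average local and remote estimates
            self.A_inv = (self.A_inv + np.asarray(json.loads(remote))) / 2.0
        self.store.set(self.key, json.dumps(self.A_inv.tolist()))


# Any dict-like store works for local testing
class DictStore:
    def __init__(self):
        self.data = {}

    def get(self, k):
        return self.data.get(k)

    def set(self, k, v):
        self.data[k] = v
```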

3. Distributed Rate Limiting

from redis import Redis


class RedisRateLimiter:
    """Fixed-window rate limiter shared across all instances."""

    def __init__(self, redis: Redis, max_qps: int = 100):
        self.redis = redis
        self.max_qps = max_qps

    def acquire(self, provider: str) -> bool:
        key = f"rate_limit:{provider}"
        current = self.redis.incr(key)
        if current == 1:
            # First request in the window starts the 1-second TTL.
            # Note: a crash between INCR and EXPIRE leaves the key
            # without a TTL; EXPIRE with NX (Redis >= 7.0) or a
            # MULTI/EXEC pipeline closes that gap.
            self.redis.expire(key, 1)
        return current <= self.max_qps
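The limiter can be exercised locally with an in-memory stub; `FakeRedis` below is a hypothetical test double, not part of redis-py, and the limiter class is repeated so the snippet runs standalone:

```python
import time


class RedisRateLimiter:
    """Same fixed-window logic as above, untyped so any client works."""

    def __init__(self, redis, max_qps=100):
        self.redis = redis
        self.max_qps = max_qps

    def acquire(self, provider):
        key = f"rate_limit:{provider}"
        current = self.redis.incr(key)
        if current == 1:
            self.redis.expire(key, 1)
        return current <= self.max_qps


class FakeRedis:
    """Test double implementing only the incr/expire subset used above."""

    def __init__(self):
        self.counts = {}
        self.deadlines = {}

    def incr(self, key):
        # Drop the counter once its window has elapsed
        if key in self.deadlines and time.monotonic() >= self.deadlines[key]:
            del self.counts[key], self.deadlines[key]
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]

    def expire(self, key, seconds):
        self.deadlines[key] = time.monotonic() + seconds


limiter = RedisRateLimiter(FakeRedis(), max_qps=3)
results = [limiter.acquire("openai") for _ in range(5)]
# → [True, True, True, False, False]
```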

4. State Compaction & Cleanup

# Leader election for cleanup tasks: only one instance runs them
def elect_leader(redis_client):
    lock = redis_client.lock("conduit:leader", timeout=30)
    if lock.acquire(blocking=False):
        # This instance is the leader for this cycle
        try:
            compact_old_state()
            cleanup_expired_keys()
        finally:
            lock.release()

Success Criteria

  • Redis state backend implemented in conduit_bench/backends/redis.py
  • All bandit algorithms support Redis persistence
  • Atomic matrix updates for LinUCB (choose consistency model)
  • Distributed rate limiting working
  • Leader election for cleanup tasks
  • Multi-instance deployment example (Docker Compose or Kubernetes)
  • Load test validates performance with 3+ instances
  • Documentation in docs/SCALING.md

Testing Strategy

# Start 3 instances with shared Redis
docker-compose up --scale conduit=3

# Run load test targeting all 3 instances
k6 run tests/load/multi_instance.js

# Verify state consistency across instances
pytest tests/integration/test_redis_sync.py

Consistency Model Decision

Document tradeoffs in docs/SCALING.md:

Model            Latency   Consistency   Complexity
───────────────────────────────────────────────────
Strong (locks)   +50ms     Perfect       High
Eventual         +5ms      ~98%          Medium
Optimistic       +2ms      ~95%          Low

Priority

MEDIUM - Important for production scale, not needed for research

Difficulty

Advanced - Requires distributed systems design expertise
