# ExtProc PII Detection Without Policy Enforcement

## Issue Summary

The ExtProc pipeline currently **detects PII content** during routing decisions but **does not enforce PII policies** to block requests containing sensitive information. This creates a security gap where PII content is identified but allowed to pass through to language models.

## Current Behavior

✅ **PII Detection Working:**
- Batch API correctly detects PII types (B-US_SSN, B-EMAIL_ADDRESS, etc.) with ~99% confidence
- Direct PII API identifies entities accurately
- ExtProc detects PII during model routing decisions

❌ **PII Policy Enforcement Missing:**
- Requests containing sensitive PII (SSN, multiple PII types) pass through the ExtProc pipeline
- No content filter responses for policy violations
- PII policies in configuration are not enforced during request processing

## Test Evidence

The comprehensive PII test suite we created in `e2e-tests/06-pii-detection-test.py` (commits `2edbd04` and `1327a1c`) includes specific tests that demonstrate this gap:

**TEST 3: ExtProc PII Filtering** - Tests if PII content gets blocked:
- **Input**: Multi-PII content: "John Smith, SSN: 987-65-4321, email: jsmith@company.com, phone: 555-987-6543"
- **Expected**: Content filter response blocking the request
- **Actual**: Status 200; the request is not blocked and is processed normally, and the model answers with a generic refusal instead of a content-filter response

**TEST 3.5: ExtProc PII Detection - Comprehensive Production Pipeline** - Validates PII detection evidence:
```
Test Results with Original Code:
PII Detection Evidence: ❌ NO
Detection Indicators: None found
PII Request Blocked: ❌ NO
Differential Routing: ❌ NO (both PII and safe content → Model-B)
Overall Assessment: ⚠️ NO CLEAR PII DETECTION
```

**Example Request:**
```json
{
  "model": "auto",
  "messages": [{"role": "user", "content": "John Smith, SSN: 987-65-4321, email: jsmith@company.com, phone: 555-987-6543"}]
}
```

**Current Response:** Status 200 - Request processed normally without blocking

## Root Cause

In `src/semantic-router/pkg/extproc/request_handler.go`, the `performSecurityChecks` function:

1. **Line 414:** Extracts PII content for analysis
2. **Line 476:** Performs jailbreak detection with blocking
3. **Missing:** Upfront PII policy enforcement before routing

PII detection occurs only during model routing (lines 641-716), not as an upfront security check.
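
The ordering gap can be illustrated with a toy model of the blocking checks. This is only a sketch under stated assumptions: `securityCheck`, `firstBlock`, and the two toy detectors are hypothetical names invented for illustration, and the real pipeline uses classifiers rather than a regular expression or substring match.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// securityCheck is a hypothetical stand-in for one blocking check.
type securityCheck struct {
	name    string
	blocked func(content string) bool
}

// ssnPattern is a toy detector for US SSN-shaped strings (illustration only).
var ssnPattern = regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)

// jailbreakCheck is a toy substitute for the existing jailbreak detection.
var jailbreakCheck = securityCheck{"jailbreak", func(c string) bool {
	return strings.Contains(strings.ToLower(c), "ignore previous instructions")
}}

// piiCheck is a toy substitute for the missing PII policy enforcement.
var piiCheck = securityCheck{"pii_policy", func(c string) bool {
	return ssnPattern.MatchString(c)
}}

// firstBlock runs the checks in order and returns the name of the first
// check that blocks the content, or "" if every check lets it through.
func firstBlock(content string, checks []securityCheck) string {
	for _, chk := range checks {
		if chk.blocked(content) {
			return chk.name
		}
	}
	return ""
}

func main() {
	content := "John Smith, SSN: 987-65-4321"

	current := []securityCheck{jailbreakCheck}            // only jailbreak blocks today
	proposed := []securityCheck{jailbreakCheck, piiCheck} // PII enforcement added

	fmt.Printf("current:  %q\n", firstBlock(content, current))
	fmt.Printf("proposed: %q\n", firstBlock(content, proposed))
}
```

With only the jailbreak check registered, the SSN content falls through to routing; once a PII check sits in the same list, it is stopped before a model is selected.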

## Proposed Solution

Add PII policy enforcement to the `performSecurityChecks` function in `request_handler.go`:

### Location: After line 414

```go
// Perform PII detection and policy enforcement
if r.PIIChecker != nil {
    // Start PII detection span
    spanCtx, span := observability.StartSpan(ctx.TraceContext, "pii_detection")
    defer span.End()

    startTime := time.Now()
    detectedPII := r.Classifier.DetectPIIInContent(allContent)
    detectionTime := time.Since(startTime).Milliseconds()

    observability.SetSpanAttributes(span,
        attribute.Int64("pii_detection_time_ms", detectionTime))

    if len(detectedPII) > 0 {
        observability.Infof("PII detected: %v", detectedPII)

        // Check PII policy against default model (Model-A has strictest policy)
        allowed, deniedPII, err := r.PIIChecker.CheckPolicy("Model-A", detectedPII)

        if err != nil {
            // Fail open: log and record the error, then let the request continue.
            observability.Errorf("Error checking PII policy: %v", err)
            observability.RecordError(span, err)
            metrics.RecordRequestError(ctx.RequestModel, "pii_policy_check_failed")
        } else if !allowed {
            observability.SetSpanAttributes(span,
                attribute.Bool("pii_detected", true),
                attribute.StringSlice("denied_pii_types", deniedPII),
                attribute.String("security_action", "blocked"))

            observability.Warnf("PII POLICY VIOLATION BLOCKED: %v (denied types: %v)", detectedPII, deniedPII)

            // Structured log for security block
            observability.LogEvent("security_block", map[string]interface{}{
                "reason_code":  "pii_policy_violation",
                "detected_pii": detectedPII,
                "denied_pii":   deniedPII,
                "request_id":   ctx.RequestID,
            })

            // Count this as a blocked request
            metrics.RecordRequestError(ctx.RequestModel, "pii_policy_block")
            piiResponse := http.CreatePIIViolationResponse("Model-A", deniedPII)
            ctx.TraceContext = spanCtx
            return piiResponse, true
        } else {
            observability.SetSpanAttributes(span,
                attribute.Bool("pii_detected", true),
                attribute.StringSlice("detected_pii_types", detectedPII),
                attribute.String("security_action", "allowed"))
            observability.Infof("PII detected but allowed by policy: %v", detectedPII)
        }
    } else {
        observability.SetSpanAttributes(span,
            attribute.Bool("pii_detected", false))
        observability.Infof("No PII detected in request content")
    }
    ctx.TraceContext = spanCtx
}
```

## Expected Behavior After Fix

**With PII Policy Enforcement:**
```json
{
  "id": "chatcmpl-pii-violation-123",
  "choices": [{
    "finish_reason": "content_filter",
    "message": {
      "content": "I cannot process this request as it contains personally identifiable information ([PERSON]) that is not allowed for the 'Model-A' model according to the configured privacy policy."
    }
  }]
}
```
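
A client or test can tell a policy block from a normal completion by inspecting `finish_reason`. The following is a minimal sketch, assuming the response follows the JSON shape shown above; `chatCompletion` and `isContentFilterBlock` are hypothetical names and are not part of the router's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatCompletion models only the fields needed to spot a policy block.
type chatCompletion struct {
	Choices []struct {
		FinishReason string `json:"finish_reason"`
	} `json:"choices"`
}

// isContentFilterBlock reports whether any choice in the response body
// finished with "content_filter".
func isContentFilterBlock(body []byte) (bool, error) {
	var resp chatCompletion
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, fmt.Errorf("decode response: %w", err)
	}
	for _, c := range resp.Choices {
		if c.FinishReason == "content_filter" {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	body := []byte(`{
	  "id": "chatcmpl-pii-violation-123",
	  "choices": [{"finish_reason": "content_filter", "message": {"content": "..."}}]
	}`)

	blocked, err := isContentFilterBlock(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("blocked by content filter:", blocked)
}
```

Checking `finish_reason` rather than the HTTP status matters here because, per the test evidence, unblocked requests also return status 200.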

## Configuration Integration

The fix leverages the existing PII policy configuration in `config/config.e2e.yaml`:

```yaml
"Model-A":
  pii_policy:
    allow_by_default: false  # Strict PII blocking
    pii_types_allowed: ["EMAIL_ADDRESS"]  # Only emails allowed
```
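
To make the semantics of these two fields concrete, here is a rough sketch of how such a policy could be evaluated against detected PII types. It is not the project's `CheckPolicy` implementation: `piiPolicy`, `normalize`, and `evaluate` are hypothetical names, and stripping the `B-`/`I-` prefix from labels such as `B-US_SSN` is an assumption about how detected labels map to configured types.

```go
package main

import (
	"fmt"
	"strings"
)

// piiPolicy mirrors the two YAML fields shown above; the Go type is hypothetical.
type piiPolicy struct {
	AllowByDefault  bool
	PIITypesAllowed []string
}

// normalize strips a BIO prefix such as "B-" or "I-".
// Assumption: a detected label like "B-US_SSN" should match a configured "US_SSN".
func normalize(label string) string {
	if strings.HasPrefix(label, "B-") || strings.HasPrefix(label, "I-") {
		return label[2:]
	}
	return label
}

// evaluate returns whether the request is allowed and which detected
// PII types are denied by the policy.
func evaluate(p piiPolicy, detected []string) (bool, []string) {
	if p.AllowByDefault {
		return true, nil
	}
	allowed := make(map[string]bool, len(p.PIITypesAllowed))
	for _, t := range p.PIITypesAllowed {
		allowed[t] = true
	}
	var denied []string
	for _, d := range detected {
		if t := normalize(d); !allowed[t] {
			denied = append(denied, t)
		}
	}
	return len(denied) == 0, denied
}

func main() {
	modelA := piiPolicy{AllowByDefault: false, PIITypesAllowed: []string{"EMAIL_ADDRESS"}}

	ok, denied := evaluate(modelA, []string{"B-US_SSN", "B-EMAIL_ADDRESS"})
	fmt.Println(ok, denied) // false [US_SSN]
}
```

Under this reading, a request containing only an email address would pass the `Model-A` policy, while any other detected type would cause a block.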

## Testing

**Our newly created test suite** in `e2e-tests/06-pii-detection-test.py` (branch: `feature/improve-pii-extproc-testing`) covers both detection and enforcement:

**TEST 1: Batch API PII Detection** - ✅ WORKING
- Validates unified classifier detects PII correctly (B-US_SSN, confidence ~99%)

**TEST 2: Direct PII API Endpoint** - ✅ WORKING
- Confirms direct PII detection works as expected

**TEST 3: ExtProc PII Filtering** - ❌ REVEALS GAP
- **Purpose**: Tests if ExtProc blocks PII content in production pipeline
- **Finding**: Requests with multiple PII types pass through without blocking
- **Evidence**: No content_filter finish_reason, normal status 200 responses

**TEST 3.5: ExtProc PII Detection - Comprehensive Production Pipeline** - ❌ REVEALS GAP
- **Purpose**: Analyzes differential behavior between PII and safe content
- **Finding**: No differential routing, processing time, or blocking behavior
- **Evidence**: "NO CLEAR PII DETECTION" in ExtProc pipeline

**TEST 4: Multiple PII Types Pattern Analysis** - ✅ WORKING
- Validates high detection rates across various PII entity types

**Key Finding**: The test suite shows that PII **detection** works (TEST 1, 2, and 4) but PII **enforcement** is missing in ExtProc (TEST 3 and 3.5).

## Impact

- **Security**: Prevents sensitive PII from reaching language models
- **Compliance**: Enforces configured privacy policies
- **Observability**: Proper logging and metrics for PII violations
- **Consistency**: Aligns ExtProc behavior with PII policy configuration

## Files Modified

- `src/semantic-router/pkg/extproc/request_handler.go` (security enforcement)
- `config/config.e2e.yaml` (PII policy configuration)
- `e2e-tests/06-pii-detection-test.py` (comprehensive test validation)
