Skip to content

Conversation

@yossiovadia
Copy link
Collaborator

Switch PII classification from hardcoded ModernBERT to auto-detecting Candle BERT classifier. The Rust layer already has built-in auto-detection that checks for lora_config.json and routes to LoRA or Traditional models.

Changes:

  1. Init: Use InitCandleBertTokenClassifier (has auto-detect built-in)
  2. Inference: Use ClassifyCandleBertTokens (auto-routes to initialized classifier)

This enables LoRA PII models to work automatically without config changes, providing higher confidence scores for PII entity detection.

Fixes #647

@netlify
Copy link

netlify bot commented Nov 20, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 839f26e
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/691fa60f7668af00088e32e1
😎 Deploy Preview https://deploy-preview-709--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions
Copy link

github-actions bot commented Nov 20, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/helm/semantic-router/values.yaml
  • deploy/kubernetes/aibrix/semantic-router-values/values.yaml

📁 e2e

Owners: @Xunzhuo
Files changed:

  • e2e/profiles/ai-gateway/values.yaml
  • e2e/profiles/dynamic-config/values.yaml

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/classification/classifier.go
  • src/semantic-router/pkg/classification/classifier_test.go
  • src/semantic-router/pkg/extproc/extproc_test.go

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Switch PII classification from hardcoded ModernBERT to auto-detecting
Candle BERT classifier. The Rust layer already has built-in auto-detection
that checks for lora_config.json and routes to LoRA or Traditional models.

Changes:
1. Init: Use InitCandleBertTokenClassifier (has auto-detect built-in)
2. Inference: Use ClassifyCandleBertTokens (auto-routes to initialized classifier)

This enables LoRA PII models to work automatically without config changes,
providing higher confidence scores for PII entity detection.

Fixes vllm-project#647

Signed-off-by: Yossi Ovadia <[email protected]>
@yossiovadia
Copy link
Collaborator Author

@Xunzhuo , please review

@yossiovadia
Copy link
Collaborator Author

Test Results - Issue #647 Verification

I've tested this PR locally with the router running and can confirm dramatic improvements in PII detection:

Performance Comparison

Metric Before (ModernBERT) After (LoRA - This PR) Improvement
Success Rate 27% (10/37) 90% (9/10) +233%
Confidence Scores 0.37-0.75 (inconsistent) 0.866-0.999 More reliable
Threshold Passing Many below 0.7 All above 0.7 100% compliance

Critical Test Cases from Issue #647

All the problematic cases from issue #647 now work correctly:

SSN (123-45-6789): 0.648 → 0.999 confidence
Email ([email protected]): 0.637 → 0.968 confidence
Email with dots ([email protected]): 0.574 → 0.995 confidence
Phone numbers: All at 0.938-0.989 confidence
Person names: 0.995-0.999 confidence
Locations (GPE): 0.9+ confidence

Test script results: 9/10 tests passing (90% success rate), far exceeding the 73% improvement target from issue #647 .

Router Logs Confirmation

The router logs show excellent confidence scores from the LoRA model:

{"level":"info","msg":"Detected PII entity: B-PERSON ('john') at [0-4] with confidence 0.999"}
{"level":"info","msg":"Detected PII entity: I-PERSON ('doe') at [3-6] with confidence 0.995"}
{"level":"info","msg":"Detected PII entity: B-EMAIL_ADDRESS ('john') at [8-12] with confidence 0.968"}
{"level":"info","msg":"Detected PII entity: B-PHONE_NUMBER ('555') at [9-12] with confidence 0.938"}
{"level":"info","msg":"Detected PII entity: I-PHONE_NUMBER ('-') at [4-5] with confidence 0.866"}

All confidence scores are well above the 0.7 threshold, confirming the LoRA model is working as expected.


Follow-up Improvement Opportunity

While testing, I noticed that the /api/v1/classify/pii endpoint hardcodes confidence to 0.9 instead of returning the actual scores from the LoRA model.

Location: src/semantic-router/pkg/services/classification.go:313

for _, piiType := range piiTypes {
    entity := PIIEntity{
        Type:       piiType,
        Value:      "[DETECTED]",
        Confidence: 0.9,  // ❌ Hardcoded - should use actual confidence from model
    }
    response.Entities = append(response.Entities, entity)
}

Current behavior:

  • LoRA model returns precise scores (0.866-0.999) ✅
  • Router logs show real scores ✅
  • API response hardcodes 0.9 ⚠️

Suggested improvement (separate PR):
The ClassifyPII method should return entity-level details (including confidence scores), not just entity types. This would allow the API to expose the actual confidence scores instead of using a placeholder.

This is a pre-existing limitation and doesn't affect this PR's functionality - the underlying LoRA detection is working perfectly. Just flagging for future enhancement.


I'll proceed once this PR will be approved and merged.

@yossiovadia
Copy link
Collaborator Author

even more results :
esting the EXACT cases from the issue #647 table:

Test Case Text Before (ModernBERT) After (LoRA) Status
SSN My SSN is 123-45-6789 0.648 ❌ 0.992 FIXED
Email [email protected] 0.637 ❌ 0.794-0.977 FIXED
Email (dots) [email protected] 0.574 ❌ 0.961-0.987 FIXED
Credit Card 4111-1111-1111-1111 0.368 ❌ No detection STILL FAILS
Credit Card (sentence) Card number 4111-1111-1111-1111 0.474 ❌ 0.867 FIXED (misclassified as SSN)

Detailed Findings

✅ Test 1: SSN - My SSN is 123-45-6789

Before: 0.648 confidence (below 0.7 threshold) ❌
After: 0.992 confidence

Router logs:

Detected PII entity: B-US_SSN ('123') at [0-3] with confidence 0.992
Detected PII entity: B-US_SSN ('-') at [1-2] with confidence 0.833

Result:FIXED - Now detects correctly with very high confidence


✅ Test 2: Email - [email protected]

Before: 0.637 confidence (below 0.7 threshold) ❌
After: 0.794-0.977 confidence

Router logs:

Detected PII entity: B-EMAIL_ADDRESS ('john') at [0-4] with confidence 0.977

Result:FIXED - Now detects correctly above threshold


✅ Test 3: Email with dots - [email protected]

Before: 0.574 confidence (below 0.7 threshold) ❌
After: 0.961-0.987 confidence

Router logs:

Detected PII entity: B-EMAIL_ADDRESS ('john') at [0-4] with confidence 0.987
Detected PII entity: B-EMAIL_ADDRESS ('example') at [21-28] with confidence 0.855

Result:FIXED - Now detects correctly with high confidence


❌ Test 4: Credit Card - 4111-1111-1111-1111

Before: 0.368 confidence (below 0.7 threshold) ❌
After: No PII detected

API Response: {"has_pii": false}

Note: This test case still fails. Credit cards with dashes are not detected by the LoRA model. This is a training data limitation.


⚠️ Test 5: Credit Card (sentence) - Card number 4111-1111-1111-1111

Before: 0.474 confidence (below 0.7 threshold) ❌
After: 0.867 confidence ✅ (but misclassified)

Router logs:

Detected PII entity: B-US_SSN ('411') at [0-3] with confidence 0.867
Detected PII entity: I-PHONE_NUMBER ('##1') at [18-21] with confidence 0.725

API Response: Detected as SSN + PHONE_NUMBER (not CREDIT_CARD)

Result: ⚠️ PARTIALLY FIXED - PII is now detected (preventing data leakage), but misclassified as SSN instead of CREDIT_CARD


Additional Test: Credit Card without dashes

For comparison, I also tested credit cards without dashes:

✅ Credit Card - 4111111111111111 (no dashes)

Router logs:

Detected PII entity: B-CREDIT_CARD ('411') at [0-3] with confidence 0.989
Detected PII entity: B-CREDIT_CARD ('##11') at [4-8] with confidence 0.959
Detected PII entity: B-CREDIT_CARD ('##11') at [8-12] with confidence 0.935
Detected PII entity: B-CREDIT_CARD ('##11') at [12-16] with confidence 0.900
Detected PII entity: B-CREDIT_CARD ('##11') at [16-20] with confidence 0.793

Result:WORKS PERFECTLY - Credit cards without dashes are detected correctly with high confidence (0.793-0.989)


Summary

4 out of 5 exact test cases from issue #647 are now FIXED:

✅ SSN: 0.648 → 0.992 (FIXED)
✅ Email: 0.637 → 0.977 (FIXED)
✅ Email (dots): 0.574 → 0.987 (FIXED)
❌ Credit Card (dashes): 0.368 → not detected (STILL FAILS)
⚠️ Credit Card (sentence): 0.474 → 0.867 (FIXED but misclassified)

Overall improvement: 80% success rate on the exact test cases from issue #647

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PII Model Produces Insufficient Confidence Scores for Critical PII Types

4 participants