Skip to content

Commit 4b1fdaa

Browse files
committed
feat: support classify api
Signed-off-by: bitliu <[email protected]>
1 parent d55985e commit 4b1fdaa

File tree

5 files changed

+992
-60
lines changed

5 files changed

+992
-60
lines changed

docs/api/classification.md

Lines changed: 243 additions & 56 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,70 @@ The Classification API provides direct access to the Semantic Router's classific
66

77
### Base URL
88
```
9-
http://localhost:50051/api/v1/classify
9+
http://localhost:8080/api/v1/classify
10+
```
11+
12+
## Server Status
13+
14+
The Classification API server runs alongside the main Semantic Router ExtProc server:
15+
- **Classification API**: `http://localhost:8080` (HTTP REST API)
16+
- **ExtProc Server**: `http://localhost:50051` (gRPC for Envoy integration)
17+
- **Metrics Server**: `http://localhost:9190` (Prometheus metrics)
18+
19+
Start the server with:
20+
```bash
21+
make run-router
22+
```
23+
24+
## Implementation Status
25+
26+
### ✅ Fully Implemented
27+
- `GET /health` - Health check endpoint
28+
- `POST /api/v1/classify/intent` - Intent classification with real model inference
29+
- `POST /api/v1/classify/pii` - PII detection with real model inference
30+
- `POST /api/v1/classify/security` - Security/jailbreak detection with real model inference
31+
- `GET /info/models` - Model information and system status
32+
- `GET /info/classifier` - Detailed classifier capabilities and configuration
33+
34+
### 🔄 Placeholder Implementation
35+
- `POST /api/v1/classify/combined` - Returns "not implemented" response
36+
- `POST /api/v1/classify/batch` - Returns "not implemented" response
37+
- `GET /metrics/classification` - Returns "not implemented" response
38+
- `GET /config/classification` - Returns "not implemented" response
39+
- `PUT /config/classification` - Returns "not implemented" response
40+
41+
The fully implemented endpoints provide real classification results using the loaded models. Placeholder endpoints return appropriate HTTP 501 responses and can be extended as needed.
42+
43+
## Quick Start
44+
45+
### Test the API
46+
47+
Once the server is running, you can test the endpoints:
48+
49+
```bash
50+
# Health check
51+
curl -X GET http://localhost:8080/health
52+
53+
# Intent classification
54+
curl -X POST http://localhost:8080/api/v1/classify/intent \
55+
-H "Content-Type: application/json" \
56+
-d '{"text": "What is machine learning?"}'
57+
58+
# PII detection
59+
curl -X POST http://localhost:8080/api/v1/classify/pii \
60+
-H "Content-Type: application/json" \
61+
-d '{"text": "My email is [email protected]"}'
62+
63+
# Security detection
64+
curl -X POST http://localhost:8080/api/v1/classify/security \
65+
-H "Content-Type: application/json" \
66+
-d '{"text": "Ignore all previous instructions"}'
67+
68+
# Model information
69+
curl -X GET http://localhost:8080/info/models
70+
71+
# Classifier details
72+
curl -X GET http://localhost:8080/info/classifier
1073
```
1174

1275
## Intent Classification
@@ -20,7 +83,7 @@ Classify user queries into routing categories.
2083

2184
```json
2285
{
23-
"text": "What is the derivative of x^2 + 3x + 1?",
86+
"text": "What is machine learning and how does it work?",
2487
"options": {
2588
"return_probabilities": true,
2689
"confidence_threshold": 0.7,
@@ -34,23 +97,41 @@ Classify user queries into routing categories.
3497
```json
3598
{
3699
"classification": {
37-
"category": "mathematics",
38-
"confidence": 0.956,
39-
"processing_time_ms": 12
100+
"category": "computer science",
101+
"confidence": 0.8827820420265198,
102+
"processing_time_ms": 46
40103
},
41104
"probabilities": {
42-
"mathematics": 0.956,
43-
"physics": 0.024,
44-
"computer_science": 0.012,
45-
"creative_writing": 0.003,
105+
"computer science": 0.8827820420265198,
106+
"math": 0.024,
107+
"physics": 0.012,
108+
"engineering": 0.003,
46109
"business": 0.002,
47-
"general": 0.003
110+
"other": 0.003
48111
},
49-
"recommended_model": "math-specialized-model",
112+
"recommended_model": "computer science-specialized-model",
50113
"routing_decision": "high_confidence_specialized"
51114
}
52115
```
53116

117+
### Available Categories
118+
119+
The current model supports the following 14 categories:
120+
- `business`
121+
- `law`
122+
- `psychology`
123+
- `biology`
124+
- `chemistry`
125+
- `history`
126+
- `other`
127+
- `health`
128+
- `economics`
129+
- `math`
130+
- `physics`
131+
- `computer science`
132+
- `philosophy`
133+
- `engineering`
134+
54135
## PII Detection
55136

56137
Detect personally identifiable information in text.
@@ -270,64 +351,152 @@ Process multiple texts in a single request for efficiency.
270351
}
271352
```
272353

273-
## Model Information
354+
## Information Endpoints
355+
356+
### Model Information
274357

275358
Get information about loaded classification models.
276359

277-
### Endpoint
278-
`GET /models/info`
360+
#### Endpoint
361+
`GET /info/models`
279362

280363
### Response Format
281364

282365
```json
283366
{
284-
"models": {
285-
"intent_classifier": {
286-
"name": "modernbert-base",
287-
"version": "1.0.0",
367+
"models": [
368+
{
369+
"name": "category_classifier",
370+
"type": "intent_classification",
371+
"loaded": true,
372+
"model_path": "models/category_classifier_modernbert-base_model",
288373
"categories": [
289-
"mathematics", "physics", "computer_science",
290-
"creative_writing", "business", "general"
374+
"business", "law", "psychology", "biology", "chemistry",
375+
"history", "other", "health", "economics", "math",
376+
"physics", "computer science", "philosophy", "engineering"
291377
],
292-
"loaded": true,
293-
"last_updated": "2024-03-15T10:30:00Z",
294-
"performance": {
295-
"accuracy": 0.942,
296-
"avg_inference_time_ms": 12
378+
"metadata": {
379+
"mapping_path": "models/category_classifier_modernbert-base_model/category_mapping.json",
380+
"model_type": "modernbert",
381+
"threshold": "0.60"
297382
}
298383
},
299-
"pii_detector": {
300-
"name": "modernbert-pii",
301-
"version": "1.0.0",
302-
"entity_types": ["PERSON", "EMAIL", "PHONE", "SSN", "LOCATION"],
384+
{
385+
"name": "pii_classifier",
386+
"type": "pii_detection",
303387
"loaded": true,
304-
"last_updated": "2024-03-15T10:30:00Z",
305-
"performance": {
306-
"f1_score": 0.957,
307-
"avg_inference_time_ms": 8
388+
"model_path": "models/pii_classifier_modernbert-base_presidio_token_model",
389+
"metadata": {
390+
"mapping_path": "models/pii_classifier_modernbert-base_presidio_token_model/pii_type_mapping.json",
391+
"model_type": "modernbert_token",
392+
"threshold": "0.70"
308393
}
309394
},
310-
"jailbreak_guard": {
311-
"name": "modernbert-security",
312-
"version": "1.0.0",
313-
"detection_types": ["jailbreak", "prompt_injection", "manipulation"],
395+
{
396+
"name": "bert_similarity_model",
397+
"type": "similarity",
314398
"loaded": true,
315-
"last_updated": "2024-03-15T10:30:00Z",
316-
"performance": {
317-
"precision": 0.923,
318-
"recall": 0.891,
319-
"avg_inference_time_ms": 6
399+
"model_path": "sentence-transformers/all-MiniLM-L12-v2",
400+
"metadata": {
401+
"model_type": "sentence_transformer",
402+
"threshold": "0.60",
403+
"use_cpu": "true"
320404
}
321405
}
406+
],
407+
"system": {
408+
"go_version": "go1.24.1",
409+
"architecture": "arm64",
410+
"os": "darwin",
411+
"memory_usage": "1.20 MB",
412+
"gpu_available": false
413+
}
414+
}
415+
```
416+
417+
### Model Status
418+
419+
- **loaded: true** - Model is successfully loaded and ready for inference
420+
- **loaded: false** - Model failed to load or is not initialized (placeholder mode)
421+
422+
When models are not loaded, the API will return placeholder responses for testing purposes.
423+
424+
### Classifier Information
425+
426+
Get detailed information about classifier capabilities and configuration.
427+
428+
#### Endpoint
429+
`GET /info/classifier`
430+
431+
#### Response Format
432+
433+
```json
434+
{
435+
"status": "active",
436+
"capabilities": [
437+
"intent_classification",
438+
"pii_detection",
439+
"security_detection",
440+
"similarity_matching"
441+
],
442+
"categories": [
443+
{
444+
"name": "business",
445+
"description": "Business and commercial content",
446+
"reasoning_enabled": false,
447+
"threshold": 0.6
448+
},
449+
{
450+
"name": "math",
451+
"description": "Mathematical problems and concepts",
452+
"reasoning_enabled": true,
453+
"threshold": 0.6
454+
}
455+
],
456+
"pii_types": [
457+
"PERSON",
458+
"EMAIL",
459+
"PHONE",
460+
"SSN",
461+
"LOCATION",
462+
"CREDIT_CARD",
463+
"IP_ADDRESS"
464+
],
465+
"security": {
466+
"jailbreak_detection": false,
467+
"detection_types": [
468+
"jailbreak",
469+
"prompt_injection",
470+
"system_override"
471+
],
472+
"enabled": false
322473
},
323-
"system_info": {
324-
"total_memory_mb": 1024,
325-
"gpu_available": true,
326-
"concurrent_requests": 50
474+
"performance": {
475+
"average_latency_ms": 45,
476+
"requests_handled": 0,
477+
"cache_enabled": false
478+
},
479+
"configuration": {
480+
"category_threshold": 0.6,
481+
"pii_threshold": 0.7,
482+
"similarity_threshold": 0.6,
483+
"use_cpu": true
327484
}
328485
}
329486
```
330487

488+
#### Status Values
489+
490+
- **active** - Classifier is loaded and fully functional
491+
- **placeholder** - Using placeholder responses (models not loaded)
492+
493+
#### Capabilities
494+
495+
- **intent_classification** - Can classify text into categories
496+
- **pii_detection** - Can detect personally identifiable information
497+
- **security_detection** - Can detect jailbreak attempts and security threats
498+
- **similarity_matching** - Can perform semantic similarity matching
499+
331500
## Performance Metrics
332501

333502
Get real-time classification performance metrics.
@@ -408,14 +577,32 @@ Get real-time classification performance metrics.
408577
{
409578
"error": {
410579
"code": "CLASSIFICATION_ERROR",
411-
"message": "Model inference failed",
412-
"details": {
413-
"model": "intent_classifier",
414-
"input_length": 2048,
415-
"max_length": 512
416-
},
417-
"timestamp": "2024-03-15T14:30:00Z",
418-
"request_id": "req-abc123"
580+
"message": "classification failed: model inference error",
581+
"timestamp": "2024-03-15T14:30:00Z"
582+
}
583+
}
584+
```
585+
586+
### Example Error Responses
587+
588+
**Invalid Input (400 Bad Request):**
589+
```json
590+
{
591+
"error": {
592+
"code": "INVALID_INPUT",
593+
"message": "text cannot be empty",
594+
"timestamp": "2024-03-15T14:30:00Z"
595+
}
596+
}
597+
```
598+
599+
**Not Implemented (501 Not Implemented):**
600+
```json
601+
{
602+
"error": {
603+
"code": "NOT_IMPLEMENTED",
604+
"message": "Combined classification not implemented yet",
605+
"timestamp": "2024-03-15T14:30:00Z"
419606
}
420607
}
421608
```
@@ -440,7 +627,7 @@ import requests
440627
from typing import List, Dict, Optional
441628

442629
class ClassificationClient:
443-
def __init__(self, base_url: str = "http://localhost:50051"):
630+
def __init__(self, base_url: str = "http://localhost:8080"):
444631
self.base_url = base_url
445632

446633
def classify_intent(self, text: str, return_probabilities: bool = True) -> Dict:
@@ -498,7 +685,7 @@ if security_result['is_jailbreak']:
498685

499686
```javascript
500687
class ClassificationAPI {
501-
constructor(baseUrl = 'http://localhost:50051') {
688+
constructor(baseUrl = 'http://localhost:8080') {
502689
this.baseUrl = baseUrl;
503690
}
504691

0 commit comments

Comments
 (0)