feat: Add xgb detectors #42
Reviewer's Guide
This PR introduces a full XGBoost-based spam detection feature. It comprises a training pipeline (data loading, text preprocessing, TF-IDF feature extraction, hyperparameter search, and model artifact serialization), a runtime detector class with batching and CUDA support, FastAPI-based service integration with Prometheus instrumentation, end-to-end integration tests, and supporting documentation and Docker setup.
Sequence diagram for FastAPI XGB detector request handling
sequenceDiagram
actor User
participant FastAPI as FastAPI app
participant Detector as Detector
participant Model as XGB Model
User->>FastAPI: POST /api/v1/text/contents
FastAPI->>Detector: run(request)
Detector->>Model: vectorizer.transform(text)
Detector->>Model: model.predict(vectorized_text)
Detector-->>FastAPI: ContentAnalysisResponse
FastAPI-->>User: ContentsAnalysisResponse
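The request flow in the diagram above (vectorize the text, predict, wrap the result) can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `KeywordVectorizer`, `ThresholdModel`, and `Detection` are hypothetical stand-ins for the fitted TF-IDF vectorizer, the trained XGBoost model, and the response schema, none of which are shown on this page. It keeps only the `transform` then `predict` call order and the `batch_size` attribute from the class diagram.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Detection:
    """Hypothetical per-text result; the PR's real response schema is not shown."""
    text: str
    is_spam: bool


class KeywordVectorizer:
    """Toy stand-in for the fitted TF-IDF vectorizer: counts keyword hits."""

    def __init__(self, keywords: Sequence[str]) -> None:
        self.keywords = list(keywords)

    def transform(self, texts: Sequence[str]) -> List[List[int]]:
        return [[t.lower().split().count(k) for k in self.keywords] for t in texts]


class ThresholdModel:
    """Toy stand-in for the trained XGBoost model: flags any keyword hit."""

    def predict(self, features: List[List[int]]) -> List[int]:
        return [1 if sum(row) > 0 else 0 for row in features]


class Detector:
    """Assumed shape of the runtime detector: vectorize, predict, in batches."""

    def __init__(self, vectorizer, model, batch_size: int = 2) -> None:
        self.vectorizer = vectorizer
        self.model = model
        self.batch_size = batch_size

    def run(self, contents: Sequence[str]) -> List[Detection]:
        results: List[Detection] = []
        # Process the input in fixed-size slices so memory use stays bounded.
        for start in range(0, len(contents), self.batch_size):
            batch = contents[start:start + self.batch_size]
            features = self.vectorizer.transform(batch)
            labels = self.model.predict(features)
            results.extend(
                Detection(text, bool(label)) for text, label in zip(batch, labels)
            )
        return results


detector = Detector(KeywordVectorizer(["free", "winner"]), ThresholdModel())
detections = detector.run(
    ["free prize inside", "see you at lunch", "you are a winner"]
)
```

Passing the vectorizer and model into the constructor is a choice made for this sketch so the flow can be exercised without trained artifacts. The class diagram suggests the real `Detector` loads them itself in `__init__()`.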
Class diagram for new XGB detector components
classDiagram
class Detector {
- model
- vectorizer
- cuda_device
- batch_size
+ __init__()
+ run(request: ContentAnalysisHttpRequest) ContentAnalysisResponse
}
Detector --> ContentAnalysisHttpRequest
Detector --> ContentAnalysisResponse
class ContentAnalysisHttpRequest
class ContentAnalysisResponse
class DetectorBaseAPI
class FastAPI
DetectorBaseAPI <|-- FastAPI
class DetectorRegistry
Detector ..> DetectorRegistry : uses
Detector ..> logger : logs
Detector ..> torch : uses
Detector ..> xgb : uses
Detector ..> pickle : loads model
Detector ..> TfidfVectorizer : uses
Detector ..> GridSearchCV : uses
Detector ..> PorterStemmer : uses
Detector ..> stopwords : uses
Detector ..> pd : uses
Detector ..> load_dataset : uses
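The diagram shows the detector loading its model through `pickle`. A minimal sketch of that save and load round trip might look like the following, assuming the vectorizer and model are stored together in one file. The file name, the dictionary layout, and the helper names `save_artifacts` and `load_artifacts` are all hypothetical, and plain built-in types stand in for the fitted objects so the example runs without scikit-learn or XGBoost.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical artifact contents: built-in types stand in for the fitted
# TfidfVectorizer and the trained XGBoost model.
artifacts = {
    "vectorizer": {"vocabulary": {"free": 0, "winner": 1}},
    "model": {"weights": [1.5, 2.0], "threshold": 1.0},
}


def save_artifacts(path: Path, payload: dict) -> None:
    """Serialize the training outputs to a single pickle file."""
    with path.open("wb") as fh:
        pickle.dump(payload, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_artifacts(path: Path) -> dict:
    """Load artifacts at service start-up.

    Only unpickle files from a trusted source: loading a pickle can
    execute arbitrary code.
    """
    with path.open("rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    artifact_path = Path(tmp) / "spam_detector.pkl"  # hypothetical file name
    save_artifacts(artifact_path, artifacts)
    restored = load_artifacts(artifact_path)
```

Keeping both objects in one file is a choice made for this sketch. It ensures the vectorizer's vocabulary always matches the model it was trained with. The PR may store them separately.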
File-Level Changes
Summary by Sourcery
Add XGBoost-based SMS spam detector with end-to-end training, inference API, and containerization
New Features:
Build:
Documentation:
Tests: