Commit dc289c4
Add 5/5 AI/ML infrastructure: lakehouse connector, model registry, and A/B testing
Lakehouse Connector (lakehouse_connector.py):
- Connect ML training to real lakehouse data
- Query lakehouse for training datasets
- Generate fraud detection, risk scoring, and churn prediction datasets
- Support for both real lakehouse data and synthetic fallback
- Feature extraction from transaction, user, and risk data
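The real-data-with-synthetic-fallback behaviour listed above could look roughly like the sketch below. It is not taken from lakehouse_connector.py: the names `load_training_rows`, `synthetic_fraud_rows`, and `query_lakehouse`, the row schema, and the toy labeling rule are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, float]


def synthetic_fraud_rows(n: int, seed: int = 0) -> List[Row]:
    """Generate n toy transactions; the labeling rule is purely illustrative."""
    rng = random.Random(seed)
    rows: List[Row] = []
    for _ in range(n):
        amount = round(rng.uniform(1.0, 5000.0), 2)
        rows.append({"amount": amount, "is_fraud": 1.0 if amount > 4500.0 else 0.0})
    return rows


def load_training_rows(
    query_lakehouse: Optional[Callable[[], List[Row]]],
    n_fallback: int = 100,
) -> Tuple[List[Row], str]:
    """Return (rows, source), preferring real data and falling back to synthetic."""
    if query_lakehouse is not None:
        try:
            rows = query_lakehouse()
            if rows:
                return rows, "lakehouse"
        except Exception:
            # Any query failure triggers the fallback in this sketch.
            pass
    return synthetic_fraud_rows(n_fallback), "synthetic"
```

Returning the source label alongside the rows lets a caller record whether a model was trained on real or synthetic data.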
Model Registry (model_registry.py):
- MLflow-compatible model versioning and experiment tracking
- Model lifecycle management (development -> staging -> production -> archived)
- Experiment tracking with metrics and parameters
- Model comparison and promotion
- Artifact storage and retrieval
- Local file-based registry with MLflow integration option
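The lifecycle above (development -> staging -> production -> archived) can be modelled as a small state machine. The sketch below is an assumption about how such a check might work, not the code in model_registry.py: `Stage`, `NEXT_STAGE`, and `promote` are hypothetical names, and it assumes strictly linear promotion with no skipping or rollback.

```python
from enum import Enum
from typing import Dict, Optional


class Stage(str, Enum):
    DEVELOPMENT = "development"
    STAGING = "staging"
    PRODUCTION = "production"
    ARCHIVED = "archived"


# Assumption: each stage may only advance to the next one in the chain.
NEXT_STAGE: Dict[Stage, Optional[Stage]] = {
    Stage.DEVELOPMENT: Stage.STAGING,
    Stage.STAGING: Stage.PRODUCTION,
    Stage.PRODUCTION: Stage.ARCHIVED,
    Stage.ARCHIVED: None,
}


def promote(current: Stage) -> Stage:
    """Return the stage that follows `current`, or raise if it is terminal."""
    nxt = NEXT_STAGE[current]
    if nxt is None:
        raise ValueError(f"{current.value} is terminal and cannot be promoted")
    return nxt
```

A real registry may also allow archiving from any stage or demoting a production model; the commit message does not say either way.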
A/B Testing Infrastructure (ab_testing.py):
- Traffic splitting between model versions (random, hash-based, gradual rollout, multi-armed bandit)
- Statistical significance testing (t-test, chi-squared, effect size)
- Experiment lifecycle management (draft -> running -> paused -> completed)
- Real-time metrics collection per variant
- Automatic winner selection based on primary metric
- Gradual rollout support for safe deployments
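Two of the pieces listed above, hash-based traffic splitting and chi-squared significance testing, can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the implementation in ab_testing.py: `assign_variant` and `chi_squared_2x2` are hypothetical names, the split is limited to two variants, and the test is the plain 2x2 chi-squared statistic without continuity correction.

```python
import hashlib
import math
from typing import Tuple


def assign_variant(experiment_id: str, user_id: str, treatment_fraction: float) -> str:
    """Deterministically map a user to 'treatment' or 'control'."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < treatment_fraction else "control"


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Chi-squared statistic and p-value for the table [[a, b], [c, d]].

    Rows are variants; columns are (converted, not converted).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Because each user's bucket is fixed by the hash, raising `treatment_fraction` during a gradual rollout only moves users from control to treatment and never the other way.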
ML Service Integration (main.py):
- /registry/* endpoints for model versioning
- /ab-test/* endpoints for A/B testing experiments
- /lakehouse/* endpoints for training data generation
- /train/from-lakehouse endpoint for training from real data
- Full integration with model registry and A/B testing
This completes the path to 5/5 AI/ML production readiness:
- Real training data from lakehouse (not just synthetic)
- MLflow-compatible model registry for versioning
- A/B testing for safe model deployments
Co-Authored-By: Patrick Munis <[email protected]>
1 parent 09c7ddb commit dc289c4
4 files changed (+2786, -0) in core-services/ml-service