Commit dc289c4
Add 5/5 AI/ML infrastructure: lakehouse connector, model registry, and A/B testing
Lakehouse Connector (lakehouse_connector.py):
- Connect ML training to real lakehouse data
- Query lakehouse for training datasets
- Generate fraud detection, risk scoring, and churn prediction datasets
- Support for both real lakehouse data and synthetic fallback
- Feature extraction from transaction, user, and risk data

Model Registry (model_registry.py):
- MLflow-compatible model versioning and experiment tracking
- Model lifecycle management (development -> staging -> production -> archived)
- Experiment tracking with metrics and parameters
- Model comparison and promotion
- Artifact storage and retrieval
- Local file-based registry with MLflow integration option

A/B Testing Infrastructure (ab_testing.py):
- Traffic splitting between model versions (random, hash-based, gradual rollout, multi-armed bandit)
- Statistical significance testing (t-test, chi-squared, effect size)
- Experiment lifecycle management (draft -> running -> paused -> completed)
- Real-time metrics collection per variant
- Automatic winner selection based on primary metric
- Gradual rollout support for safe deployments

ML Service Integration (main.py):
- /registry/* endpoints for model versioning
- /ab-test/* endpoints for A/B testing experiments
- /lakehouse/* endpoints for training data generation
- /train/from-lakehouse endpoint for training from real data
- Full integration with model registry and A/B testing

This completes the path to 5/5 AI/ML production readiness:
- Real training data from lakehouse (not just synthetic)
- MLflow-compatible model registry for versioning
- A/B testing for safe model deployments

Co-Authored-By: Patrick Munis <[email protected]>
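
For readers unfamiliar with the hash-based traffic splitting that the commit message lists for ab_testing.py, here is a minimal Python sketch of the general technique. It is not taken from this commit: the function name `assign_variant`, its parameters, and the 10,000-bucket resolution are all assumptions chosen for illustration. The idea is that hashing the experiment and user identifiers together gives each user a stable position in [0, 1), which is then mapped onto the cumulative variant weights.

```python
import hashlib


def assign_variant(user_id: str, experiment_id: str, weights: dict[str, float]) -> str:
    """Deterministically map a user to a variant in proportion to `weights`.

    Hypothetical helper for illustration only; not the API of ab_testing.py.
    Weights are assumed to be non-negative.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Hash experiment and user together so the same user can land in
    # different variants across different experiments.
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()

    # Reduce the first 8 bytes to one of 10,000 buckets, giving a point in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    point = bucket / 10_000

    # Walk the cumulative normalized weights until the point falls inside one.
    cumulative = 0.0
    for name, weight in weights.items():
        cumulative += weight / total
        if point < cumulative:
            return name
    return name  # guards against floating-point rounding on the last variant


if __name__ == "__main__":
    split = {"model_v1": 0.9, "model_v2": 0.1}
    print(assign_variant("user-123", "fraud-model-test", split))
```

Because the assignment depends only on the identifiers and the weights, a given user keeps seeing the same model version across requests without any stored state, which is the usual reason to prefer this over purely random splitting.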
1 parent 09c7ddb commit dc289c4

4 files changed: +2786 -0 lines changed