Update to version 3.0 #63

Open: thomasmeissnercrm wants to merge 18 commits into main from update_repo

thomasmeissnercrm (Collaborator) commented Mar 6, 2026

BlueCast 3.0 — Major Release

Summary

This is a comprehensive release that adds 7 new modules, fixes 15+ bugs, improves code quality across the entire codebase, and introduces 11 example scripts with synthetic data. The core library remains lightweight — all new heavy features (AI agents, model serving) are optional extras.

New Features

1. Unified Interface (BlueCastAuto)

A single entry point replaces the need to choose between 4 blueprint classes. Set class_problem and use_cross_validation — the right backend is selected automatically.

from bluecast.blueprints.unified import BlueCastAuto

automl = BlueCastAuto(class_problem="binary", use_cross_validation=True)
automl.fit(df, target_col="target")
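
The automatic backend selection described above can be pictured as a small dispatch on the two arguments. The following is only a sketch: the function name `select_blueprint` is hypothetical, and the mapping to the four existing blueprint class names (`BlueCast`, `BlueCastCV`, `BlueCastRegression`, `BlueCastCVRegression`) and the accepted `class_problem` values are assumptions, not the code shipped in this PR.

```python
def select_blueprint(class_problem: str, use_cross_validation: bool) -> str:
    """Return the name of the blueprint class a unified wrapper might delegate to.

    Hypothetical helper: the real BlueCastAuto may dispatch differently.
    """
    if class_problem in ("binary", "multiclass"):
        # Classification problems share one pair of blueprints.
        return "BlueCastCV" if use_cross_validation else "BlueCast"
    if class_problem == "regression":
        return "BlueCastCVRegression" if use_cross_validation else "BlueCastRegression"
    raise ValueError(f"Unsupported class_problem: {class_problem!r}")


print(select_blueprint("binary", use_cross_validation=True))
```

Raising on unknown values keeps a typo in `class_problem` from silently falling through to a default backend.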

2. Ensemble Strategies (bluecast/ensemble/)

CV pipelines now support three ensemble strategies via EnsembleConfig:

  • Mean blending (arithmetic, geometric, harmonic, median)
  • Stacking: Ridge meta-learner on rank-transformed OOF predictions
  • Hill climbing: greedy forward selection with configurable weight bounds, inspired by competition-winning approaches

from bluecast.ensemble.ensemble_config import EnsembleConfig

config = EnsembleConfig(ensemble_strategy="hill_climbing", hc_blending_method="rank")
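
To make the hill-climbing strategy concrete, here is a simplified pure-Python sketch of greedy forward selection on out-of-fold (OOF) predictions. It is not the BlueCast implementation: it blends raw predictions rather than ranks, ignores weight bounds, scores with mean squared error, and all function names are illustrative.

```python
from typing import Dict, List, Sequence, Tuple


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def blend(preds: Dict[str, Sequence[float]], counts: Dict[str, int]) -> List[float]:
    """Weighted average of model predictions; weights are selection counts."""
    total = sum(counts.values())
    n_rows = len(next(iter(preds.values())))
    return [sum(counts[m] * preds[m][i] for m in counts) / total for i in range(n_rows)]


def hill_climb(
    oof_preds: Dict[str, Sequence[float]], y_true: Sequence[float], max_steps: int = 20
) -> Tuple[Dict[str, float], float]:
    """Repeatedly add (with replacement) the model that most improves the blend."""
    counts = {name: 0 for name in oof_preds}
    best_score = float("inf")
    for _ in range(max_steps):
        best_candidate = None
        for name in oof_preds:
            trial = dict(counts)
            trial[name] += 1
            score = mse(y_true, blend(oof_preds, trial))
            if score < best_score - 1e-12:  # require a strict improvement
                best_score, best_candidate = score, name
        if best_candidate is None:
            break  # no single addition improves the blend any further
        counts[best_candidate] += 1
    total = sum(counts.values())
    return {name: c / total for name, c in counts.items()}, best_score


y = [0.0, 1.0, 2.0, 3.0]
oof = {
    "a": [0.5, 1.5, 2.5, 3.5],    # biased high
    "b": [-0.5, 0.5, 1.5, 2.5],   # biased low
    "c": [3.0, 2.0, 1.0, 0.0],    # anti-correlated, should get zero weight
}
weights, score = hill_climb(oof, y)
print(weights, score)
```

In this toy data the two oppositely biased models cancel out, so the search settles on equal weights for `a` and `b` and never selects `c`.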

3. Group-Conditional Conformal Prediction

Conformal prediction now supports per-group calibration. Different subgroups (e.g., product categories, regions) get uncertainty intervals tailored to their specific error distributions.

wrapper.calibrate(x_cal, y_cal, group_columns=["product_group"])
intervals = wrapper.predict_interval(x_test, alphas=[0.1])

Includes new evaluation functions: prediction_interval_coverage_by_group(), prediction_interval_spans_by_group(), conformal_fairness_check().
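
As an illustration of what per-group calibration means, the sketch below applies split conformal prediction with absolute residuals, computing one quantile per group instead of a single global one. It assumes a regression setting and symmetric intervals; the helper names are hypothetical and the library's internals may differ.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def conformal_quantile(scores: Sequence[float], alpha: float) -> float:
    """Finite-sample quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    rank = min(rank, n)  # clamp; strictly, too few samples would mean an unbounded interval
    return sorted(scores)[rank - 1]


def calibrate_by_group(
    y_true: Sequence[float], y_pred: Sequence[float], groups: Sequence[Hashable], alpha: float
) -> Dict[Hashable, float]:
    """Collect absolute residuals per group and take one quantile for each."""
    residuals = defaultdict(list)
    for t, p, g in zip(y_true, y_pred, groups):
        residuals[g].append(abs(t - p))
    return {g: conformal_quantile(scores, alpha) for g, scores in residuals.items()}


def predict_interval_by_group(
    y_pred: Sequence[float], groups: Sequence[Hashable], quantiles: Dict[Hashable, float]
) -> List[Tuple[float, float]]:
    """Widen each point prediction by the quantile of its own group."""
    return [(p - quantiles[g], p + quantiles[g]) for p, g in zip(y_pred, groups)]


# Group "B" has ten times larger calibration errors than group "A".
q = calibrate_by_group(
    y_true=[1, 2, 3, 10, 20, 30],
    y_pred=[0, 0, 0, 0, 0, 0],
    groups=["A", "A", "A", "B", "B", "B"],
    alpha=0.5,
)
intervals = predict_interval_by_group([100, 100], ["A", "B"], q)
print(q, intervals)
```

A single global quantile would over-cover the low-error group and under-cover the high-error one, which is the motivation for calibrating each group separately.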

4. Fairness Evaluation (bluecast/evaluation/fairness.py)

FairnessAuditor computes demographic parity, equalized odds, equal opportunity, predictive parity, and AUC parity for classification; MAE/RMSE ratios for regression. Integrates into fit_eval via TrainingConfig.fairness_sensitive_columns.

from bluecast.evaluation.fairness import FairnessAuditor

auditor = FairnessAuditor(sensitive_columns=["gender", "age_group"])
reports = auditor.audit_classification(y_true, y_pred, y_probs, df_eval)
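
For readers unfamiliar with the metrics, the following standalone sketch shows how two of them can be computed: demographic parity compares the rate of positive predictions across groups, and equal opportunity compares true positive rates. The function names are illustrative and are not part of `FairnessAuditor`.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def positive_rate_by_group(y_pred: Sequence[int], groups: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Share of positive predictions per group (basis of demographic parity)."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def true_positive_rate_by_group(
    y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Recall per group (basis of equal opportunity); groups without positives are skipped."""
    actual, hits = defaultdict(int), defaultdict(int)
    for true, pred, group in zip(y_true, y_pred, groups):
        if true == 1:
            actual[group] += 1
            hits[group] += int(pred == 1)
    return {g: hits[g] / actual[g] for g in actual}


def max_gap(rates: Dict[Hashable, float]) -> float:
    """Largest difference between any two groups; 0.0 means perfect parity."""
    return max(rates.values()) - min(rates.values())


groups = ["a"] * 4 + ["b"] * 4
y_true = [1, 1, 1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]

pos_rates = positive_rate_by_group(y_pred, groups)
tpr = true_positive_rate_by_group(y_true, y_pred, groups)
print(max_gap(pos_rates), max_gap(tpr))
```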

5. BlueCastAI — Multi-Agent LLM-Powered AutoML (bluecast/ai/, optional)

An optional module (pip install bluecast[ai]) that uses a multi-agent system to analyze data, engineer features, and build optimized pipelines — guided by natural language prompts. Supports Gemini, OpenAI, and Anthropic.

  • 7 specialized agents (Planner, DataAnalyst, FeatureEngineer, PipelineBuilder, Evaluator, Researcher, Reporter)
  • Smart data sampling for large datasets
  • Checkpoint save/resume for long runs
  • Structured I/O logging with timestamps and metadata
  • LLM-generated Markdown reports
from bluecast.ai import BlueCastAI

ai = BlueCastAI(api_key="...", provider="gemini")
result = ai.run(df, target_col="target", prompt="Build a precise model", mode="precise")
result.save_code("pipeline.py")
result.save_report("report.md")

6. Model Serving (bluecast/serve/, optional)

One-command API deployment (pip install bluecast[serve]). Auto-generates request schemas from the pipeline's column metadata.

from bluecast.serve import serve, export_api

serve(automl, port=8080)                        # local FastAPI server
export_api(automl, output_dir="./deployment")   # standalone app.py + Dockerfile
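
The idea of deriving a request schema from column metadata can be sketched with the standard library alone. This is an assumption-laden illustration, not the serving code in this PR: a FastAPI-based module would typically build pydantic models instead, and the dtype mapping and helper names below are hypothetical.

```python
from dataclasses import asdict, fields, make_dataclass
from typing import Any, Dict

# Hypothetical mapping from pandas dtype names to Python types.
DTYPE_TO_PYTHON = {"int64": int, "float64": float, "object": str}


def build_request_schema(column_dtypes: Dict[str, str], name: str = "PredictionRequest") -> type:
    """Create a dataclass with one typed field per training column.

    Column names must be valid Python identifiers for make_dataclass to accept them.
    """
    spec = [(col, DTYPE_TO_PYTHON.get(dtype, str)) for col, dtype in column_dtypes.items()]
    return make_dataclass(name, spec)


def parse_request(schema: type, payload: Dict[str, Any]) -> Any:
    """Check that all expected fields are present and coerce them to the schema types."""
    expected = {f.name: f.type for f in fields(schema)}
    missing = expected.keys() - payload.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return schema(**{col: typ(payload[col]) for col, typ in expected.items()})


schema = build_request_schema({"age": "int64", "income": "float64", "city": "object"})
row = parse_request(schema, {"age": "42", "income": 1000, "city": "Berlin"})
print(asdict(row))
```

Generating the schema from the fitted pipeline rather than writing it by hand keeps the API contract in sync with the columns the model was trained on.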

7. Improved Linear/Logistic Regression

  • LogisticRegressionModel now searches L1, L2, and ElasticNet with wider C range
  • New RegularizedRegressionModel auto-selects between Ridge, Lasso, ElasticNet
  • PreprocessingForLinearModels gains configurable scaler, imputation strategy, collinearity threshold, and optional polynomial features via LinearModelPreprocessingConfig
  • LinearRegressionModel.predict() return type fixed to match BaseClassMlModel

Bug Fixes

  • _apply_pandas_query_filter: replace("=", "==") was corrupting >=, <=, != operators. Fixed with regex.
  • detect_categorical_leakage: was mutating the caller's DataFrame in-place.
  • adversarial_validation: was mutating input DataFrames. Now copies first.
  • InFrequentCategoryEncoder.transform: was mutating input. Now copies.
  • XgboostModel.fit: params.pop("steps") was mutating config permanently. Now uses .get().
  • FeatureTypeDetector.fit_transform_feature_types: datetime casting used df instead of df_clean.
  • eval_metrics.py: error message referenced probas_best_class instead of probas_target_class.
  • plot_theil_u_heatmap: return type declared as go.Figure but actually returns Tuple[go.Figure, np.ndarray].
  • detect_leakage_via_correlation: docstring said "returns True/False" but returns a list.
  • RegressionEvalWrapper: docstring said "classification metrics".
  • eval_regressor: used print() instead of logging.info().
  • univariate_plots / plot_ecdf: crashed on columns with NaN values due to np.arange with NaN bounds.
  • shap_explanations: added compatibility guard for shap + xgboost >= 3.0 segfault, with fallback to KernelExplainer.
  • correlation_heatmap / correlation_to_target: added numeric_only=True to prevent pandas deprecation warnings.
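
Two of the fixes above follow patterns worth spelling out. The sketch below shows one way to replace only a lone "=" with a regex, and how reading a key with .get() leaves the caller's config untouched where .pop() would not. The helper names and the default of 1000 steps are illustrative assumptions, not the code in this PR.

```python
import re
from typing import Any, Dict, Tuple

# A lone "=" is one that is not preceded by <, >, ! or = and not followed by =.
_LONE_EQUALS = re.compile(r"(?<![<>!=])=(?!=)")


def normalize_query(query: str) -> str:
    """Turn "a = 1" into "a == 1" while leaving >=, <=, != and == alone."""
    return _LONE_EQUALS.sub("==", query)


def split_training_params(params: Dict[str, Any]) -> Tuple[Dict[str, Any], int]:
    """Separate "steps" from the booster parameters without mutating `params`."""
    steps = params.get("steps", 1000)  # hypothetical default
    booster_params = {k: v for k, v in params.items() if k != "steps"}
    return booster_params, steps


# The naive str.replace corrupts compound operators:
print("b >= 2".replace("=", "=="))        # b >== 2
print(normalize_query("a = 1 and b >= 2"))  # a == 1 and b >= 2

config = {"steps": 50, "eta": 0.1}
booster_params, steps = split_training_params(config)
print(config)  # still contains "steps", so a second fit() sees the same config
```

Note that this regex does not distinguish operators from "=" characters inside quoted string literals in a query; handling that would need a proper tokenizer.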

EDA Improvements

  • 5 functions (univariate_plots, bi_variate_plots, plot_count_pairs, plot_classification_target_distribution_within_categories, plot_error_distributions) now return figures and accept a show=True parameter instead of always calling fig.show()
  • find_bind_with_with_freedman_diaconis renamed to find_bin_width_with_freedman_diaconis (old name kept as deprecated alias)
  • conditional_entropy and theil_u — added type hints and docstrings
  • plot_tsne — added NaN handling (was crashing), type hints for perplexity/random_state
  • plot_benfords_law — now reports chi-square p-value when scipy is available
  • Cache decorator now uses functools.wraps to preserve function metadata
  • Dashboard code deduplicated (_create_benford_plot_classification and _create_category_frequency_plot_classification now delegate to shared implementations)
  • Created bluecast/eda/__init__.py with curated exports for all 27 public functions
  • Replaced print() with warnings.warn() for optional import failures
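
The rename-with-deprecated-alias and functools.wraps items above can be combined in one small pattern. This sketch assumes the Freedman-Diaconis rule in its usual form (bin width = 2 * IQR / n^(1/3)) and uses the standard library's quantile method, so results may differ slightly from a NumPy-based implementation; `deprecated_alias` is a hypothetical helper.

```python
import functools
import statistics
import warnings
from typing import Callable, Sequence


def find_bin_width_with_freedman_diaconis(values: Sequence[float]) -> float:
    """Freedman-Diaconis rule: 2 * IQR / cube root of the sample size."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return 2 * (q3 - q1) / len(values) ** (1 / 3)


def deprecated_alias(new_func: Callable, old_name: str) -> Callable:
    """Wrap `new_func` so that calls through the old name emit a DeprecationWarning."""

    @functools.wraps(new_func)  # keeps __name__ and __doc__ of the new function
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)

    return wrapper


# The misspelled name stays importable but warns on use.
find_bind_with_with_freedman_diaconis = deprecated_alias(
    find_bin_width_with_freedman_diaconis, "find_bind_with_with_freedman_diaconis"
)

print(find_bin_width_with_freedman_diaconis([1, 2, 3, 4, 5, 6, 7, 8]))
```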

Code Quality

  • Added bluecast/__init__.py with __all__ exports for clean public API
  • Added TrainingConfig.fairness_sensitive_columns with docstring
  • All pre-commit hooks pass: black, mypy, isort, flake8, markdownlint

Examples

Replaced 10 outdated Jupyter notebooks (requiring external Kaggle datasets) with 11 self-contained Python scripts using synthetic data:

| Script | Topics |
|---|---|
| 00_full_showcase.py | End-to-end walkthrough of all features |
| 01_quick_start.py | Binary, multiclass, regression, fit_eval |
| 02_cross_validation_and_ensembles.py | Mean blending, stacking, hill climbing |
| 03_conformal_prediction.py | Global + group-conditional uncertainty |
| 04_linear_models.py | Logistic/Ridge/Lasso, preprocessing config |
| 05_unified_interface.py | BlueCastAuto for all problem types |
| 06_advanced_customization.py | Custom preprocessing, XGBoost, drift monitoring, save/load |
| 07_fairness.py | Demographic parity, equalized odds, conformal fairness |
| 08_eda.py | Univariate/bivariate plots, PCA, t-SNE, data quality |
| 09_bluecast_ai.py | Multi-agent LLM-powered AutoML |
| 10_serving.py | Deploy models as REST APIs |

Tests

Added 105+ new tests across 7 test files:

  • test_fairness.py — 16 tests for FairnessAuditor
  • test_ensemble.py — 20 tests for hill climbing, stacking, mean blending
  • test_unified.py — 12 tests for BlueCastAuto
  • test_conformal_group.py — 11 tests for group-conditional conformal prediction
  • test_ai_module.py — 30 tests for AI config, context, tools, result
  • test_serve.py — 16 tests for schemas, app factory, exporter
  • Updated test_custom_model_recipes.py and test_preprocessing_recipes.py for new APIs

Dependencies

No new mandatory dependencies. All new features use optional extras:

[tool.poetry.extras]
ai = ["google-generativeai", "openai", "anthropic"]
serve = ["fastapi", "uvicorn"]
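
One common way to keep such extras truly optional is to import them lazily and fail with an actionable message. The helper below is a hypothetical sketch of that pattern, not code from this PR.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency or explain which extra provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install bluecast[{extra}]"
        ) from exc


# Succeeds for anything installed; "json" stands in for a real optional package here.
json_module = require_extra("json", "serve")
print(json_module.__name__)
```

Calling the helper inside the function that needs the dependency, rather than at module import time, keeps `import bluecast` working when the extras are not installed.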

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 69.27124% with 1168 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.04%. Comparing base (b9f0e0a) to head (e58aace).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| bluecast/ai/orchestrator.py | 0.00% | 241 Missing ⚠️ |
| bluecast/ai/agents/base.py | 0.00% | 66 Missing ⚠️ |
| bluecast/ai/providers/gemini.py | 0.00% | 65 Missing ⚠️ |
| bluecast/ai/agents/pipeline_builder.py | 0.00% | 64 Missing ⚠️ |
| bluecast/ai/tools.py | 57.14% | 60 Missing ⚠️ |
| bluecast/serve/app.py | 32.58% | 60 Missing ⚠️ |
| bluecast/ai/providers/anthropic_provider.py | 0.00% | 52 Missing ⚠️ |
| bluecast/evaluation/fairness.py | 77.72% | 51 Missing ⚠️ |
| bluecast/ai/providers/openai_provider.py | 0.00% | 47 Missing ⚠️ |
| bluecast/ai/agents/reporter.py | 0.00% | 42 Missing ⚠️ |

... and 30 more

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
- Coverage   89.19%   84.04%   -5.16%     
==========================================
  Files         106      149      +43     
  Lines       11152    14237    +3085     
==========================================
+ Hits         9947    11965    +2018     
- Misses       1205     2272    +1067     
| Flag | Coverage Δ |
|---|---|
| pytest | 84.04% <69.27%> (-5.16%) ⬇️ |

