Open
Description
Add a test case to test_autoai_output_consumption.py covering the following scenario:
1. Read an output AutoAI pipeline.
2. Use DisparateImpactRemover on the preprocessing prefix and perform refinement with a choice of classifiers.
3. Use Hyperopt to choose the best model with the pre-estimator mitigation of step 2.
Here is some code for using the pipeline generated for the German credit dataset:
fairness_info = {
    "protected_attributes": [
        {"feature": "Sex", "reference_group": ['male'], "monitored_group": ['female']},
        {"feature": "Age", "reference_group": [[20,40], [60,90]], "monitored_group": [[41, 59]]}
    ],
    "favorable_labels": ["No Risk"],
    "unfavorable_labels": ["Risk"],
}
# best_pipeline is the output AutoAI pipeline from step 1.
# Drop its final estimator and freeze the hyperparameters of the remaining preprocessing prefix.
prefix = best_pipeline.remove_last().freeze_trainable()

from sklearn.linear_model import LogisticRegression as LR
from sklearn.ensemble import RandomForestClassifier as RF
from lale.operator_wrapper import wrap_imported_operators
from lale.lib.aif360 import DisparateImpactRemover
# Wrap the imported scikit-learn classes as Lale operators so they compose with >> and |.
wrap_imported_operators()

# Pre-estimator mitigation, followed by a choice between the two classifiers.
di_remover = DisparateImpactRemover(**fairness_info, preparation=prefix, redact=True)
planned_fairer = di_remover >> (LR | RF)

from lale.lib.aif360 import accuracy_and_disparate_impact
from lale.lib.aif360 import FairStratifiedKFold
combined_scorer = accuracy_and_disparate_impact(**fairness_info)
fair_cv = FairStratifiedKFold(**fairness_info, n_splits=3)

from lale.lib.lale import Hyperopt
import pandas as pd
df = pd.read_csv("german_credit_data_biased_training.csv")
# Labels come from the last column; the Risk column is dropped from the features.
y = df.iloc[:, -1]
X = df.drop(columns=['Risk'])
trained_fairer = planned_fairer.auto_configure(
    X, y, optimizer=Hyperopt, cv=fair_cv, verbose=True,
    max_evals=1, scoring=combined_scorer, best_score=1.0)
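For anyone writing assertions for the new test, here is a rough plain-Python sketch of the usual disparate impact ratio (favorable-outcome rate of the monitored group divided by that of the reference group), applied to group specs shaped like those in fairness_info. This is not Lale's implementation of accuracy_and_disparate_impact. The helper names in_group and disparate_impact, the toy rows, and the inclusive treatment of [low, high] ranges are all assumptions made for illustration.

```python
# Illustrative only: a plain-Python disparate impact ratio, not Lale's scorer.

def in_group(value, group):
    """Return True if value matches a group spec.

    A group is a list whose members are either exact values ('male')
    or [low, high] ranges, assumed inclusive here.
    """
    for member in group:
        if isinstance(member, list):
            low, high = member
            if low <= value <= high:
                return True
        elif value == member:
            return True
    return False


def disparate_impact(rows, labels, attribute, favorable_labels):
    """Favorable rate of the monitored group divided by that of the reference group.

    Assumes both groups are non-empty and the reference rate is non-zero.
    """
    def favorable_rate(group):
        outcomes = [label in favorable_labels
                    for row, label in zip(rows, labels)
                    if in_group(row[attribute["feature"]], group)]
        return sum(outcomes) / len(outcomes)

    monitored = favorable_rate(attribute["monitored_group"])
    reference = favorable_rate(attribute["reference_group"])
    return monitored / reference


# Hypothetical toy data, made up for this example (not the German credit dataset).
rows = [{"Sex": "male"}, {"Sex": "male"}, {"Sex": "male"}, {"Sex": "male"},
        {"Sex": "female"}, {"Sex": "female"}, {"Sex": "female"}, {"Sex": "female"}]
labels = ["No Risk", "No Risk", "No Risk", "Risk",
          "No Risk", "Risk", "Risk", "No Risk"]
sex = {"feature": "Sex", "reference_group": ["male"], "monitored_group": ["female"]}

# Males: 3 of 4 favorable; females: 2 of 4 favorable.
di = disparate_impact(rows, labels, sex, ["No Risk"])
print(di)
```

A ratio of 1.0 means both groups receive the favorable label at the same rate, and values below 1.0 mean the monitored group receives it less often.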