`articles/machine-learning/concept-fairness-ml.md`

This article describes methods you can use to understand your model's performance and fairness in Azure Machine Learning.
## What is machine learning fairness?
Artificial intelligence and machine learning systems can display unfair behavior. One way to define unfair behavior is by its harm, or impact on people. There are many types of harm that AI systems can give rise to. To learn more, see the [NeurIPS 2017 keynote by Kate Crawford](https://www.youtube.com/watch?v=fMym_BKWQzk).

To reduce unfair behavior in AI systems, you have to assess and mitigate these harms.
>[!NOTE]
> Fairness is a socio-technical challenge. Many aspects of fairness, such as justice and due process, aren't captured in quantitative fairness metrics. Also, the various quantitative fairness metrics can't all be satisfied simultaneously. The goal of the Fairlearn open-source package is to enable humans to assess different impact and mitigation strategies. Ultimately, it's up to the human users building artificial intelligence and machine learning models to make trade-offs that are appropriate to their scenario.

In this component of the Responsible AI dashboard, fairness is conceptualized through an approach known as **group fairness**, which asks: Which groups of individuals are at risk for experiencing harm? The term **sensitive features** suggests that the system designer should be sensitive to these features when assessing group fairness.

During the assessment phase, fairness is quantified through disparity metrics. **Disparity metrics** can evaluate and compare model behavior across different groups either as ratios or as differences. The Responsible AI dashboard supports two classes of disparity metrics:
- Disparity in model performance: This set of metrics calculates the disparity (difference) in the values of the selected performance metric across different subgroups of data. Some examples include:
  - disparity in accuracy rate
- Disparity in selection rate: This metric measures the difference in selection rate (favorable prediction) among different subgroups. An example is disparity in loan approval rate. Selection rate means the fraction of data points in each class classified as 1 (in binary classification) or the distribution of prediction values (in regression).

The fairness assessment capabilities of this component are powered by the [Fairlearn](https://fairlearn.org/) package, which provides a collection of model fairness assessment metrics and unfairness mitigation algorithms.
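
As a concrete illustration, the following is a minimal sketch of computing disparity metrics with Fairlearn's `MetricFrame`; the labels, predictions, and sensitive-feature values are synthetic stand-ins:

```python
# Minimal sketch: per-group metrics and disparities with Fairlearn.
# The labels, predictions, and sensitive feature below are synthetic examples.
import numpy as np
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
sensitive = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

metric_frame = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=sensitive,
)

print(metric_frame.by_group)      # metric values broken down per sensitive group
print(metric_frame.difference())  # disparity expressed as a difference across groups
print(metric_frame.ratio())       # disparity expressed as a ratio across groups
```

`difference()` and `ratio()` correspond to the two ways the dashboard can express disparity metrics.
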
>[!NOTE]
> A fairness assessment is not a purely technical exercise. The Fairlearn open-source package can help you assess the fairness of a model, but it won't perform the assessment for you. The package helps identify quantitative metrics to assess fairness, but developers must also perform a qualitative analysis to evaluate the fairness of their own models. The sensitive features noted above are an example of this kind of qualitative analysis.
## Mitigate unfairness in machine learning models
After you understand your model's fairness issues, you can use [Fairlearn](https://fairlearn.org/)'s mitigation algorithms to address them.

The Fairlearn open-source package provides postprocessing and reduction unfairness mitigation algorithms:

| Algorithm | Description | Machine learning task | Sensitive features | Supported parity constraints | Algorithm type |
| --- | --- | --- | --- | --- | --- |
|`ExponentiatedGradient`| Black-box approach to fair classification described in [A Reductions Approach to Fair Classification](https://arxiv.org/abs/1803.02453)| Binary classification | Categorical | Demographic parity, equalized odds| Reduction |
|`GridSearch`| Black-box approach described in [A Reductions Approach to Fair Classification](https://arxiv.org/abs/1803.02453)| Binary classification | Binary | Demographic parity, equalized odds | Reduction |
|`GridSearch`| Black-box approach that implements a grid-search variant of Fair Regression with the algorithm for bounded group loss described in [Fair Regression: Quantitative Definitions and Reduction-based Algorithms](https://arxiv.org/abs/1905.12843)| Regression | Binary | Bounded group loss| Reduction |
|`ThresholdOptimizer`| Postprocessing algorithm based on the paper [Equality of Opportunity in Supervised Learning](https://arxiv.org/abs/1610.02413). This technique takes as input an existing classifier and the sensitive feature, and derives a monotone transformation of the classifier's prediction to enforce the specified parity constraints. | Binary classification | Categorical | Demographic parity, equalized odds| Post-processing |
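
As a rough sketch of how the reduction algorithms in this table are used, the following example wraps an ordinary scikit-learn estimator in `ExponentiatedGradient` under a demographic-parity constraint; the synthetic dataset and the choice of logistic regression are illustrative:

```python
# Sketch: unfairness mitigation with a reduction algorithm; data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                          # features
sensitive = rng.choice(["A", "B"], size=200)           # sensitive feature
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)   # binary labels

# Wrap a standard estimator with a demographic-parity constraint.
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(),
    constraints=DemographicParity(),
)
mitigator.fit(X, y, sensitive_features=sensitive)

y_pred = mitigator.predict(X)  # predictions from the fairness-constrained model
```

The same `fit`/`predict` pattern applies to `GridSearch`, while `ThresholdOptimizer` instead wraps an existing trained classifier and post-processes its predictions.
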
`articles/machine-learning/concept-responsible-ai-dashboard.md`

Implementing Responsible AI in practice requires rigorous engineering.

The Responsible AI dashboard provides a single pane of glass that brings together several mature Responsible AI tools in the areas of model [performance and fairness assessment](http://fairlearn.org/), data exploration, [machine learning interpretability](https://interpret.ml/), [error analysis](https://erroranalysis.ai/), [counterfactual analysis and perturbations](https://github.com/interpretml/DiCE), and [causal inference](https://github.com/microsoft/EconML) for holistic assessment and debugging of models and informed, data-driven decision-making. Having access to all of these tools in one interface empowers you to:
1. Evaluate and debug your machine learning models by identifying model errors and fairness issues, diagnosing why those errors are happening, and informing your mitigation steps.
2. Boost your data-driven decision-making abilities by addressing questions such as *“what is the minimum change the end user could apply to their features to get a different outcome from the model?” and/or “what is the causal effect of reducing or increasing a feature (for example, red meat consumption) on a real-world outcome (for example, diabetes progression)?”*

The dashboard can be customized to include only the subset of tools that are relevant to your use case.

Below are the components of the Responsible AI dashboard supporting model debugging:

| Stage | Component | Description |
| --- | --- | --- |
| Identify | Model Overview | The Model Overview component aggregates various model assessment metrics, showing a high-level view of model prediction distribution for better investigation of its performance. It also enables group fairness assessment, highlighting the breakdown of model performance across different sensitive groups. |
| Diagnose | Data Explorer | The Data Explorer component helps to visualize datasets based on predicted and actual outcomes, error groups, and specific features. This helps to identify issues of over- and underrepresentation and to see how data is clustered in the dataset. |
| Diagnose | Model Interpretability | The Interpretability component generates human-understandable explanations of the predictions of a machine learning model. It provides multiple views into a model’s behavior: global explanations (for example, which features affect the overall behavior of a loan allocation model) and local explanations (for example, why an applicant’s loan application was approved or rejected). <br><br> The capabilities of this component in the dashboard are powered by the [InterpretML](https://interpret.ml/) package. |
| Diagnose | Counterfactual Analysis and What-If| The Counterfactual Analysis and What-If component consists of two functionalities for better error diagnosis: <br> - Generating a set of examples with minimal changes to a given point such that those changes alter the model's prediction (showing the closest data points with opposite model predictions). <br> - Enabling interactive and custom what-if perturbations for individual data points to understand how the model reacts to feature changes. <br> <br> The capabilities of this component in the dashboard are powered by the [DiCE](https://github.com/interpretml/DiCE) package (see the sketch after this table). |
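
To make the counterfactual functionality concrete, here's a minimal sketch that calls the DiCE package directly; the loan-style dataset, feature names, and classifier are invented for illustration:

```python
# Sketch: closest data points with the opposite prediction, via DiCE.
# The loan-style dataset and classifier are illustrative.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import dice_ml

df = pd.DataFrame({
    "income":   [20, 45, 60, 80, 30, 90, 55, 40],
    "debt":     [10, 5, 2, 1, 12, 0, 4, 9],
    "approved": [0, 1, 1, 1, 0, 1, 1, 0],
})
clf = RandomForestClassifier(random_state=0).fit(df[["income", "debt"]], df["approved"])

data = dice_ml.Data(dataframe=df, continuous_features=["income", "debt"],
                    outcome_name="approved")
model = dice_ml.Model(model=clf, backend="sklearn")
explainer = dice_ml.Dice(data, model, method="random")

# Generate two counterfactuals for one rejected applicant.
counterfactuals = explainer.generate_counterfactuals(
    df[["income", "debt"]].iloc[[0]], total_CFs=2, desired_class="opposite")
counterfactuals.visualize_as_dataframe()
```
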
Mitigation steps are available via standalone tools such as [Fairlearn](https://fairlearn.org/) (see [unfairness mitigation algorithms](https://fairlearn.org/v0.7.0/user_guide/mitigation.html)).

Below are the components of the Responsible AI dashboard supporting responsible decision-making:

**Data Explorer**

- The component could be reused here to understand data distributions and identify over- and underrepresentation. Data exploration is a critical part of decision making, because it isn't feasible to make informed decisions about a cohort that is underrepresented within the data.

**Causal Inference**

- The Causal Inference component estimates how a real-world outcome changes in the presence of an intervention. It also helps to construct promising interventions by simulating different feature responses to various interventions and creating rules to determine which population cohorts would benefit from a particular intervention. Collectively, these functionalities allow you to apply new policies and effect real-world change.
- The capabilities of this component are powered by the [EconML](https://github.com/Microsoft/EconML) package, which estimates heterogeneous treatment effects from observational data via machine learning.
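
As a minimal sketch of the underlying idea, the following uses EconML's `LinearDML` to estimate how an intervention's effect varies with individual features; the treatment, outcome, and data are synthetic:

```python
# Sketch: heterogeneous treatment-effect estimation with EconML; data is synthetic.
import numpy as np
from econml.dml import LinearDML

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))                    # features that modify the effect
T = rng.binomial(1, 0.5, size=n)               # binary treatment (the intervention)
Y = 2.0 * T * X[:, 0] + rng.normal(size=n)     # outcome with a heterogeneous effect

est = LinearDML(discrete_treatment=True, random_state=0)
est.fit(Y, T, X=X)

# Estimated effect of the intervention, per individual's features.
treatment_effects = est.effect(X)
```
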
### Challenges with the status quo
While progress has been made on individual tools for specific areas of Responsible AI, data scientists often need to use various tools together (for example, performance assessment, model interpretability, and fairness assessment) to holistically evaluate their models and data. For example, if a data scientist discovers a fairness issue with one tool, they then need to jump to a different tool to understand what data or model factors lie at the root of the issue before taking any steps toward mitigation. This highly challenging process is further complicated for the following reasons:
- First, there's no central location to discover and learn about the tools, extending the time it takes to research and learn new techniques.
- Second, the different tools don't communicate with each other directly. Data scientists must wrangle the datasets, models, and other metadata as they pass them between the different tools.
- Third, the metrics and visualizations aren't easily comparable, and the results are hard to share.
### Responsible AI dashboard challenging the status quo
The Responsible AI dashboard is the first comprehensive yet customizable tool that brings these fragmented experiences together under one roof, enabling you to seamlessly onboard to a single framework for model debugging and data-driven decision making.

Using the Responsible AI dashboard, you can create dataset cohorts (subgroups of data), pass those cohorts to all of the supported components (for example, model interpretability, data explorer, and model performance), and observe your model health for your identified cohorts. You can further compare insights from all supported components across a variety of pre-built cohorts to perform disaggregated analysis and find the blind spots of your model.
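
A minimal sketch of this workflow with the open-source `responsibleai` and `raiwidgets` packages follows; the dataset, model, and selection of components are illustrative:

```python
# Sketch: assembling a Responsible AI dashboard locally; data and model are illustrative.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from responsibleai import RAIInsights
from raiwidgets import ResponsibleAIDashboard

train = pd.DataFrame({"age":      [25, 40, 35, 50, 23, 61],
                      "income":   [30, 80, 55, 90, 28, 75],
                      "approved": [0, 1, 1, 1, 0, 1]})
test = train.copy()
model = RandomForestClassifier(random_state=0).fit(
    train[["age", "income"]], train["approved"])

rai_insights = RAIInsights(model=model, train=train, test=test,
                           target_column="approved", task_type="classification")
rai_insights.explainer.add()       # model interpretability component
rai_insights.error_analysis.add()  # error analysis component
rai_insights.compute()

ResponsibleAIDashboard(rai_insights)  # serves the dashboard locally
```
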
Whenever you're ready to share those insights with other stakeholders, you can extract them easily via our [Responsible AI PDF scorecard](how-to-responsible-ai-scorecard.md) and attach the PDF report to your compliance reports or share it with other colleagues to build trust and get their approval.
## How to customize the Responsible AI dashboard?
The Responsible AI dashboard, and its corresponding [Responsible AI scorecard](how-to-responsible-ai-scorecard.md), can be used by the following personas:
- Product managers and business stakeholders who are reviewing machine learning models pre-deployment.
- Risk officers who are reviewing machine learning models to understand fairness and reliability issues.
- Providers of solutions to end users who want to explain model decisions to those users or help them improve their outcomes.
- Professionals in heavily regulated spaces who need to review machine learning models with regulators and auditors.
## Supported scenarios and limitations
- The Responsible AI dashboard currently visualizes up to 5,000 of your data points in the dashboard UI. Downsample your dataset to 5,000 rows or fewer before passing it to the dashboard (see the sketch after this list).
- The dataset inputs to the Responsible AI dashboard must be pandas DataFrames in Parquet format. NumPy and SciPy sparse data aren't currently supported.
- The Responsible AI dashboard currently supports numeric or categorical features. For categorical features, you currently have to explicitly specify the feature names.
- The Responsible AI dashboard currently doesn't support datasets with more than 10,000 columns.
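
For instance, a simple way to meet the 5,000-row limit with pandas (the file path is a placeholder):

```python
# Sketch: downsample a dataset before passing it to the dashboard.
import pandas as pd

df = pd.read_parquet("data.parquet")  # placeholder path to your dataset
sampled = df.sample(n=min(len(df), 5_000), random_state=0)  # reproducible sample
```
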
`articles/machine-learning/how-to-responsible-ai-dashboard-ui.md`

You can create a Responsible AI dashboard with a no-code experience in the Azure Machine Learning studio:
- Select the registered model you’d like to create Responsible AI insights for and select the **Details** tab.
- Select the **Create Responsible AI dashboard (preview)** button from the top panel.

To learn more, see the Responsible AI dashboard's [supported model types and limitations](concept-responsible-ai-dashboard.md#supported-scenarios-and-limitations).

:::image type="content" source="./media/how-to-responsible-ai-dashboard-ui/model-page.png" alt-text="Screenshot of the wizard details tab with create responsible AI dashboard tab highlighted." lightbox ="./media/how-to-responsible-ai-dashboard-ui/model-page.png":::