
Commit c910c47

Commit message: edits
1 parent 1829214, commit c910c47

12 files changed: +106, -94 lines changed

articles/machine-learning/concept-causal-inference.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ ms.subservice: enterprise-readiness
 ms.topic: how-to
 ms.author: mesameki
 author: mesameki
-ms.date: 08/08/2022
+ms.date: 08/17/2022
 ms.custom: responsible-ml, event-tier1-build-2022
 ---

@@ -53,5 +53,5 @@ Then the method combines these two predictive models in a final stage estimation
 ## Next steps

 - Learn how to generate the Responsible AI dashboard via [CLIv2 and SDKv2](how-to-responsible-ai-dashboard-sdk-cli.md) or [studio UI](how-to-responsible-ai-dashboard-ui.md).
-- Explore the [supported causal inference visualizations](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-responsible-ai-dashboard#causal-analysis) of the Responsible AI dashboard.
+- Explore the [supported causal inference visualizations](how-to-responsible-ai-dashboard.md#causal-analysis) of the Responsible AI dashboard.
 - Learn how to generate a [Responsible AI scorecard](how-to-responsible-ai-scorecard.md) based on the insights observed in the Responsible AI dashboard.

articles/machine-learning/concept-counterfactual-analysis.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ ms.subservice: enterprise-readiness
 ms.topic: how-to
 ms.author: mesameki
 author: mesameki
-ms.date: 08/08/2022
+ms.date: 08/17/2022
 ms.custom: responsible-ml, event-tier1-build-2022
 ---

@@ -36,7 +36,7 @@ Use What-If Counterfactuals when you need to:

 ## How are counterfactual examples generated?

-To generate counterfactuals, DiCE implements a few model-agnostic techniques. These methods apply to any opaque-box classifier or regressor. They're based on sampling nearby points to an input point, while optimizing a loss function based on proximity (and optionally, sparsity, diversity, and feasibility). Currently-supported methods are:
+To generate counterfactuals, DiCE implements a few model-agnostic techniques. These methods apply to any opaque-box classifier or regressor. They're based on sampling nearby points to an input point, while optimizing a loss function based on proximity (and optionally, sparsity, diversity, and feasibility). Currently supported methods are:

 - [Randomized Search](http://interpret.ml/DiCE/notebooks/DiCE_model_agnostic_CFs.html#1.-Independent-random-sampling-of-features): Samples points randomly near the given query point and returns counterfactuals as those points whose predicted label is the desired class.
 - [Genetic Search](http://interpret.ml/DiCE/notebooks/DiCE_model_agnostic_CFs.html#2.-Genetic-Algorithm): Samples points using a genetic algorithm, given the combined objective of optimizing proximity to the given query point, changing as few features as possible, and diversity among the counterfactuals generated.
@@ -45,5 +45,5 @@ To generate counterfactuals, DiCE implements a few model-agnostic techniques. Th
 ## Next steps

 - Learn how to generate the Responsible AI dashboard via [CLIv2 and SDKv2](how-to-responsible-ai-dashboard-sdk-cli.md) or [studio UI](how-to-responsible-ai-dashboard-ui.md).
-- Explore the [supported counterfactual analysis and what-if perturbation visualizations](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-responsible-ai-dashboard#counterfactual-what-if) of the Responsible AI dashboard.
+- Explore the [supported counterfactual analysis and what-if perturbation visualizations](how-to-responsible-ai-dashboard.md#counterfactual-what-if) of the Responsible AI dashboard.
 - Learn how to generate a [Responsible AI scorecard](how-to-responsible-ai-scorecard.md) based on the insights observed in the Responsible AI dashboard.
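
The changed lines above list the model-agnostic search strategies that DiCE uses to generate counterfactuals. A minimal sketch of that workflow with the `dice-ml` package is shown below; the synthetic DataFrame, column names, and classifier are illustrative placeholders rather than anything taken from this commit, and the exact call pattern should be checked against the DiCE documentation.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import dice_ml

# Tiny synthetic dataset; the column names are illustrative placeholders.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 62, 23, 44, 36],
    "hours_per_week": [40, 50, 38, 45, 30, 20, 55, 42],
    "income": [0, 1, 1, 1, 0, 0, 1, 0],  # binary target
})
clf = RandomForestClassifier(random_state=0).fit(df.drop(columns=["income"]), df["income"])

# Wrap the data and model in DiCE's interfaces.
data = dice_ml.Data(dataframe=df, continuous_features=["age", "hours_per_week"], outcome_name="income")
model = dice_ml.Model(model=clf, backend="sklearn")

# method="random" selects randomized search; "genetic" selects the genetic search.
explainer = dice_ml.Dice(data, model, method="random")

# Ask for two counterfactuals that flip the predicted class of one query point.
cf = explainer.generate_counterfactuals(
    df.drop(columns=["income"]).head(1),
    total_CFs=2,
    desired_class="opposite",
)
cf.visualize_as_dataframe(show_only_changes=True)
```

Switching the `method` argument chooses between the randomized and genetic searches listed above (DiCE also offers a KD-tree based search).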

articles/machine-learning/concept-data-analysis.md

Lines changed: 2 additions & 2 deletions
@@ -8,13 +8,13 @@ ms.subservice: enterprise-readiness
 ms.topic: how-to
 ms.author: mesameki
 author: mesameki
-ms.date: 08/08/2022
+ms.date: 08/17/2022
 ms.custom: responsible-ml, event-tier1-build-2022
 ---

 # Understand your datasets (preview)

-Machine learning models "learn" from historical decisions and actions captured in training data. As a result, their performance in real-world scenarios is heavily influenced by the data they are trained on. When feature distribution in a dataset is skewed, this can cause a model to incorrectly predict data points belonging to an underrepresented group or to be optimized along an inappropriate metric. For example, while training a housing price prediction AI, the training set was representing 75% of newer houses that have less than median prices. As a result, it was much less accurate in successfully identifying more expensive historic houses. The fix was to add older and expensive houses to the training data and augment the features to include insights about the historic value of the house. Upon incorporating that data augmentation, results improved.
+Machine learning models "learn" from historical decisions and actions captured in training data. As a result, their performance in real-world scenarios is heavily influenced by the data they're trained on. When feature distribution in a dataset is skewed, it can cause a model to incorrectly predict data points belonging to an underrepresented group or to be optimized along an inappropriate metric. For example, while training a housing price prediction AI, the training set was representing 75% of newer houses that have less than median prices. As a result, it was much less accurate in successfully identifying more expensive historic houses. The fix was to add older and expensive houses to the training data and augment the features to include insights about the historic value of the house. Upon incorporating that data augmentation, results improved.

 The Data Explorer component of the [Responsible AI dashboard](concept-responsible-ai-dashboard.md) helps visualize datasets based on predicted and actual outcomes, error groups, and specific features. This enables you to identify issues of over- and under-representation and to see how data is clustered in the dataset. Data visualizations consist of aggregate plots or individual data points.

articles/machine-learning/concept-error-analysis.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ ms.subservice: enterprise-readiness
 ms.topic: how-to
 ms.author: mesameki
 author: mesameki
-ms.date: 05/10/2022
+ms.date: 08/17/2022
 ms.custom: responsible-ml, event-tier1-build-2022
 ---
 # Assess errors in ML models (preview)
@@ -40,13 +40,13 @@ Often, error patterns may be complex and involve more than one or two features.
 - **Error coverage**: a portion of all errors that fall into the node. This is shown through the fill rate of the node.
 - **Data representation**: number of instances in each node of the error tree. This is shown through the thickness of the incoming edge to the node along with the actual total number of instances in the node.

-:::image type="content" source="./media/concept-error-analysis/error-analysis-tree.png" alt-text="Error Analysis tree showing cohorts with higher or lower error rates and coverage":::
+:::image type="content" source="./media/concept-error-analysis/error-analysis-tree.png" alt-text="Screenshot of an error Analysis tree showing cohorts with higher or lower error rates and coverage." lightbox ="./media/concept-error-analysis/error-analysis-tree.png":::

 ## Error Heatmap

 The view slices the data based on a one- or two-dimensional grid of input features. Users can choose the input features of interest for analysis. The heatmap visualizes cells with higher error with a darker red color to bring the user’s attention to regions with high error discrepancy. This is beneficial especially when the error themes are different in different partitions, which happen frequently in practice. In this error identification view, the analysis is highly guided by the users and their knowledge or hypotheses of what features might be most important for understanding failure.

-:::image type="content" source="./media/concept-error-analysis/error-analysis-heatmap.png" alt-text="Error Analysis heatmap showing model errors partitioned by one or two features":::
+:::image type="content" source="./media/concept-error-analysis/error-analysis-heatmap.png" alt-text="Screenshot of an error Analysis heatmap showing model errors partitioned by one or two features.":::

 ## Next steps
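
The tree and heatmap views described in the changed lines above are surfaced by the error analysis component of the Responsible AI dashboard. As a rough, illustrative sketch only (not part of this commit), generating them locally with the `responsibleai` and `raiwidgets` packages looks approximately like the following; `model`, `train_df`, `test_df`, and the `"income"` target column are assumed placeholder names.

```python
from responsibleai import RAIInsights
from raiwidgets import ResponsibleAIDashboard

# Assumed inputs: a fitted classifier plus train/test pandas DataFrames that
# both contain the "income" target column (placeholder names, not from this commit).
rai_insights = RAIInsights(
    model=model,
    train=train_df,
    test=test_df,
    target_column="income",
    task_type="classification",
)

rai_insights.error_analysis.add()  # register the error analysis component
rai_insights.compute()             # build the error tree and heatmap data

ResponsibleAIDashboard(rai_insights)  # serve the dashboard locally
```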

articles/machine-learning/concept-fairness-ml.md

Lines changed: 10 additions & 10 deletions
@@ -7,8 +7,8 @@ ms.service: machine-learning
 ms.subservice: enterprise-readiness
 ms.topic: conceptual
 ms.author: mesameki
-author: lgayhardt
-ms.date: 08/08/2022
+author: mesameki
+ms.date: 08/17/2022
 ms.custom: responsible-ml
 #Customer intent: As a data scientist, I want to learn about machine learning fairness and how to assess and mitigate unfairness in machine learning models.
 ---
@@ -22,15 +22,15 @@ This article describes methods you can use for understanding your model performa

 ## What is machine learning fairness?

-Artificial intelligence and machine learning systems can display unfair behavior. One way to define unfair behavior is by its harm, or impact on people. There are many types of harm that AI systems can give rise to. See the [NeurIPS 2017 keynote by Kate Crawford](https://www.youtube.com/watch?v=fMym_BKWQzk) to learn more.
+Artificial intelligence and machine learning systems can display unfair behavior. One way to define unfair behavior is by its harm, or impact on people. There are many types of harm that AI systems can give rise to. To learn more, [NeurIPS 2017 keynote by Kate Crawford](https://www.youtube.com/watch?v=fMym_BKWQzk).

 Two common types of AI-caused harms are:

 - Harm of allocation: An AI system extends or withholds opportunities, resources, or information for certain groups. Examples include hiring, school admissions, and lending where a model might be much better at picking good candidates among a specific group of people than among other groups.

-- Harm of quality-of-service: An AI system does not work as well for one group of people as it does for another. As an example, a voice recognition system might fail to work as well for women as it does for men.
+- Harm of quality-of-service: An AI system doesn't work as well for one group of people as it does for another. As an example, a voice recognition system might fail to work as well for women as it does for men.

-To reduce unfair behavior in AI systems, you have to assess and mitigate these harms. The model overview component of the [Responsible AI dashboard](concept-responsible-ai-dashboard.md) contributes to the “identify” stage of the model lifecycle by generating a variety of model performance metrics for your entire dataset, your identified cohorts of data, and across subgroups identified in terms of **sensitive features** or sensitive attributes.
+To reduce unfair behavior in AI systems, you have to assess and mitigate these harms. The model overview component of the [Responsible AI dashboard](concept-responsible-ai-dashboard.md) contributes to the “identify” stage of the model lifecycle by generating various model performance metrics for your entire dataset, your identified cohorts of data, and across subgroups identified in terms of **sensitive features** or sensitive attributes.

 >[!NOTE]
 > Fairness is a socio-technical challenge. Many aspects of fairness, such as justice and due process, are not captured in quantitative fairness metrics. Also, many quantitative fairness metrics can't all be satisfied simultaneously. The goal of the Fairlearn open-source package is to enable humans to assess the different impact and mitigation strategies. Ultimately, it is up to the human users building artificial intelligence and machine learning models to make trade-offs that are appropriate to their scenario.
@@ -64,12 +64,12 @@ The fairness assessment capabilities of this component are founded by the [Fairl

 Upon understanding your model's fairness issues, you can use [Fairlearn](https://fairlearn.org/)'s mitigation algorithms to mitigate your observed fairness issues.

-The Fairlearn open-source package includes a variety of unfairness mitigation algorithms. These algorithms support a set of constraints on the predictor's behavior called **parity constraints** or criteria. Parity constraints require some aspects of the predictor behavior to be comparable across the groups that sensitive features define (for example, different races). The mitigation algorithms in the Fairlearn open-source package use such parity constraints to mitigate the observed fairness issues.
+The Fairlearn open-source package includes various unfairness mitigation algorithms. These algorithms support a set of constraints on the predictor's behavior called **parity constraints** or criteria. Parity constraints require some aspects of the predictor behavior to be comparable across the groups that sensitive features define (for example, different races). The mitigation algorithms in the Fairlearn open-source package use such parity constraints to mitigate the observed fairness issues.

 >[!NOTE]
 > Mitigating unfairness in a model means reducing the unfairness, but this technical mitigation cannot eliminate this unfairness completely. The unfairness mitigation algorithms in the Fairlearn open-source package can provide suggested mitigation strategies to help reduce unfairness in a machine learning model, but they are not solutions to eliminate unfairness completely. There may be other parity constraints or criteria that should be considered for each particular developer's machine learning model. Developers using Azure Machine Learning must determine for themselves if the mitigation sufficiently eliminates any unfairness in their intended use and deployment of machine learning models.

-The Fairlearn open-source package supports the following types of parity constraints:
+The Fairlearn open-source package supports the following types of parity constraints:

 |Parity constraint | Purpose |Machine learning task |
 |---------|---------|---------|
@@ -82,7 +82,7 @@ The Fairlearn open-source package supports the following types of parity constra

 The Fairlearn open-source package provides postprocessing and reduction unfairness mitigation algorithms:

-- Reduction: These algorithms take a standard black-box machine learning estimator (for example, a LightGBM model) and generate a set of retrained models using a sequence of re-weighted training datasets. For example, applicants of a certain gender might be up-weighted or down-weighted to retrain models and reduce disparities across different gender groups. Users can then pick a model that provides the best trade-off between accuracy (or other performance metric) and disparity, which generally would need to be based on business rules and cost calculations.
+- Reduction: These algorithms take a standard black-box machine learning estimator (for example, a LightGBM model) and generate a set of retrained models using a sequence of reweighted training datasets. For example, applicants of a certain gender might be up-weighted or down-weighted to retrain models and reduce disparities across different gender groups. Users can then pick a model that provides the best trade-off between accuracy (or other performance metric) and disparity, which generally would need to be based on business rules and cost calculations.
 - Post-processing: These algorithms take an existing classifier and the sensitive feature as input. Then, they derive a transformation of the classifier's prediction to enforce the specified fairness constraints. The biggest advantage of threshold optimization is its simplicity and flexibility as it doesn’t need to retrain the model.

 | Algorithm | Description | Machine learning task | Sensitive features | Supported parity constraints | Algorithm Type |
@@ -95,6 +95,6 @@ The Fairlearn open-source package provides postprocessing and reduction unfairne
 ## Next steps

 - Learn how to generate the Responsible AI dashboard via [CLIv2 and SDKv2](how-to-responsible-ai-dashboard-sdk-cli.md) or [studio UI](how-to-responsible-ai-dashboard-ui.md).
-- Explore the [supported model overview and fairness assessment visualizations](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-responsible-ai-dashboard#model-overview) of the Responsible AI dashboard.
+- Explore the [supported model overview and fairness assessment visualizations](how-to-responsible-ai-dashboard.md#model-overview) of the Responsible AI dashboard.
 - Learn how to generate a [Responsible AI scorecard](how-to-responsible-ai-scorecard.md) based on the insights observed in the Responsible AI dashboard.
-- Learn how to use the different components by checking out the Fairlearn's [GitHub](https://github.com/fairlearn/fairlearn/), [user guide](https://fairlearn.github.io/main/user_guide/index.html), [examples](https://fairlearn.github.io/main/auto_examples/index.html), and [sample notebooks](https://github.com/fairlearn/fairlearn/tree/master/notebooks).
+- Learn how to use the different components by checking out the [Fairlearn's GitHub](https://github.com/fairlearn/fairlearn/), [user guide](https://fairlearn.github.io/main/user_guide/index.html), [examples](https://fairlearn.github.io/main/auto_examples/index.html), and [sample notebooks](https://github.com/fairlearn/fairlearn/tree/master/notebooks).
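
The changed lines above summarize Fairlearn's parity constraints and its reduction and post-processing mitigation algorithms. A minimal sketch of the assess-then-mitigate loop, assuming a binary classification task with one sensitive feature and purely synthetic placeholder data (nothing here comes from this commit), might look like this:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate
from fairlearn.reductions import DemographicParity, ExponentiatedGradient

# Synthetic placeholder data: X holds features, y is a binary label, and `sex`
# is the sensitive feature that defines the subgroups being compared.
X = pd.DataFrame({
    "feature_1": [0.2, 0.5, 0.8, 0.1, 0.9, 0.4, 0.7, 0.3],
    "feature_2": [1, 0, 1, 1, 0, 0, 1, 0],
})
y = pd.Series([0, 1, 1, 0, 1, 0, 1, 0])
sex = pd.Series(["F", "M", "F", "M", "F", "M", "F", "M"])

# Assessment: compare accuracy and selection rate across the sensitive groups.
baseline = LogisticRegression().fit(X, y)
assessment = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y,
    y_pred=baseline.predict(X),
    sensitive_features=sex,
)
print(assessment.by_group)

# Reduction-based mitigation: retrain the estimator under a demographic parity constraint.
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(),
    constraints=DemographicParity(),
)
mitigator.fit(X, y, sensitive_features=sex)
y_pred_mitigated = mitigator.predict(X)
```

Swapping `DemographicParity` for another constraint such as `EqualizedOdds`, or `ExponentiatedGradient` for `GridSearch` or the post-processing `ThresholdOptimizer`, selects the other constraint and algorithm combinations summarized in the article's tables.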
