Commit 00fa7d9

Merge pull request #217549 from lgayhardt/rai112022
RAI GA
2 parents c2e91c7 + 260aa66 commit 00fa7d9

63 files changed: +561 -629 lines changed

articles/machine-learning/.openpublishing.redirection.machine-learning.json

Lines changed: 16 additions & 1 deletion
@@ -3524,6 +3524,21 @@
 "source_path_from_root": "/articles/machine-learning/how-to-troubleshoot-batch-endpoints.md",
 "redirect_url": "/azure/machine-learning/batch-inference/how-to-troubleshoot-batch-endpoints",
 "redirect_document_id": true
-}
+},
+{
+"source_path_from_root": "/articles/machine-learning/concept-responsible-ml.md",
+"redirect_url": "/azure/machine-learning/concept-responsible-ai",
+"redirect_document_id": true
+},
+{
+"source_path_from_root": "/articles/machine-learning/how-to-responsible-ai-dashboard-ui.md",
+"redirect_url": "/azure/machine-learning/how-to-responsible-ai-insights-ui",
+"redirect_document_id": true
+},
+{
+"source_path_from_root": "/articles/machine-learning/how-to-responsible-ai-dashboard-sdk-cli.md",
+"redirect_url": "/azure/machine-learning/how-to-responsible-ai-insights-sdk-cli",
+"redirect_document_id": true
+},
 ]
 }
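
Reviewer's sketch (not part of the commit): each redirect entry added in the hunk above carries the same three keys. Assuming that shape holds for every entry, a small check like the following could flag incomplete entries before publishing. The helper name `find_incomplete_redirects` is hypothetical, and how the entries are nested inside the redirection file is not shown in the hunk, so the sketch takes a plain list of entries.

```python
# Keys observed on every redirect entry in the hunk above.
REQUIRED_KEYS = ("source_path_from_root", "redirect_url", "redirect_document_id")


def find_incomplete_redirects(entries):
    """Return (index, missing_keys) pairs for entries that lack a required key."""
    problems = []
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            problems.append((index, missing))
    return problems


# The three entries added by this commit, copied from the hunk.
added_entries = [
    {
        "source_path_from_root": "/articles/machine-learning/concept-responsible-ml.md",
        "redirect_url": "/azure/machine-learning/concept-responsible-ai",
        "redirect_document_id": True,
    },
    {
        "source_path_from_root": "/articles/machine-learning/how-to-responsible-ai-dashboard-ui.md",
        "redirect_url": "/azure/machine-learning/how-to-responsible-ai-insights-ui",
        "redirect_document_id": True,
    },
    {
        "source_path_from_root": "/articles/machine-learning/how-to-responsible-ai-dashboard-sdk-cli.md",
        "redirect_url": "/azure/machine-learning/how-to-responsible-ai-insights-sdk-cli",
        "redirect_document_id": True,
    },
]

print(find_incomplete_redirects(added_entries))
```

An empty result means every entry has all three keys; the check says nothing about whether the redirect targets actually exist.
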
Lines changed: 8 additions & 8 deletions
@@ -1,29 +1,29 @@
 ---
 title: Understand your datasets
 titleSuffix: Azure Machine Learning
-description: Perform exploratory data analysis to understand feature biases and imbalances by using the Responsible AI dashboard's data explorer.
+description: Perform exploratory data analysis to understand feature biases and imbalances by using the Responsible AI dashboard's data analysis.
 services: machine-learning
 ms.service: machine-learning
 ms.subservice: enterprise-readiness
 ms.topic: how-to
 ms.author: mesameki
 author: mesameki
 ms.reviewer: lagayhar
-ms.date: 08/17/2022
+ms.date: 11/09/2022
 ms.custom: responsible-ml, event-tier1-build-2022
 ---

 # Understand your datasets (preview)

-Machine learning models "learn" from historical decisions and actions captured in training data. As a result, their performance in real-world scenarios is heavily influenced by the data they're trained on. When feature distribution in a dataset is skewed, it can cause a model to incorrectly predict data points that belong to an underrepresented group or to be optimized along an inappropriate metric.
+Machine learning models "learn" from historical decisions and actions captured in training data. As a result, their performance in real-world scenarios is heavily influenced by the data they're trained on. When feature distribution in a dataset is skewed, it can cause a model to incorrectly predict data points that belong to an underrepresented group or to be optimized along an inappropriate metric.

 For example, while a model was training an AI system for predicting house prices, the training set was representing 75 percent of newer houses that had less than median prices. As a result, it was much less accurate in successfully identifying more expensive historic houses. The fix was to add older and expensive houses to the training data and augment the features to include insights about historical value. That data augmentation improved results.

-The data explorer component of the [Responsible AI dashboard](concept-responsible-ai-dashboard.md) helps visualize datasets based on predicted and actual outcomes, error groups, and specific features. It helps you identify issues of overrepresentation and underrepresentation and to see how data is clustered in the dataset. Data visualizations consist of aggregate plots or individual data points.
+The data analysis component of the [Responsible AI dashboard](concept-responsible-ai-dashboard.md) helps visualize datasets based on predicted and actual outcomes, error groups, and specific features. It helps you identify issues of overrepresentation and underrepresentation and to see how data is clustered in the dataset. Data visualizations consist of aggregate plots or individual data points.

-## When to use the data explorer
+## When to use data analysis

-Use the data explorer when you need to:
+Use data analysis when you need to:

 - Explore your dataset statistics by selecting different filters to slice your data into different dimensions (also known as cohorts).
 - Understand the distribution of your dataset across different cohorts and feature groups.
@@ -32,6 +32,6 @@ Use the data explorer when you need to:

 ## Next steps

-- Learn how to generate the Responsible AI dashboard via [CLI and SDK](how-to-responsible-ai-dashboard-sdk-cli.md) or [Azure Machine Learning studio UI](how-to-responsible-ai-dashboard-ui.md).
-- Explore the [supported data explorer visualizations](how-to-responsible-ai-dashboard.md#data-explorer) of the Responsible AI dashboard.
+- Learn how to generate the Responsible AI dashboard via [CLI and SDK](how-to-responsible-ai-insights-sdk-cli.md) or [Azure Machine Learning studio UI](how-to-responsible-ai-insights-ui.md).
+- Explore the [supported data analysis visualizations](how-to-responsible-ai-dashboard.md#data-analysis) of the Responsible AI dashboard.
 - Learn how to generate a [Responsible AI scorecard](how-to-responsible-ai-scorecard.md) based on the insights observed in the Responsible AI dashboard.
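
Reviewer's sketch (not part of the commit): the article changed above describes slicing a dataset into cohorts to spot overrepresentation and underrepresentation. A minimal plain-Python illustration of that idea follows. It is not the Responsible AI dashboard's implementation; the `cohort_shares` helper, the `age_band` feature, the sample rows, and the 30 percent threshold are all hypothetical, chosen to mirror the article's house-price example in which 75 percent of the training set is newer houses.

```python
from collections import Counter


def cohort_shares(rows, feature):
    """Return the fraction of rows that fall into each value of `feature`."""
    counts = Counter(row[feature] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


# Hypothetical training rows: three newer houses for every historic one.
rows = [
    {"age_band": "newer"},
    {"age_band": "newer"},
    {"age_band": "newer"},
    {"age_band": "historic"},
]

shares = cohort_shares(rows, "age_band")

# Flag cohorts below an arbitrary, illustrative 30 percent share.
underrepresented = [value for value, share in shares.items() if share < 0.3]
print(shares, underrepresented)
```

A real analysis would pick the threshold per use case and would also compare model error across the same cohorts, as the article's text suggests.
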

articles/machine-learning/concept-fairness-ml.md

Lines changed: 2 additions & 2 deletions
@@ -90,7 +90,7 @@ The Fairlearn open-source package provides two types of unfairness mitigation al

 ## Next steps

-- Learn how to generate the Responsible AI dashboard via [CLI and SDK](how-to-responsible-ai-dashboard-sdk-cli.md) or [Azure Machine Learning studio UI](how-to-responsible-ai-dashboard-ui.md).
-- Explore the [supported model overview and fairness assessment visualizations](how-to-responsible-ai-dashboard.md#model-overview) of the Responsible AI dashboard.
+- Learn how to generate the Responsible AI dashboard via [CLI and SDK](how-to-responsible-ai-insights-sdk-cli.md) or [Azure Machine Learning studio UI](how-to-responsible-ai-insights-ui.md).
+- Explore the [supported model overview and fairness assessment visualizations](how-to-responsible-ai-dashboard.md#model-overview-and-fairness-metrics) of the Responsible AI dashboard.
 - Learn how to generate a [Responsible AI scorecard](how-to-responsible-ai-scorecard.md) based on the insights observed in the Responsible AI dashboard.
 - Learn how to use the components by checking out Fairlearn's [GitHub repository](https://github.com/fairlearn/fairlearn/), [user guide](https://fairlearn.github.io/main/user_guide/index.html), [examples](https://fairlearn.github.io/main/auto_examples/index.html), and [sample notebooks](https://github.com/fairlearn/fairlearn/tree/master/notebooks).

articles/machine-learning/concept-responsible-ai-dashboard.md

Lines changed: 15 additions & 15 deletions
@@ -9,7 +9,7 @@ ms.topic: how-to
 ms.author: mesameki
 author: mesameki
 ms.reviewer: lagayhar
-ms.date: 08/17/2022
+ms.date: 11/09/2022
 ms.custom: responsible-ml, event-tier1-build-2022
 ---

@@ -24,15 +24,15 @@ The Responsible AI dashboard provides a single interface to help you implement R
 - [Machine learning interpretability](https://interpret.ml/)
 - [Error analysis](https://erroranalysis.ai/)
 - [Counterfactual analysis and perturbations](https://github.com/interpretml/DiCE)
-- [Causal inference](https://github.com/microsoft/EconML)
+- [Causal inference](https://github.com/microsoft/EconML)

 The dashboard offers a holistic assessment and debugging of models so you can make informed data-driven decisions. Having access to all of these tools in one interface empowers you to:

 - Evaluate and debug your machine learning models by identifying model errors and fairness issues, diagnosing why those errors are happening, and informing your mitigation steps.
 - Boost your data-driven decision-making abilities by addressing questions such as:
-
+
 "What is the minimum change that users can apply to their features to get a different outcome from the model?"
-
+
 "What is the causal effect of reducing or increasing a feature (for example, red meat consumption) on a real-world outcome (for example, diabetes progression)?"

 You can customize the dashboard to include only the subset of tools that are relevant to your use case.
@@ -43,7 +43,7 @@ The Responsible AI dashboard is accompanied by a [PDF scorecard](how-to-responsi

 The Responsible AI dashboard brings together, in a comprehensive view, various new and pre-existing tools. The dashboard integrates these tools with [Azure Machine Learning CLI v2, Azure Machine Learning Python SDK v2](concept-v2.md), and [Azure Machine Learning studio](overview-what-is-azure-machine-learning.md#studio). The tools include:

-- [Data explorer](concept-data-analysis.md), to understand and explore your dataset distributions and statistics.
+- [Data analysis](concept-data-analysis.md), to understand and explore your dataset distributions and statistics.
 - [Model overview and fairness assessment](concept-fairness-ml.md), to evaluate the performance of your model and evaluate your model's group fairness issues (how your model's predictions affect diverse groups of people).
 - [Error analysis](concept-error-analysis.md), to view and understand how errors are distributed in your dataset.
 - [Model interpretability](how-to-machine-learning-interpretability.md) (importance values for aggregate and individual features), to understand your model's predictions and how those overall and individual predictions are made.
@@ -83,7 +83,7 @@ The following table describes when to use Responsible AI dashboard components to
 | Identify | Error analysis | The error analysis component helps you get a deeper understanding of model failure distribution and quickly identify erroneous cohorts (subgroups) of data. <br><br> The capabilities of this component in the dashboard come from the [Error Analysis](https://erroranalysis.ai/) package.|
 | Identify | Fairness analysis | The fairness component defines groups in terms of sensitive attributes such as sex, race, and age. It then assesses how your model predictions affect these groups and how you can mitigate disparities. It evaluates the performance of your model by exploring the distribution of your prediction values and the values of your model performance metrics across the groups. <br><br>The capabilities of this component in the dashboard come from the [Fairlearn](https://fairlearn.org/) package. |
 | Identify | Model overview | The model overview component aggregates model assessment metrics in a high-level view of model prediction distribution for better investigation of its performance. This component also enables group fairness assessment by highlighting the breakdown of model performance across sensitive groups. |
-| Diagnose | Data explorer | The data explorer visualizes datasets based on predicted and actual outcomes, error groups, and specific features. You can then identify issues of overrepresentation and underrepresentation, along with seeing how data is clustered in the dataset. |
+| Diagnose | Data analysis | Data analysis visualizes datasets based on predicted and actual outcomes, error groups, and specific features. You can then identify issues of overrepresentation and underrepresentation, along with seeing how data is clustered in the dataset. |
 | Diagnose | Model interpretability | The interpretability component generates human-understandable explanations of the predictions of a machine learning model. It provides multiple views into a model's behavior: <br> - Global explanations (for example, which features affect the overall behavior of a loan allocation model) <br> - Local explanations (for example, why an applicant's loan application was approved or rejected) <br><br> The capabilities of this component in the dashboard come from the [InterpretML](https://interpret.ml/) package. |
 | Diagnose | Counterfactual analysis and what-if| This component consists of two functionalities for better error diagnosis: <br> - Generating a set of examples in which minimal changes to a particular point alter the model's prediction. That is, the examples show the closest data points with opposite model predictions. <br> - Enabling interactive and custom what-if perturbations for individual data points to understand how the model reacts to feature changes. <br> <br> The capabilities of this component in the dashboard come from the [DiCE](https://github.com/interpretml/DiCE) package. |

@@ -108,7 +108,7 @@ Exploratory data analysis, causal inference, and counterfactual analysis capabil

 These components of the Responsible AI dashboard support responsible decision-making:

-- **Data explorer**: You can reuse the data explorer component here to understand data distributions and to identify overrepresentation and underrepresentation. Data exploration is a critical part of decision making, because it isn't feasible to make informed decisions about a cohort that's underrepresented in the data.
+- **Data analysis**: You can reuse the data analysis component here to understand data distributions and to identify overrepresentation and underrepresentation. Data exploration is a critical part of decision making, because it isn't feasible to make informed decisions about a cohort that's underrepresented in the data.
 - **Causal inference**: The causal inference component estimates how a real-world outcome changes in the presence of an intervention. It also helps construct promising interventions by simulating feature responses to various interventions and creating rules to determine which population cohorts would benefit from a particular intervention. Collectively, these functionalities allow you to apply new policies and effect real-world change.

 The capabilities of this component come from the [EconML](https://github.com/Microsoft/EconML) package, which estimates heterogeneous treatment effects from observational data via machine learning.
@@ -142,18 +142,18 @@ Need some inspiration? Here are some examples of how the dashboard's components

 | Responsible AI dashboard flow | Use case |
 |-------------------------------|----------|
-| Model overview > error analysis > data explorer | To identify model errors and diagnose them by understanding the underlying data distribution |
-| Model overview > fairness assessment > data explorer | To identify model fairness issues and diagnose them by understanding the underlying data distribution |
+| Model overview > error analysis > data analysis | To identify model errors and diagnose them by understanding the underlying data distribution |
+| Model overview > fairness assessment > data analysis | To identify model fairness issues and diagnose them by understanding the underlying data distribution |
 | Model overview > error analysis > counterfactuals analysis and what-if | To diagnose errors in individual instances with counterfactual analysis (minimum change to lead to a different model prediction) |
-| Model overview > data explorer | To understand the root cause of errors and fairness issues introduced via data imbalances or lack of representation of a particular data cohort |
+| Model overview > data analysis | To understand the root cause of errors and fairness issues introduced via data imbalances or lack of representation of a particular data cohort |
 | Model overview > interpretability | To diagnose model errors through understanding how the model has made its predictions |
-| Data explorer > causal inference | To distinguish between correlations and causations in the data or decide the best treatments to apply to get a positive outcome |
+| Data analysis > causal inference | To distinguish between correlations and causations in the data or decide the best treatments to apply to get a positive outcome |
 | Interpretability > causal inference | To learn whether the factors that the model has used for prediction-making have any causal effect on the real-world outcome|
-| Data explorer > counterfactuals analysis and what-if | To address customers' questions about what they can do next time to get a different outcome from an AI system|
+| Data analysis > counterfactuals analysis and what-if | To address customers' questions about what they can do next time to get a different outcome from an AI system|

 ## People who should use the Responsible AI dashboard

-The following people can use the Responsible AI dashboard, and its corresponding [Responsible AI scorecard](how-to-responsible-ai-scorecard.md), to build trust with AI systems:
+The following people can use the Responsible AI dashboard, and its corresponding [Responsible AI scorecard](concept-responsible-ai-scorecard.md), to build trust with AI systems:

 - Machine learning professionals and data scientists who are interested in debugging and improving their machine learning models before deployment
 - Machine learning professionals and data scientists who are interested in sharing their model health records with product managers and business stakeholders to build trust and receive deployment permissions
@@ -174,5 +174,5 @@ The following people can use the Responsible AI dashboard, and its corresponding

 ## Next steps

-- Learn how to generate the Responsible AI dashboard via [CLI and SDK](how-to-responsible-ai-dashboard-sdk-cli.md) or [Azure Machine Learning studio UI](how-to-responsible-ai-dashboard-ui.md).
-- Learn how to generate a [Responsible AI scorecard](how-to-responsible-ai-scorecard.md) based on the insights observed on the Responsible AI dashboard.
+- Learn how to generate the Responsible AI dashboard via [CLI and SDK](how-to-responsible-ai-insights-sdk-cli.md) or [Azure Machine Learning studio UI](how-to-responsible-ai-insights-ui.md).
+- Learn how to generate a [Responsible AI scorecard](concept-responsible-ai-scorecard.md) based on the insights observed on the Responsible AI dashboard.
