#### What does the tool do?
The tool helps find groups where an AI system or algorithm performs differently, which could indicate unfair treatment. This type of monitoring is called *anomaly detection*. It detects deviations using a technique called <a href="https://en.wikipedia.org/wiki/Cluster_analysis" target="_blank">clustering</a>, which groups similar data points together (in clusters). The tool doesn’t need information like gender, nationality, or ethnicity to find deviations. Instead, it uses a `bias variable` to measure deviations in the performance of the system, which you can choose based on your data.
#### What results does it give?
The tool finds groups (clusters) where the performance of the algorithmic system deviates significantly. It highlights the group with the worst bias variable values and creates a report called a bias analysis report, which you can download as a PDF. You can also download all the identified groups (clusters) in a .json file. Additionally, the tool provides visual summaries of the results, helping experts dive deeper into the identified deviations. An example can be found below. {{< tooltip tooltip_content="The figure below shows that cluster 0, the cluster with systematically deviating bias variable values, includes a higher-than-average proportion of African-American and a lower-than-average proportion of Caucasian people. For other demographic groups, cluster 0 reflects an average distribution. Additional details about this example are available in the demo dataset." >}}
The tool works with data in a table format, consisting solely of numbers or categories. You need to pick one column in the data to use as the `bias variable`. This column should have numbers only, and you’ll specify whether a higher or lower number is better. For example, if you’re looking at error rates, lower numbers are better. For accuracy, higher numbers are better. The tool also comes with a demo dataset you can use by clicking "Demo dataset".
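To make this concrete, here is a minimal sketch of such a tabular dataset in plain Python. The column names and values are hypothetical, not taken from the tool's demo data; the numerical `error_rate` column plays the role of the bias variable, for which lower values are better.

```python
# Toy tabular dataset: each row is a data point, columns are features.
# One numerical column is designated as the bias variable; here we use
# a hypothetical "error_rate" column, for which lower values are better.
rows = [
    {"age_group": "18-30", "region": "north", "error_rate": 0.12},
    {"age_group": "18-30", "region": "south", "error_rate": 0.35},
    {"age_group": "31-50", "region": "north", "error_rate": 0.10},
    {"age_group": "51-70", "region": "south", "error_rate": 0.40},
]

bias_variable = "error_rate"
higher_is_better = False  # for error rates, lower values are preferred

# The bias variable column must be numerical; the remaining feature
# columns should share one data type (here: all categorical).
values = [row[bias_variable] for row in rows]
assert all(isinstance(v, (int, float)) for v in values)
```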
<div>
<p><u>Example of numerical dataset</u>:</p>
The unsupervised bias detection tool performs a series of steps:
<span style="color:#005AA7">Step 1. Data:</span> the user should prepare the following aspects relating to the processed data:
- <span style="color:#005AA7">Dataset:</span> The data must be provided in a tabular format. Any missing values should be removed or replaced.
- <span style="color:#005AA7">Type of data:</span> All columns, except the bias variable column, should have uniform data types, e.g., either all numerical or all categorical. The user selects whether numerical or categorical data are processed.
- <span style="color:#005AA7">Bias variable:</span> A column should be selected from the dataset to serve as the `bias variable`, which needs to be numerical. In step 4, clustering will be performed based on these numerical values. Examples include metrics such as "being classified as high risk", "error rate" or "selected for an investigation".
<span style="color:#005AA7">Step 2. Parameters:</span> the user should set the following hyperparameters:
- <span style="color:#005AA7">Iterations:</span> How often the data are allowed to be split into smaller clusters; by default, 10 iterations are selected.
- <span style="color:#005AA7">Minimal cluster size:</span> The minimum number of datapoints an identified cluster must contain, by default set to 1% of the number of rows in the attached dataset. More guidance on a well-informed choice of the minimal cluster size can be found in section 3.3 of our [scientific paper](/technical-tools/bdt/#scientific-paper).
- <span style="color:#005AA7">Bias variable interpretation:</span> How the bias variable should be interpreted. For instance, when error rate or misclassifications are chosen as the bias variable, a lower value is preferred, as the goal is to minimize errors. Conversely, when accuracy or precision is selected as the bias variable, a higher value is preferred, reflecting the aim to maximize performance.
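The user-supplied settings from steps 1 and 2 could be collected as follows. This is only a sketch with hypothetical names and a made-up dataset size; the actual tool exposes these choices through its web interface.

```python
n_rows = 10_000  # hypothetical number of rows in the attached dataset

hyperparameters = {
    # how often the data are allowed to be split into smaller clusters
    "iterations": 10,
    # minimal cluster size: by default 1% of the number of rows
    "min_cluster_size": max(1, n_rows // 100),
    # bias variable interpretation: False when lower values are preferred
    # (e.g. error rate), True when higher values are preferred (e.g. accuracy)
    "higher_is_better": False,
}
```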
##### Performed by the tool:
<span style="color:#005AA7">Step 3. Train-test data:</span> The dataset is divided into train and test subsets, following an 80-20 ratio.
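Such an 80-20 split can be sketched in a few lines of NumPy (a generic illustration with synthetic data, not the tool's internal code):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical dataset: 200 rows, 3 numerical feature columns.
data = rng.normal(size=(200, 3))

# Shuffle row indices, then split 80-20 into train and test subsets.
indices = rng.permutation(len(data))
split = int(0.8 * len(data))
train, test = data[indices[:split]], data[indices[split:]]
```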
<span style="color:#005AA7">Step 4. Hierarchical Bias-Aware Clustering (HBAC):</span> The HBAC algorithm (detailed below) is applied to the train dataset. The centroids of the resulting clusters are saved and later used to assign cluster labels to data points in the test dataset.
<span style="color:#005AA7">Step 5. Testing cluster differences wrt. bias variable:</span> Statistical hypothesis testing is performed to evaluate whether the bias variable differs significantly in the most deviating cluster compared to the rest of the dataset. A t-test is used to compare the means of the bias variable.
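Step 5 can be sketched with SciPy's independent-samples t-test; the cluster values below are synthetic and for illustration only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Bias variable values (e.g. error rates) for the most deviating cluster
# versus the rest of the test dataset (synthetic numbers).
deviating_cluster = rng.normal(loc=0.35, scale=0.05, size=50)
rest_of_data = rng.normal(loc=0.10, scale=0.05, size=450)

# t-test comparing the means of the bias variable in the two groups.
t_stat, p_value = stats.ttest_ind(deviating_cluster, rest_of_data)
significant = p_value < 0.05
```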
<span style="color:#005AA7">Step 6. Testing cluster differences wrt. features:</span> If a statistically significant difference in the bias variable between the most deviating cluster and the rest of the dataset occurs, feature differences are examined. For this, statistical hypothesis testing is also used: a t-test when numerical data are processed and Pearson’s 𝜒2-test when categorical data are processed. For multiple hypothesis testing, a Bonferroni correction should be applied. Further details can be found in section 3.4 of our [scientific paper](/technical-tools/bdt/#scientific-paper).
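For a categorical feature, Step 6 can be sketched with Pearson's 𝜒2-test and a Bonferroni correction. The contingency table and the number of tested features are made up for illustration:

```python
from scipy import stats

# Hypothetical contingency table for one categorical feature:
# rows = {most deviating cluster, rest of dataset},
# columns = counts per feature category.
table = [[30, 10, 10],
         [90, 180, 180]]

chi2, p_value, dof, expected = stats.chi2_contingency(table)

# Bonferroni correction: with m features tested, each p-value is
# compared against alpha / m instead of alpha.
m_features = 5  # hypothetical number of tested features
alpha = 0.05
significant = p_value < alpha / m_features
```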
A schematic overview of the above steps is depicted below.
</div>
#### How does the clustering algorithm work?
The *Hierarchical Bias-Aware Clustering* (HBAC) algorithm identifies clusters in the provided dataset based on a user-defined `bias variable`. The objective is to find clusters with low variation in the bias variable within each cluster, while variation in the bias variable between clusters should be high. HBAC iteratively finds clusters in the data using k-means (for numerical data) or k-modes clustering (for categorical data). For the initial split, HBAC takes the full dataset and splits it in two clusters. Cluster `C` – with the highest standard deviation of the bias variable – is selected. Then, cluster `C` is divided into two candidate clusters `C'` and `C''`. If the average bias variable value in either candidate cluster exceeds the average bias variable value in `C`, the candidate cluster with the highest bias variable value is selected as a new cluster. This process repeats until the maximum number of iterations (`max_iterations`) is reached or the resulting cluster fails to meet the minimum size requirement (`n_min`). The pseudo-code of the HBAC algorithm is provided below.
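As a complement to the pseudo-code, the iterative splitting procedure can be sketched in plain Python. This is a simplified illustration, not the tool's actual implementation: it uses a naive 2-means step, a synthetic dataset, and assumes a higher bias variable value is worse.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def two_means(X, n_steps=20):
    """Naive 2-means: returns a boolean mask splitting X into two clusters."""
    centers = X[rng.choice(len(X), size=2, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_steps):
        # assign each point to its nearest center, then recompute centers
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in (0, 1):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    return labels == 1

# Synthetic data: features X and a bias variable y (higher = worse),
# with a hidden group whose outcomes deviate.
X = rng.normal(size=(400, 2))
y = rng.normal(loc=0.1, scale=0.05, size=400)
y[X[:, 0] > 1.0] += 0.3

max_iterations, n_min = 10, 20
clusters = [np.arange(len(X))]  # start with the full dataset
for _ in range(max_iterations):
    # select cluster C with the highest std. dev. of the bias variable
    ci = max(range(len(clusters)), key=lambda i: y[clusters[i]].std())
    C = clusters[ci]
    mask = two_means(X[C])
    c1, c2 = C[mask], C[~mask]
    if min(len(c1), len(c2)) < n_min:
        break  # candidate cluster below the minimum size requirement
    # accept the split only if the worse candidate's average bias
    # variable value exceeds that of its parent cluster C
    worst = c1 if y[c1].mean() > y[c2].mean() else c2
    if y[worst].mean() <= y[C].mean():
        break
    clusters[ci:ci + 1] = [c1, c2]

most_deviating = max(clusters, key=lambda idx: y[idx].mean())
```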
The HBAC-algorithm was introduced by Misztal-Radecka and Indurkhya in a [scientific article](https://www.sciencedirect.com/science/article/abs/pii/S0306457321000285) published in *Information Processing and Management* in 2021. Our implementation of the HBAC-algorithm advances this work by adding methodological checks to distinguish real signals from noise, such as sample splitting, statistical hypothesis testing and measuring cluster stability. Algorithm Audit's implementation of the algorithm can be found in the <a href="https://github.com/NGO-Algorithm-Audit/unsupervised-bias-detection/blob/master/README.md" target="_blank">unsupervised-bias-detection</a> pip package.
#### How should the results of the tool be interpreted?
The HBAC algorithm maximizes the difference in the bias variable between clusters. To prevent the incorrect conclusion that there are unwanted deviations in the decision-making process under review when there truly are none, we split the dataset into training and test data; statistical hypothesis testing then prevents us from (wrongly) concluding that there is a difference in the bias variable when there is none. If a statistically significant deviation is detected, the outcome of the tool serves as a starting point for human experts to assess the identified deviations in the decision-making process.