
Commit c01f6ad

Subitems in quick navigation bar
1 parent 81f113f commit c01f6ad

File tree

- content/english/technical-tools/BDT.md
- themes/bigspring-light/layouts/_default/single.html
- tina/collections/shared/page/quick_navigation.ts
- tina/tina-lock.json

4 files changed: +35 -14 lines


content/english/technical-tools/BDT.md

Lines changed: 15 additions & 3 deletions
@@ -1,7 +1,9 @@
 ---
 title: Unsupervised bias detection tool
 subtitle: >
-  A statistical tool that identifies groups where an AI system or algorithm shows deviating performance, potentially indicating unfair treatment. The tool informs which disparities need to be examined manually by domain experts.
+  A statistical tool that identifies groups where an AI system or algorithm
+  shows deviating performance, potentially indicating unfair treatment. The tool
+  informs which disparities need to be examined manually by domain experts.
 image: /images/svg-illustrations/illustration_cases.svg
 quick_navigation:
   title: Content overview
@@ -10,6 +12,7 @@ quick_navigation:
     url: '#info'
   - title: Web app
     url: '#web-app'
+    indent: 1
   - title: Source code
     url: '#source-code'
   - title: Scientific paper and audit report
@@ -99,16 +102,19 @@ type: bias-detection-tool
 <br>
 
 #### What does the tool do?
+
 The tool helps find groups where an AI system or algorithm performs differently, which could indicate unfair treatment. It does this using a technique called <a href="https://en.wikipedia.org/wiki/Cluster_analysis" target="_blank">clustering</a>, which groups similar data points together (in a cluster). The tool doesn’t need information like gender, nationality, or ethnicity to find these patterns. Instead, it uses a `bias score` to measure deviations in the performance of the system, which you can choose based on your data.
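To make the role of the `bias score` concrete, here is a minimal sketch (not part of this commit or of the tool's code) that measures how much each group's error rate deviates from the dataset average:

```python
# Minimal, hypothetical sketch: measuring per-group deviation in a chosen
# bias score, here an error rate where lower values are better.
import pandas as pd

records = pd.DataFrame({
    "cluster": [0, 0, 0, 1, 1, 2, 2, 2],
    "error":   [1, 1, 0, 0, 0, 0, 1, 0],   # bias score column: 1 = misclassified
})

overall = records["error"].mean()                  # error rate over the full dataset
per_cluster = records.groupby("cluster")["error"].mean()
deviation = per_cluster - overall                  # positive = worse than average

print(deviation.sort_values(ascending=False))      # cluster 0 deviates most
```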
 
 #### What results does it give?
+
 The tool finds groups (clusters) in which the performance of the algorithmic system deviates significantly. It highlights the group with the worst `bias score` and creates a bias analysis report, which you can download as a PDF. You can also download all the identified groups (clusters) as a .json file. Additionally, the tool provides visual summaries of the results, helping experts dive deeper into the identified deviations. An example is shown below. {{< tooltip tooltip_content="The figure below shows that cluster 0, the cluster with the highest bias score, includes a higher-than-average proportion of African-American people and a lower-than-average proportion of Caucasian people. For other demographic groups, cluster 0 reflects an average distribution. Additional details about this example are available in the demo dataset." >}}
 
 <div style="margin-bottom:50px; display: flex; justify-content: center;">
 <img src="/images/BDT/example_COMPAS.png" alt="drawing" width="600px"/>
 </div>
 
 #### What kind of data does it work with?
+
 The tool works with data in a table format, consisting solely of numbers or categories. You just need to pick one column in the data to use as the `bias score`. This column should have numbers only, and you’ll specify whether a higher or lower number is better. For example, if you’re looking at error rates, lower numbers are better. For accuracy, higher numbers are better. The tool also comes with a demo dataset you can use by clicking "Try it out."
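As an illustration of the expected input, a small hypothetical table (column names are illustrative, not prescribed by the tool) could look like this:

```python
# Hypothetical input table: all feature columns are categorical and one
# numerical column, "error_rate", is chosen as the bias score (lower is better).
import pandas as pd

data = pd.DataFrame({
    "age_group":  ["18-25", "26-40", "41-65", "26-40"],
    "region":     ["north", "south", "north", "east"],
    "error_rate": [0.30, 0.05, 0.12, 0.08],   # selected as the bias score column
})
```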
 
 <div>
@@ -131,6 +137,7 @@ The tool works with data in a table format, consisting solely of numbers or cate
 <br>
 
 #### Is my data safe?
+
 Yes! Your data stays on your computer and never leaves your organization’s environment. The tool runs directly in your browser, using your computer’s power to analyze the data. This setup, called 'local-only', ensures no data is sent to cloud providers or third parties. Instructions for hosting the tool securely within your organization are available on <a href="https://github.com/NGO-Algorithm-Audit/local-only-web-tool" target="_blank">Github</a>.
 
 Try the tool below ⬇️
@@ -144,14 +151,17 @@ Try the tool below ⬇️
 <br>
 
 #### Which steps does the tool undertake?
+
 The unsupervised bias detection tool performs the following steps:
 
 ##### Prepared by the user:
+
 <span style="color:#005AA7">1. Dataset:</span> The data must be provided in a tabular format. All columns, except the bias score column, should have uniform data types, e.g., either all numerical or all categorical. The bias score column must be numerical. Any missing values should be removed or replaced. The dataset should then be divided into training and testing subsets, following an 80-20 ratio.
 
 <span style="color:#005AA7">2. Bias score:</span> The user selects one column from the dataset to serve as the `bias score`. In step 3, clustering will be performed based on this chosen `bias score`. The chosen bias score must be numerical. Examples include metrics such as "being classified as high risk", "error rate" or "selected for an investigation".
 
 ##### Performed by the tool:
+
 <span style="color:#005AA7">3. Hierarchical Bias-Aware Clustering (HBAC):</span> The HBAC algorithm (detailed below) is applied to the training dataset. The centroids of the resulting clusters are saved and later used to assign cluster labels to data points in the test dataset.
 
 <span style="color:#005AA7">4. Testing differences in bias score:</span> Statistical hypothesis testing is performed to evaluate whether the most deviating cluster contains significantly more bias than the rest of the dataset. A two-sample t-test is used to compare the bias scores between clusters. For multiple hypothesis testing, Bonferroni correction should be applied. Further details are available in our [scientific paper](/technical-tools/bdt/#scientific-paper).
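A minimal sketch of steps 1 and 4, assuming cluster labels from step 3 are already available; it illustrates the sample splitting and the Bonferroni-corrected two-sample t-tests and is not the tool's exact implementation:

```python
# Illustrative sketch of the sample splitting and hypothesis testing steps;
# the cluster labels are faked here, since they would come from HBAC (step 3).
import numpy as np
from scipy import stats
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
bias_score = rng.normal(size=1000)                      # numerical bias score column
features = rng.normal(size=(1000, 5))

# Step 1: 80-20 split into training and test data.
X_train, X_test, y_train, y_test = train_test_split(
    features, bias_score, test_size=0.2, random_state=0
)

# Step 3 (assumed): clusters found on the training data label the test data.
test_clusters = rng.integers(0, 3, size=len(y_test))

# Step 4: two-sample t-test per cluster vs. the rest, Bonferroni-corrected
# for testing multiple clusters.
n_clusters = int(test_clusters.max()) + 1
alpha = 0.05 / n_clusters                               # corrected significance threshold
for c in range(n_clusters):
    in_cluster = y_test[test_clusters == c]
    rest = y_test[test_clusters != c]
    t_stat, p_value = stats.ttest_ind(in_cluster, rest, equal_var=False)
    print(f"cluster {c}: t={t_stat:.2f}, p={p_value:.3f}, significant={p_value < alpha}")
```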
@@ -163,6 +173,7 @@ A schematic overview of the above steps is depicted below.
 </div>
 
 #### How does the clustering algorithm work?
+
 The *Hierarchical Bias-Aware Clustering* (HBAC) algorithm identifies clusters in the provided dataset based on a user-defined `bias score`. The objective is to find clusters with low variation in the bias score within each cluster and significant variation between clusters. HBAC iteratively finds clusters in the data using k-means (for numerical data) or k-modes clustering (for categorical data). For the initial split, HBAC takes the full dataset and splits it into two clusters. The cluster `C` with the highest standard deviation of the bias score is selected. Then, cluster `C` is divided into two candidate clusters `C'` and `C''`. If the average bias score in either candidate cluster exceeds the average bias score in `C`, the candidate cluster with the highest bias score is selected as a new cluster. This process repeats until the maximum number of iterations (`max_iterations`) is reached or the resulting cluster fails to meet the minimum size requirement (`n_min`). The pseudo-code of the HBAC algorithm is provided below.
 
 <div style="display: flex; justify-content: center;">
@@ -172,7 +183,8 @@ The *Hierarchical Bias-Aware Clustering* (HBAC) algorithm identifies clusters in
 The HBAC-algorithm was introduced by Misztal-Radecka and Indurkhya in a [scientific article](https://www.sciencedirect.com/science/article/abs/pii/S0306457321000285) published in *Information Processing and Management* in 2021. Our implementation advances this work by adding methodological checks to distinguish real bias from noise, such as sample splitting, statistical hypothesis testing and measuring cluster stability. Algorithm Audit's implementation of the algorithm can be found in the <a href="https://github.com/NGO-Algorithm-Audit/unsupervised-bias-detection/blob/master/README.md" target="_blank">unsupervised-bias-detection</a> pip package.
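For readers who want to see the splitting loop in code, the sketch below mirrors the HBAC description above using scikit-learn's k-means; it is a simplified illustration and not the API of the pip package:

```python
# Simplified, illustrative sketch of the HBAC splitting loop described above
# (numerical data, k-means only); not the unsupervised-bias-detection API.
import numpy as np
from sklearn.cluster import KMeans

def hbac(features, bias_score, max_iterations=10, n_min=30):
    """Iteratively split the cluster whose bias score varies the most."""
    clusters = [np.arange(len(features))]          # start with one cluster: all rows
    for _ in range(max_iterations):
        # select the cluster C with the highest standard deviation of the bias score
        idx = max(range(len(clusters)), key=lambda i: bias_score[clusters[i]].std())
        members = clusters[idx]
        if len(members) < 2 * n_min:
            break
        # split C into two candidate clusters C' and C'' with k-means
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features[members])
        cand_a, cand_b = members[labels == 0], members[labels == 1]
        if min(len(cand_a), len(cand_b)) < n_min:
            break
        # accept the split only if a candidate exceeds the parent's average bias score
        parent_mean = bias_score[members].mean()
        if max(bias_score[cand_a].mean(), bias_score[cand_b].mean()) <= parent_mean:
            break
        clusters[idx:idx + 1] = [cand_a, cand_b]   # replace the parent by its two children
    return clusters
```

Hypothetical usage: `clusters = hbac(features, bias_score)` with both arguments given as NumPy arrays; the cluster whose members have the worst average bias score would then be inspected further.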
 
 #### How should the results of the tool be interpreted?
-The HBAC algorithm maximizes the difference in the bias score between clusters. To avoid concluding that there is bias in the decision-making process under review when there truly is none, we split the dataset into training and test data and apply hypothesis testing, which guards against (wrongly) concluding that there is a difference in the bias score when there is none. If statistically significant bias is detected, the outcome of the tool serves as a starting point for human experts to assess potential discrimination in the decision-making process.
+
+The HBAC algorithm maximizes the difference in the bias score between clusters. To avoid concluding that there is bias in the decision-making process under review when there truly is none, we split the dataset into training and test data and apply hypothesis testing, which guards against (wrongly) concluding that there is a difference in the bias score when there is none. If statistically significant bias is detected, the outcome of the tool serves as a starting point for human experts to assess potential discrimination in the decision-making process.
 
 {{< container_close >}}
 
@@ -189,7 +201,6 @@ The HBAC algorithm maximizes the difference in the bias score between clusters.
 {{< container_open title="Source code" id="source-code" icon="fas fa-toolbox" >}}
 
 * The source code of the anomaly detection algorithm is available on <a href="https://github.com/NGO-Algorithm-Audit/unsupervised-bias-detection" target="_blank">Github</a> and as a <a href="https://pypi.org/project/unsupervised-bias-detection/" target="_blank">pip package</a>: `pip install unsupervised-bias-detection`.
-
 * The architecture to run web apps local-only is also available on <a href="https://github.com/NGO-Algorithm-Audit/local-only-web-tool" target="_blank">Github</a>.
 
 {{< container_close >}}
@@ -211,6 +222,7 @@ The unsupervised bias detection tool has been applied in practice to audit a Dut
 <br>
 
 #### What is local-only?
+
 Local-only computing is the opposite of cloud computing: the data is not uploaded to third parties, such as cloud providers, but is processed on your own computer. The data attached to the tool therefore doesn't leave your computer or the environment of your organization. The tool is privacy-friendly because the data can be processed within the mandate of your organisation and doesn't need to be shared with new parties. The unsupervised bias detection tool can also be hosted locally within your organization. Instructions, including the source code of the web app, can be found on <a href="https://github.com/NGO-Algorithm-Audit/local-only-web-tool" target="_blank">Github</a>.
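As a rough illustration of local-only hosting (assuming the web app is built into a static bundle; the Github repository contains the actual build and hosting instructions):

```python
# Rough illustration of local-only hosting, assuming the web app has been
# built into a static ./dist directory (see the Github repository for the
# actual instructions). Nothing is sent to external servers.
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler

handler = functools.partial(SimpleHTTPRequestHandler, directory="dist")
HTTPServer(("127.0.0.1", 8000), handler).serve_forever()   # http://localhost:8000
```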
 
 #### Overview of local-only architecture

themes/bigspring-light/layouts/_default/single.html

Lines changed: 11 additions & 5 deletions
@@ -37,11 +37,17 @@ <h2 class="mb-4" style="color: #005aa7;">{{ .Title }}</h2>
 <h3>{{.Params.quick_navigation.title}}</h3>
 <ul>
   {{ range .Params.quick_navigation.links }}
-  <li class="quick_navigation_li d-flex align-items-center mb-0">
-    <a href='{{.url}}' class="highlight-potential-sm d-inline-block ml-3">
-      {{.title}}
-    </a>
-  </li>
+  {{ range $index, $num := (seq .indent) }}
+  <div class="ml-4">
+  {{ end }}
+  <li class="quick_navigation_li d-flex align-items-center mb-0">
+    <a href='{{.url}}' class="highlight-potential-sm d-inline-block ml-3">
+      {{.title}}
+    </a>
+  </li>
+  {{ range $index, $num := (seq .indent) }}
+  </div>
+  {{ end }}
   {{ end }}
 </ul>
 </div>

tina/collections/shared/page/quick_navigation.ts

Lines changed: 8 additions & 5 deletions
@@ -2,6 +2,7 @@
  * @type {import('tinacms').TinaField}
  */
 import { TinaField } from "tinacms";
+import title from "./title";
 import url from "./url";
 
 const quick_navigation: TinaField = {
@@ -28,13 +29,15 @@ const quick_navigation: TinaField = {
       },
     },
     fields: [
+      title,
+      url,
       {
-        type: "string",
-        name: "title",
-        label: "Title",
-        required: true,
+        type: "number",
+        name: "indent",
+        label: "Indent",
+        description: "Indent level for the link, 0 is the default",
+        required: false,
       },
-      url,
     ],
   },
 ],

tina/tina-lock.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

0 commit comments
