Commit 647c192

new report update
1 parent a7423bb commit 647c192

6 files changed

+61
-14
lines changed

content/.DS_Store

0 Bytes
Binary file not shown.

content/english/knowledge-platform/knowledge-base/20240429_BuZa_SR.md

Lines changed: 12 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -1,13 +1,13 @@
11
---
22
title: >-
3-
Bias assessment report Short Stay Visum – SigmaRed for the Dutch Ministry of
4-
Ministry of Foreign Affairs
3+
Bias report Short Stay Visum – SigmaRed for the Dutch Ministry of
4+
Foreign Affairs
55
subtitle: >
6-
Bias assessment of Dutch Ministry of Ministry of Foreign Affairs' short stay
7-
visa classification model by SigmaRed Technologies. The report concludes there
8-
is no disproportionate discrimination based on age marital status or gender.
9-
However, we note that essential validation criteria are lacking in the report,
10-
which are essential to support these conclusions.
6+
Bias report requested by the Dutch Ministry of Foreign Affairs on
7+
a short stay visa classification model by SigmaRed Technologies. The report
8+
concludes there is no disproportionate discrimination based on age, marital
9+
status or gender. However, we note that validation criteria essential to
10+
support these conclusions are lacking in the report.
1111
image: /images/knowledge_base/BuZa_SR.png
1212
author: SigmaRed
1313
type: regular
@@ -18,9 +18,9 @@ summary: >-
1818

1919
Full report: [https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17777\&did=2024D17777](https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17777\&did=2024D17777)
2020

21-
#### Bias assessment report Short Stay Visum – SigmaRed for the Dutch Ministry of Ministry of Foreign Affairs
21+
#### Bias report Short Stay Visum – SigmaRed for the Dutch Ministry of Foreign Affairs
2222

23-
Bias assessment report requested by the Dutch Ministry of Ministry of Foreign Affairs on the application process for short-stay visas (known as Kort Verblijf Visum, or KVV), which makes use of a rule-based classification model to categorise applicants into a fast, regular or intensive track. The goal of the study is to "detect and assess potential inter-group bias by examining the relationship between risk profile percentages and rejection rates across different demographic groups".
23+
Bias report requested by the Dutch Ministry of Foreign Affairs on the application process for short-stay visas (known as Kort Verblijf Visum, or KVV), which makes use of a rule-based classification model to categorise applicants into a fast, regular or intensive track. The goal of the study is to "detect and assess potential inter-group bias by examining the relationship between risk profile percentages and rejection rates across different demographic groups".
2424

2525
Based on a comparative analysis of disparate impact ratios between 2022 and 2023, it is concluded that "no disproportionate discrimination based on age marital status or gender" is found\*. In the report, the rationale for excluding many bias metrics is provided. However, this explanation is absent for conditional demographic parity (CDP). Despite the common <a href="https://arxiv.org/abs/2005.05906" target="_blank">understanding</a> that CDP is suggested as an alternative to DI to mitigate Simpson's paradox, the authors do not clarify why DI is favored over CDP. Using CDP as a bias metric may result in different quantitative outcomes that might fail to support the current conclusion of the report.
2626
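The contrast between an aggregate disparate impact (DI) ratio and conditional demographic parity (CDP) discussed above can be illustrated with a minimal Python sketch. All counts, the group labels `A` and `B`, and the conditioning attribute (purpose of travel) are hypothetical and not taken from the report. CDP is read here, in simplified form, as comparing favourable-outcome rates within each stratum of a conditioning attribute.

```python
# Hypothetical counts: (stratum, group) -> (applicants, favourable outcomes).
# None of these numbers come from the SigmaRed report.
counts = {
    ("tourism", "A"): (80, 64),
    ("tourism", "B"): (20, 16),
    ("business", "A"): (20, 8),
    ("business", "B"): (80, 32),
}


def rate(group, stratum=None):
    """Favourable-outcome rate for a group, optionally within one stratum."""
    cells = [
        v for (s, g), v in counts.items()
        if g == group and (stratum is None or s == stratum)
    ]
    total = sum(n for n, _ in cells)
    favourable = sum(f for _, f in cells)
    return favourable / total


def impact_ratio(protected, reference, stratum=None):
    """Ratio of favourable-outcome rates: protected group over reference group."""
    return rate(protected, stratum) / rate(reference, stratum)


# Aggregate DI ratio, ignoring the conditioning attribute
overall_di = impact_ratio("B", "A")

# Simplified CDP view: the same ratio within each stratum
conditional = {s: impact_ratio("B", "A", s) for s in ("tourism", "business")}

print(f"aggregate DI ratio: {overall_di:.2f}")
for stratum, ratio in conditional.items():
    print(f"ratio within {stratum}: {ratio:.2f}")
```

With these made-up counts the aggregate ratio is about 0.67, while the ratio within each stratum is 1.0, so the two metrics would support different conclusions on the same data.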

@@ -34,15 +34,15 @@ Moreover, the bias assessment does not evaluate the eligibility of the selection
3434
6. Marital status
3535
7. Professional.
3636

37-
Before such selection criteria can be included in a risk profile, it is imperative to justify why differentiation based on these criteria is essential, proportionate and necessary. <a href="https://publicaties.mensenrechten.nl/publicatie/61a734e65d726f72c45f9dce" target="_blank">Guidelines</a> from the Netherlands Institute on Human Rights outline this obligation. For instance, quantitative evidence supporting the inclusion of selection criteria in a risk profile could be obtained through hypothesis testing on random samples of visa applicants. It is unclear why this obvious first step in assessing bias in risk profiling is absent in the report.
37+
Before such selection criteria can be included in a risk profile, it is imperative to justify why differentiation based on these criteria is proportional, suitable and necessary. <a href="https://publicaties.mensenrechten.nl/publicatie/61a734e65d726f72c45f9dce" target="_blank">Guidelines</a> from the Netherlands Institute on Human Rights outline this obligation. For instance, quantitative evidence supporting the inclusion of selection criteria in a risk profile could be obtained through hypothesis testing on random samples of visa applicants. It is unclear why this obvious first step in assessing bias in risk profiling is absent in the report.
3838

3939
In the context of differentiation on the basis of age, the Netherlands Institute on Human Rights explains:
4040

4141
> "It is not necessarily prohibited for an algorithm to consider someone’s age. However, there must be a clear connection between age and the aim pursued. Until it is shown that someone’s age increases the likelihood \[of a rejected visum application], age is ineligible as a selection criteria in algorithmic-driven selection procedures."
4242
43-
So, it's remarkable that the assessment solely focusses on the quantitative aspects of bias and fairness and concludes no age discrimination occurs.
43+
So, it's remarkable that the assessment focuses solely on the quantitative aspects of bias testing and concludes that no age discrimination occurs.
4444

45-
In general, the socio-technological nature of algorithmic-driven decision-making processes is overlooked in this bias assessment. It is crucial to recognise that mitigating algorithmic bias requires attention of both the quantitative and qualitative reasoning paradigm, not only including numbers but also fostering organisational checks and balances, and a safe working environment that promotes open discussion about data-driven decision-making.
45+
In general, the organisational and qualitative dimension of deploying algorithmic-driven decision-making processes is not covered in this bias assessment. This is noteworthy, as experts argue that both the quantitative and the qualitative reasoning paradigm are needed to assess bias in algorithmic-driven decision-making. No quantitative silver bullet exists to mitigate algorithmic bias. Algorithms are designed by people, and hence organisational checks and balances, including algorithm risk management frameworks, need to be reviewed to assess bias in algorithms. The absence of a qualitative review of the above-mentioned profiling criteria 1-7 is therefore a weak spot of the report.
4646

4747
Lastly, instead of using advanced causal inference techniques such as inverse probability weighting (IPW) and instrumental variable (IV) analysis to assess whether the rule-based classification model had a direct effect on the decisions made by officers, a preference is given to the simpler F-test.
4848
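The hypothesis testing on random samples mentioned above, as a way to justify a selection criterion such as age, could for example take the form of a two-proportion z-test. The sketch below is one possible choice of test, not the method used in the report, and the sample sizes and rejection counts are invented for illustration.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability under the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical random sample: rejected applications among younger (group 1)
# and older (group 2) applicants. These counts are not from the report.
z, p_value = two_proportion_z_test(x1=30, n1=200, x2=20, n2=200)
significant = p_value < 0.05

print(f"z = {z:.3f}, p = {p_value:.3f}, significant at 5%: {significant}")
```

In this invented sample the difference in rejection rates (15% versus 10%) is not significant at the 5% level, so under the guideline quoted above age would not yet be shown to be an eligible selection criterion.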

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
1+
---
2+
title: 'Examining confirmation bias using the F-test'
3+
subtitle: >
4+
An F-test is applied to test whether a rule-based classification model
5+
had a direct effect on the decisions made by civil servants reviewing Dutch
6+
visa applications
7+
image: /images/knowledge_base/BuZa_Ftest.png
8+
author: Onbekend
9+
type: regular
10+
summary: >-
11+
Applied to test for confirmation bias among civil servants in an algorithmic-driven
12+
visa application process
13+
---
14+
15+
Full report: [https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17779\&did=2024D17779](https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17779\&did=2024D17779)
16+
17+
#### Examining confirmation bias using the F-test
18+
19+
In the short-stay visa application process of the Dutch Ministry of Foreign Affairs (known as Kort Verblijf Visum, or KVV), a rule-based classification model is used to categorise applicants into a fast, regular or intensive track. The goal of this experiment is to examine "whether labeling visa applications has an effect on decisions made by civil servants reviewing the case?".
20+
21+
Based on the outcomes of a field experiment, in which 42 fictional cases were presented to civil servants, a one-sided F-test (ANOVA, fixed effects, omnibus) is applied. At a 5% significance level, no evidence is found that labeling had an effect on the decisions taken.
22+
23+
This form of hypothesis testing is preferred over more advanced forms of causal inference, such as inverse probability weighting (IPW) and instrumental variable (IV) analysis. See also this [article](/knowledge-platform/knowledge-base/20240429_buza_sr/).
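
As a rough illustration of the one-way fixed-effects ANOVA F-test named above, the following Python sketch computes the F statistic for three label groups. The outcome values are made up and do not reproduce the 42 cases of the field experiment. The function name `one_way_anova_f` and the numeric outcome scale are assumptions for this example.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way fixed-effects ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)      # number of groups (labels)
    n = len(all_values)  # total number of observations

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical outcome scores per label (fast, regular, intensive)
fast = [1, 2, 3]
regular = [2, 3, 4]
intensive = [3, 4, 5]

f_stat, df_between, df_within = one_way_anova_f([fast, regular, intensive])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

For this toy data the statistic is F(2, 6) = 3.00, which is below the tabulated 5% critical value of roughly 5.14, so the null hypothesis of equal group means would not be rejected. A p-value can be obtained from the F distribution with these degrees of freedom, for instance with SciPy's `scipy.stats.f.sf`, which is left out here to keep the sketch dependency-free.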

content/nederlands/knowledge-platform/knowledge-base/20240429_BuZa_SR.md

Lines changed: 2 additions & 2 deletions
@@ -43,8 +43,8 @@ In the context of age discrimination, the Netherlands Institute on Human Rights explains:
4343
4444
It is therefore remarkable that the conclusion (that the algorithm does not discriminate on the basis of age) is based solely on quantitative results.
4545

46-
In general, the organisational and qualitative dimension of research into bias in algorithms is not addressed in the report. This is remarkable, since experts urge a multidisciplinary perspective in such research. The consensus is that the responsible deployment of algorithms cannot be settled purely quantitatively. It is not only about the bias metrics; the operation of algorithms remains human work. Examining organisational conduct, roles and responsibilities, and a work culture in which difficult data-modelling questions can be discussed within the organisation is of great importance here.
46+
In general, the organisational and qualitative dimension of research into bias in algorithms is not addressed in the report. This is remarkable, since experts urge a multidisciplinary perspective in such research. The consensus is that the responsible deployment of algorithms cannot be settled purely quantitatively. It is not only about the bias metrics; the operation of algorithms remains human work. Examining organisational conduct, roles and responsibilities, and a work culture in which difficult data-modelling questions can be discussed within the organisation is of great importance here. Given that a qualitative interpretation of the selection criteria 1-7 used in the risk profile is missing, this is a shortcoming of the report.
4747

48-
Finally, the report applies advanced causal inference methods to test the relationship between the model's classification (fast, regular, intensive) and a civil servant's assessment of the visa application, including inverse probability weighting (IPW) and instrumental variable (IV) analysis. It is unclear why the simpler F-test, as applied here to the same case, was not chosen.
48+
Finally, the report applies advanced causal inference methods to test the relationship between the model's classification (fast, regular, intensive) and a civil servant's assessment of the visa application, including inverse probability weighting (IPW) and instrumental variable (IV) analysis. It is unclear why the simpler F-test, as applied [here](/nl/knowledge-platform/knowledge-base/20242904_f-test_confirmation_bias/) to the same case, was not chosen.
4949

5050
\*for visa applicants with Yemeni nationality, a certain degree of unequal treatment is measured
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
---
2+
title: 'Testing for confirmation bias using the F-test'
3+
subtitle: >
4+
An F-test is applied to test for a relationship between the classification
5+
of an algorithm and the assessment by a civil servant in the context of
6+
fast, regular and intensive visa applications
7+
image: /images/knowledge_base/BuZa_Ftest.png
8+
author: Onbekend
9+
type: regular
10+
summary: >-
11+
Applied to test for a relationship between the classification of an algorithm and
12+
the assessment by a civil servant in the context of fast, regular and
13+
intensive visa applications
14+
---
15+
16+
Full report: [https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17779\&did=2024D17779](https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17779\&did=2024D17779)
17+
18+
#### Testing for confirmation bias using the F-test
19+
20+
In the short-stay visa application procedure (Kort Verblijf Visum, KVV) of the Dutch Ministry of Foreign Affairs, a rule-based classification model is used to assign applicants to a fast, regular or intensive application track. The goal of this study is to determine "to what extent labeling of visa applications influences the outcomes of visa decisions made by decision officers?".
21+
22+
A one-sided F-test (ANOVA, fixed effects, omnibus) was applied to a field experiment in which 42 fictional cases were presented to staff. At a 5% significance level, no evidence is found that the label had an effect on the decision taken.
23+
24+
This form of hypothesis testing is preferred over more advanced forms of causal inference, such as inverse probability weighting (IPW) and instrumental variable (IV) analysis. See also this [article](/nl/knowledge-platform/knowledge-base/20240429_buza_sr/).
288 KB