content/english/knowledge-platform/knowledge-base/20240429_BuZa_SR.md
Lines changed: 12 additions & 12 deletions
@@ -1,13 +1,13 @@
1
1
---
2
2
title: >-
3
-
Bias assessment report Short Stay Visum – SigmaRed for the Dutch Ministry of
4
-
Ministry of Foreign Affairs
3
+
Bias report Short Stay Visum – SigmaRed for the Dutch Ministry of
4
+
Foreign Affairs
5
5
subtitle: >
6
-
Bias assessment of Dutch Ministry of Ministry of Foreign Affairs' short stay
7
-
visa classification model by SigmaRed Technologies. The report concludes there
8
-
is no disproportionate discrimination based on age marital status or gender.
9
-
However, we note that essential validation criteria are lacking in the report,
10
-
which are essential to support these conclusions.
6
+
Bias report requested by the Dutch Ministry of Foreign Affairs on
7
+
a short stay visa classification model by SigmaRed Technologies. The report
8
+
concludes there is no disproportionate discrimination based on age, marital
9
+
status or gender. However, we note that validation criteria are
10
+
lacking in the report, which are essential to support these conclusions.
11
11
image: /images/knowledge_base/BuZa_SR.png
12
12
author: SigmaRed
13
13
type: regular
@@ -18,9 +18,9 @@ summary: >-
18
18
19
19
Full report: [https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17777\&did=2024D17777](https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17777\&did=2024D17777)
20
20
21
-
#### Bias assessment report Short Stay Visum – SigmaRed for the Dutch Ministry of Ministry of Foreign Affairs
21
+
#### Bias report Short Stay Visum – SigmaRed for the Dutch Ministry of Foreign Affairs
22
22
23
-
Bias assessment report requested by the Dutch Ministry of Ministry of Foreign Affairs on the application process for short-stay visas (known as Kort Verblijf Visum, or KVV), which makes use of a rule-based classification model to categorise applicants into a fast, regular or intensive track. The goal of the study is to "detect and assess potential inter-group bias by examining the relationship between risk profile percentages and rejection rates across different demographic groups".
23
+
Bias report requested by the Dutch Ministry of Foreign Affairs on the application process for short-stay visas (known as Kort Verblijf Visum, or KVV), which makes use of a rule-based classification model to categorise applicants into a fast, regular or intensive track. The goal of the study is to "detect and assess potential inter-group bias by examining the relationship between risk profile percentages and rejection rates across different demographic groups".
24
24
25
25
Based on a comparative analysis of disparate impact (DI) ratios between 2022 and 2023, it is concluded that "no disproportionate discrimination based on age marital status or gender" is found\*. The report provides a rationale for excluding many bias metrics. However, this explanation is absent for conditional demographic parity (CDP). Despite the common <a href="https://arxiv.org/abs/2005.05906" target="_blank">understanding</a> that CDP is suggested as an alternative to DI to mitigate Simpson's paradox, the authors do not clarify why DI is favored over CDP. Using CDP as a bias metric may result in different quantitative outcomes that might fail to support the current conclusion of the report.
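To illustrate why the choice between DI and CDP can matter, the following is a minimal sketch on made-up numbers. The groups `A` and `B`, the strata `S1` and `S2`, and all counts are hypothetical and not taken from the report. The per-stratum ratio used here is a simplified stand-in for conditional demographic parity, not the exact formulation from the linked paper.

```python
# Hypothetical counts, not taken from the report:
# (group, stratum) -> (favourable outcomes, total applicants)
counts = {
    ("A", "S1"): (80, 100),
    ("B", "S1"): (16, 20),
    ("A", "S2"): (4, 20),
    ("B", "S2"): (20, 100),
}


def rate(favourable, total):
    """Share of applicants with a favourable outcome."""
    return favourable / total


def disparate_impact(counts, protected, reference):
    """Ratio of aggregate favourable-outcome rates (protected / reference)."""
    def aggregate(group):
        fav = sum(f for (g, _), (f, _t) in counts.items() if g == group)
        tot = sum(t for (g, _), (_f, t) in counts.items() if g == group)
        return rate(fav, tot)
    return aggregate(protected) / aggregate(reference)


def conditional_ratios(counts, protected, reference):
    """Same ratio, computed separately within each stratum."""
    strata = sorted({s for (_, s) in counts})
    return {
        s: rate(*counts[(protected, s)]) / rate(*counts[(reference, s)])
        for s in strata
    }


di = disparate_impact(counts, "B", "A")
per_stratum = conditional_ratios(counts, "B", "A")
print(f"aggregate DI: {di:.3f}")
print(f"per-stratum ratios: {per_stratum}")
```

In this toy data the aggregate DI is roughly 0.43, while the ratio within each stratum is 1.0. The aggregate disparity is driven entirely by how the two groups are distributed over the strata, which is the Simpson's paradox effect that a conditional metric is meant to expose.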
26
26
@@ -34,15 +34,15 @@ Moreover, the bias assessment does not evaluate the eligibility of the selection
34
34
6. Marital status
35
35
7. Professional.
36
36
37
-
Before such selection criteria can be included in a risk profile, it is imperative to justify why differentiation based on these criteria is essential, proportionate and necessary. <a href="https://publicaties.mensenrechten.nl/publicatie/61a734e65d726f72c45f9dce" target="_blank">Guidelines</a> from the Netherlands Institute on Human Rights outline this obligation. For instance, quantitative evidence supporting the inclusion of selection criteria in a risk profile could be obtained through hypothesis testing on random samples of visa applicants. It is unclear why this obvious first step in assessing bias in risk profiling is absent in the report.
37
+
Before such selection criteria can be included in a risk profile, it is imperative to justify why differentiation based on these criteria is proportional, suitable and necessary. <a href="https://publicaties.mensenrechten.nl/publicatie/61a734e65d726f72c45f9dce" target="_blank">Guidelines</a> from the Netherlands Institute on Human Rights outline this obligation. For instance, quantitative evidence supporting the inclusion of selection criteria in a risk profile could be obtained through hypothesis testing on random samples of visa applicants. It is unclear why this obvious first step in assessing bias in risk profiling is absent in the report.
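One possible form of such a hypothesis test is sketched below: a two-proportion z-test comparing rejection rates in two random samples of applicants, with and without a given selection criterion. The function name `two_proportion_z_test` and all counts are hypothetical. The report does not prescribe this test, and other tests could be equally suitable.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Hypothetical random samples, not taken from the report:
# rejected applications among applicants with and without the criterion.
z, p_value = two_proportion_z_test(x1=30, n1=200, x2=20, n2=200)
print(f"z = {z:.3f}, p = {p_value:.3f}")
```

If the p-value stays above the chosen significance level, the sample offers no evidence that the criterion is associated with a higher rejection rate, which would weaken the case for including it in the risk profile.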
38
38
39
39
In the context of differentiation on the basis of age, the Netherlands Institute on Human Rights explains:
40
40
41
41
> "It is not necessarily prohibited for an algorithm to consider someone’s age. However, there must be a clear connection between age and the aim pursued. Until it is shown that someone’s age increases the likelihood \[of a rejected visum application], age is ineligible as a selection criteria in algorithmic-driven selection procedures."
42
42
43
-
So, it's remarkable that the assessment solely focusses on the quantitative aspects of bias and fairness and concludes no age discrimination occurs.
43
+
So, it's remarkable that the assessment solely focusses on the quantitative aspects of bias testing and concludes that no age discrimination occurs.
44
44
45
-
In general, the socio-technological nature of algorithmic-driven decision-making processes is overlooked in this bias assessment. It is crucial to recognise that mitigating algorithmic bias requires attention of both the quantitative and qualitative reasoning paradigm, not only including numbers but also fostering organisational checks and balances, and a safe working environment that promotes open discussion about data-driven decision-making.
45
+
In general, the organisational and qualitative dimension of deploying algorithmic-driven decision-making processes is not covered in this bias assessment. This is noteworthy as experts argue that both the quantitative and the qualitative reasoning paradigm are needed to assess bias in algorithmic-driven decision-making. No quantitative silver bullet exists to mitigate algorithmic bias. Algorithms are designed by people, and hence organisational checks and balances, including algorithm risk management frameworks, need to be reviewed to assess bias in algorithms. Given the absence of a qualitative review of the above-mentioned profiling criteria 1-7, this is a weak spot of the report.
46
46
47
47
Lastly, instead of using advanced causal inference techniques such as inverse probability weighting (IPW) and instrumental variable (IV) analysis to assess whether the rule-based classification model had a direct effect on the decisions made by officers, a preference is given to the simpler F-test.
title: 'Examining confirmation bias using the F-test '
3
+
subtitle: >
4
+
An F-test is applied to test whether the rule-based classification model
5
+
had a direct effect on the decisions made by civil servants reviewing Dutch
6
+
visa applications
7
+
image: /images/knowledge_base/BuZa_Ftest.png
8
+
author: Unknown
9
+
type: regular
10
+
summary: >-
11
+
Applied to test confirmation bias of civil servants in an algorithmic-driven
12
+
visa application process
13
+
---
14
+
15
+
Full report: [https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17779\&did=2024D17779](https://www.tweedekamer.nl/kamerstukken/detail?id=2024D17779\&did=2024D17779)
16
+
17
+
#### Examining confirmation bias using the F-test
18
+
19
+
In the short-stay visa application process of the Dutch Ministry of Foreign Affairs (known as Kort Verblijf Visum, or KVV), a rule-based classification model is used to categorise applicants into a fast, regular or intensive track. The goal of this experiment is to examine "whether labeling visa applications has an effect on decisions made by civil servants reviewing the case?".
20
+
21
+
Based on the outcomes of a field experiment, in which 42 fictional cases are presented to civil servants, a one-sided F-test (ANOVA, fixed effects, omnibus) is applied. At a 5% significance level, no evidence is found that labeling had an effect on the decisions taken.
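As a rough illustration of the test statistic involved, the sketch below computes the F statistic of a one-way fixed-effects ANOVA by hand. The decisions listed are invented toy data (12 cases, not the 42 cases of the experiment), and the helper `one_way_anova_f` is a hypothetical name, not something taken from the report.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way fixed-effects ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical decisions (1 = approved, 0 = rejected) per label shown to the reviewer
decisions = {
    "fast": [1, 1, 1, 0],
    "regular": [1, 0, 1, 0],
    "intensive": [0, 0, 1, 0],
}
f_stat, dfn, dfd = one_way_anova_f(list(decisions.values()))
print(f"F({dfn}, {dfd}) = {f_stat:.3f}")
```

For this toy data the statistic is F(2, 9) = 0.9. A p-value would follow from the upper tail of the F distribution with those degrees of freedom, for example via `scipy.stats.f.sf(f_stat, dfn, dfd)` if SciPy is available.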
22
+
23
+
This form of hypothesis testing is preferred over more advanced forms of causal inference, such as inverse probability weighting (IPW) and instrumental variable (IV) analysis. See also this [article](/knowledge-platform/knowledge-base/20240429_buza_sr/).
content/nederlands/knowledge-platform/knowledge-base/20240429_BuZa_SR.md
Lines changed: 2 additions & 2 deletions
@@ -43,8 +43,8 @@ Voor leeftijdsdiscriminatie licht het College van de Rechten van de Mens toe:
43
43
44
44
It is therefore remarkable that the conclusion (that the algorithm does not discriminate on the basis of age) is based solely on quantitative results.
45
45
46
-
In general, the organisational and qualitative dimension of research into bias in algorithms is not addressed in the report. This is remarkable, since experts urge a multidisciplinary perspective in such research. The consensus is that the responsible deployment of algorithms cannot be settled purely quantitatively. It is not only about the bias metrics; the functioning of algorithms remains human work. Examining organisational practices, roles and responsibilities, and the working culture for discussing difficult data-modelling questions within the organisation is of great importance here.
46
+
In general, the organisational and qualitative dimension of research into bias in algorithms is not addressed in the report. This is remarkable, since experts urge a multidisciplinary perspective in such research. The consensus is that the responsible deployment of algorithms cannot be settled purely quantitatively. It is not only about the bias metrics; the functioning of algorithms remains human work. Examining organisational practices, roles and responsibilities, and the working culture for discussing difficult data-modelling questions within the organisation is of great importance here. Given that a qualitative interpretation of the selection criteria 1-7 used in the risk profile is missing, this is a shortcoming of the report.
47
47
48
-
Finally: the report applies advanced methods for causal inference to test the relationship between the model's classification (fast, regular, intensive) and the assessment of the visa application by a civil servant, including inverse probability weighting (IPW) and instrumental variable (IV) analysis. It is unclear why the simpler F-test, as applied here to the same case, was not chosen.
48
+
Finally: the report applies advanced methods for causal inference to test the relationship between the model's classification (fast, regular, intensive) and the assessment of the visa application by a civil servant, including inverse probability weighting (IPW) and instrumental variable (IV) analysis. It is unclear why the simpler F-test, as applied [here](/nl/knowledge-platform/knowledge-base/20242904_f-test_confirmation_bias/) to the same case, was not chosen.
49
49
50
50
\*for visa applicants with Yemeni nationality, a certain degree of unequal treatment is measured
#### Testing for confirmation bias using the F-test
19
+
20
+
In the short-stay visa (Kort Verblijf Visum, KVV) application procedure of the Ministry of Foreign Affairs, a rule-based classification model is used to assign applicants to a fast, regular or intensive application track. The goal of this study is to determine "to what extent labeling of visa applications influences the outcomes of visa decisions taken by decision officers?".
21
+
22
+
A one-sided F-test (ANOVA, fixed effects, omnibus) was applied to a field experiment in which 42 fictional cases were presented to staff. At a 5% significance level, no evidence is found that the label influenced the decision taken.
23
+
24
+
This form of hypothesis testing is preferred over more advanced forms of causal inference, such as inverse probability weighting (IPW) and instrumental variable (IV) analysis. See also this [article](/nl/knowledge-platform/knowledge-base/20240429_buza_sr/).