Commit 7770f9a

bias variable instead of outcome label
1 parent f029458 commit 7770f9a

1 file changed

+18
-18
lines changed

src/locales/en.ts

Lines changed: 18 additions & 18 deletions
@@ -34,14 +34,14 @@ export const en = {
3434
dataSet: 'Dataset',
3535
dataSetTooltip: `Preprocess your data such that:
3636
- missing values are removed or replaced;
37-
- all columns (except your outcome label column) should have the same datatypes, e.g., numerical or categorical;
38-
- the outcome label column is numerical`,
39-
performanceMetric: 'Outcome label',
37+
- all columns (except your bias variable column) should have the same datatypes, e.g., numerical or categorical;
38+
- the bias variable column is numerical`,
39+
performanceMetric: 'Bias variable',
4040
performanceMetricTooltip:
41-
'Clustering will be performed on the outcome labels. The outcome label should be numerical. Examples of outcome labels are "being classified as high risk" or "selected for an investigation"',
41+
'Clustering will be performed on the bias variable. The bias variable should be numerical. Examples of bias variables are "being classified as high risk" or "selected for an investigation"',
4242
dataType: 'Type of data',
4343
dataTypeTooltip:
44-
'Specify whether the data are categorical or numerical. All columns (except your outcome label column) should have the same data type',
44+
'Specify whether the data are categorical or numerical. All columns (except your bias variable column) should have the same data type',
4545
categoricalData: 'Categorical data',
4646
numericalData: 'Numerical data',
4747
filterSelect:
@@ -52,11 +52,11 @@ export const en = {
5252
iterations: 'Iterations',
5353
minClusterSize: 'Minimal cluster size',
5454
performanceInterpretation: {
55-
title: 'Outcome label interpretation',
56-
lower: 'Lower value of outcome label is better, such as error rate',
57-
higher: 'Higher value of outcome label is better, such as accuracy',
55+
title: 'Bias variable interpretation',
56+
lower: 'Lower value of bias variable is better, such as error rate',
57+
higher: 'Higher value of bias variable is better, such as accuracy',
5858
tooltip:
59-
'When error rate or misclassifications are chosen as the outcome label, a lower value is preferred, as the goal is to minimize errors. Conversely, when accuracy or precision is selected as the outcome label, a higher value is preferred, reflecting the aim to maximize performance. Selected for an investigation or a false positive is consiered as disadvantageous, so for this outcome label a lower value is preferred',
59+
'When error rate or misclassifications are chosen as the bias variable, a lower value is preferred, as the goal is to minimize errors. Conversely, when accuracy or precision is selected as the bias variable, a higher value is preferred, reflecting the aim to maximize performance. Selected for an investigation or a false positive is consiered as disadvantageous, so for this bias variable a lower value is preferred',
6060
},
6161
iterationsTooltip:
6262
'Number of times the dataset is split in smaller clusters. Can terminate early if the minimum cluster size is reached',
@@ -66,7 +66,7 @@ export const en = {
6666
},
6767
errors: {
6868
csvRequired: 'Please upload a csv file.',
69-
targetColumnRequired: 'Please select a outcome label.',
69+
targetColumnRequired: 'Please select a bias variable.',
7070
dataTypeRequired: 'Please select a data type.',
7171
noNumericColumns:
7272
'No numeric columns found. Please upload a valid dataset.',
@@ -269,10 +269,10 @@ missing data are imputed. For {tooltip:syntheticData.missingDataMCARTooltip}Miss
269269
},
270270

271271
biasAnalysis: {
272-
testingStatisticalSignificance: `**5. Testing cluster differences wrt. outcome labels**
272+
testingStatisticalSignificance: `**5. Testing cluster differences wrt. bias variable**
273273
274-
- <i class="font-serif">H</i><sub>0</sub>: no difference in outcome labels between the most deviating cluster and the rest of the dataset
275-
- <i class="font-serif">H</i><sub>1</sub>: difference in outcome labels between the most deviating cluster and the rest of the dataset
274+
- <i class="font-serif">H</i><sub>0</sub>: no difference in bias variable between the most deviating cluster and the rest of the dataset
275+
- <i class="font-serif">H</i><sub>1</sub>: difference in bias variable between the most deviating cluster and the rest of the dataset
276276
277277
A two-sided t-test is performed to accept or reject <i class="font-serif">H</i><sub>0</sub>:.
278278
@@ -300,18 +300,18 @@ A two-sided t-test is performed to accept or reject <i class="font-serif">H</i><
300300
301301
<br>
302302
303-
In this example, we analyze which group is most adversely affected by the risk prediction algorithm. We do this by applying the clustering algorithm on the dataset previewed below. The column "is_recid" indicates whether a defendant reoffended or not (1: yes, 0: no). The "score_text" column indicates whether a defendant was predicted to reoffend (1: yes, 0: no). The column "false_positive" (FP) represents cases where a defendant was predicted to reoffended by the algorithm, but didn't do so (1: FP, 0: no FP). A preview of the data can be found below. The column "false_positive" is used as the outcome label.
303+
In this example, we analyze which group is most adversely affected by the risk prediction algorithm. We do this by applying the clustering algorithm on the dataset previewed below. The column "is_recid" indicates whether a defendant reoffended or not (1: yes, 0: no). The "score_text" column indicates whether a defendant was predicted to reoffend (1: yes, 0: no). The column "false_positive" (FP) represents cases where a defendant was predicted to reoffended by the algorithm, but didn't do so (1: FP, 0: no FP). A preview of the data can be found below. The column "false_positive" is used as the bias variable.
304304
`,
305305
},
306306
parameters: {
307307
heading: '2. Hyperparameters selected for clustering',
308308
iterations: 'Number of iterations: {{value}}',
309309
minClusterSize: 'Minimal cluster size: {{value}}',
310-
performanceMetric: 'Outcome label: {{value}}',
310+
performanceMetric: 'Bias variable: {{value}}',
311311
dataType: 'Data type: {{value}}',
312312
description: `- Number of iterations: {{iterations}}
313313
- Minimal cluster size: {{minClusterSize}}
314-
- Outcome label: {{performanceMetric}}
314+
- Bias variable: {{performanceMetric}}
315315
- Data type: {{dataType}}
316316
`,
317317
},
@@ -368,8 +368,8 @@ In this example, we analyze which group is most adversely affected by the risk p
368368
label: 'Choose cluster to show number of datapoints for',
369369
valueText: 'Number of datapoints in cluster {{index}}: {{value}}',
370370
},
371-
higherAverage: `The most deviating cluster has statistically significant different outcome labels than the rest of the dataset.`,
372-
noSignificance: `No statistically significant difference in outcome labels between the most biased cluster and the rest of the dataset.`,
371+
higherAverage: `The most deviating cluster has statistically significant different bias variable than the rest of the dataset.`,
372+
noSignificance: `No statistically significant difference in bias variable between the most biased cluster and the rest of the dataset.`,
373373

374374
conclusion: `7. Conclusion and bias report`,
375375
conclusionDescription: `From the above figures and statistical tests, it can be concluded that:`,
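
Note on the strings changed above: the example text defines "false_positive" as a case where a defendant was predicted to reoffend but did not, and the `testingStatisticalSignificance` text compares the bias variable of the most deviating cluster against the rest of the dataset with a two-sided t-test. The following is a minimal sketch of those two steps, not the project's implementation. The `Row` interface and the helper names are hypothetical, and the choice of Welch's unequal-variance form of the t statistic is an assumption, since the diff only says "two-sided t-test". Turning the statistic into a p-value would additionally need a Student t distribution from a statistics library.

```typescript
// Hypothetical row shape; column names follow the example text in the diff.
interface Row {
  is_recid: 0 | 1;   // 1: defendant reoffended, 0: did not
  score_text: 0 | 1; // 1: predicted to reoffend, 0: not predicted
}

// Bias variable from the example: 1 when reoffending was predicted
// but did not happen (false positive), otherwise 0.
function falsePositive(row: Row): 0 | 1 {
  return row.score_text === 1 && row.is_recid === 0 ? 1 : 0;
}

function mean(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1); needs at least two values.
function sampleVariance(xs: number[]): number {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t statistic and Welch-Satterthwaite degrees of freedom
// (assumed variant). Returns NaN when both groups have zero variance.
function welchT(cluster: number[], rest: number[]): { t: number; df: number } {
  const a = sampleVariance(cluster) / cluster.length;
  const b = sampleVariance(rest) / rest.length;
  const t = (mean(cluster) - mean(rest)) / Math.sqrt(a + b);
  const df =
    (a + b) ** 2 /
    (a ** 2 / (cluster.length - 1) + b ** 2 / (rest.length - 1));
  return { t, df };
}
```

A large absolute value of `t` relative to the t distribution with `df` degrees of freedom would count as evidence against H0. Whether that is "worse" for the cluster depends on the "Bias variable interpretation" setting (lower is better for a bias variable such as "false_positive").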
