'The CART (Classification and Regression Trees) method generates synthetic data by learning patterns from real data through a decision tree that splits data into homogeneous groups based on feature values. It predicts averages for numerical data and assigns the most common category for categorical data, using these predictions to create new synthetic points.\n \n {{samples}} synthetic data points are generated.',
- evaluationOfGeneratedDataTitle: '4. Evaluation of generated synthetic data',
+ evaluationOfGeneratedDataTitle:
+   '4. Evaluation of generated synthetic data',
  distributionsTitle: '4.1 Distributions',
  diagnosticsReportTitle: '4.2. Diagnostic report',
  diagnosticsTitle: 'Diagnostic Results',
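The CART description above says a leaf predicts the average for numerical data and the most common category for categorical data. A minimal sketch of just that leaf-prediction rule follows. The helper names `leafMean` and `leafMode` are hypothetical, and tree fitting and sampling of synthetic points are not shown, since the diff does not include the tool's implementation.

```typescript
// Hypothetical helpers: only the prediction rule (mean / most common category)
// comes from the description; everything else is an assumption.

// Numerical leaf: predict the average of the values that fell into the leaf.
function leafMean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Categorical leaf: predict the most common category in the leaf.
// On a tie, the category that appears first in the input wins.
function leafMode(values: string[]): string {
  const counts = new Map<string, number>();
  for (const v of values) {
    counts.set(v, (counts.get(v) ?? 0) + 1);
  }
  let best = values[0];
  let bestCount = 0;
  counts.forEach((count, category) => {
    if (count > bestCount) {
      best = category;
      bestCount = count;
    }
  });
  return best;
}
```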
@@ -274,11 +275,11 @@ missing data are imputed. For {tooltip:syntheticData.missingDataMCARTooltip}Miss
  - <i class="font-serif">H</i><sub>0</sub>: no difference in bias variable between the most deviating cluster and the rest of the dataset
  - <i class="font-serif">H</i><sub>1</sub>: difference in bias variable between the most deviating cluster and the rest of the dataset

- A two-sided t-test is performed to accept or reject <i class="font-serif">H</i><sub>0</sub>:.
+ A two-sided t-test is performed to accept or reject <i class="font-serif">H</i><sub>0</sub>:
p_valueTooltip: `The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis (H<sub>0</sub>) is true. A commonly used threshold is p≤0.05: at or below it, the observed result is deemed sufficiently unlikely under H<sub>0</sub> to reject H<sub>0</sub> in favor of the alternative hypothesis (H<sub>1</sub>).`,
  dataSetPreview: {
    heading: '1. Preview of data',
  },
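The strings in this hunk describe a two-sided t-test comparing the bias variable in the most deviating cluster with the rest of the dataset. A minimal sketch of the test statistic follows, assuming Welch's unequal-variance form (the strings only say "two-sided t-test", so the exact variant is an assumption). The function names are hypothetical.

```typescript
// Hypothetical helper names; the diff does not show the tool's implementation.
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(xs: number[]): number {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t statistic and Welch-Satterthwaite degrees of freedom for
// "most deviating cluster" versus "rest of the dataset".
function welchT(cluster: number[], rest: number[]): { t: number; df: number } {
  const v1 = sampleVariance(cluster) / cluster.length;
  const v2 = sampleVariance(rest) / rest.length;
  const t = (mean(cluster) - mean(rest)) / Math.sqrt(v1 + v2);
  const df =
    (v1 + v2) ** 2 /
    (v1 ** 2 / (cluster.length - 1) + v2 ** 2 / (rest.length - 1));
  return { t, df };
}
```

The two-sided p-value is then the probability that a t-distributed variable with `df` degrees of freedom exceeds `|t|` in absolute value. Computing it needs a Student t CDF from a statistics library, which is left out here.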
@@ -303,6 +304,8 @@ A two-sided t-test is performed to accept or reject <i class="font-serif">H</i><
  In this example, we analyze which group is most adversely affected by the risk prediction algorithm. We do this by applying the clustering algorithm on the dataset previewed below. The column "is_recid" indicates whether a defendant reoffended or not (1: yes, 0: no). The "score_text" column indicates whether a defendant was predicted to reoffend (1: yes, 0: no). The column "false_positive" (FP) represents cases where a defendant was predicted to reoffend by the algorithm, but didn't do so (1: FP, 0: no FP). A preview of the data can be found below. The column "false_positive" is used as the bias variable.
  `,
  },
+ higherIsBetter: 'Higher value of bias variable is better',
+ lowerIsBetter: 'Lower value of bias variable is better',
  parameters: {
    heading: '2. Hyperparameters selected for clustering',
    iterations: 'Number of iterations: {{value}}',
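The example text above defines "false_positive" in terms of the other two columns. A minimal sketch of that derivation follows. The `Defendant` interface and the `falsePositive` function are hypothetical; only the column names and their 0/1 encoding come from the text.

```typescript
// Hypothetical record shape; only the column names are taken from the text.
interface Defendant {
  is_recid: 0 | 1; // 1: reoffended, 0: did not reoffend
  score_text: 0 | 1; // 1: predicted to reoffend, 0: not predicted to reoffend
}

// A false positive is a predicted reoffence that did not happen.
function falsePositive(d: Defendant): 0 | 1 {
  return d.score_text === 1 && d.is_recid === 0 ? 1 : 0;
}
```

With a bias variable of this kind, the new `lowerIsBetter` label would presumably be the one that applies, since fewer false positives are preferable.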
@@ -313,6 +316,7 @@ In this example, we analyze which group is most adversely affected by the risk p
p_valueTooltip: `De p-waarde is de kans om de nulhypothese (H<sub>0</sub>) onterecht te verwerpen wanneer deze in werkelijkheid waar is. Een veelgebruikte drempelwaarde is p≤0,05, wat wordt beschouwd als een voldoende lage kans om H<sub>0</sub> te verwerpen en de alternatieve hypothese (H<sub>1</sub>) te accepteren.`,
+ higherIsBetter: 'Hogere waarde van bias variabele is beter',
+ lowerIsBetter: 'Lagere waarde van bias variabele is beter',
  parameters: {
    heading: '2. Geselecteerde hyperparameters',
    iterations: 'Aantal iteraties: {{value}}',
@@ -316,6 +318,7 @@ Er wordt een tweezijdige t-toets uitgevoerd om <i class="font-serif">H</i><sub>0
  - Minimale clustergrootte: {{minClusterSize}}
  - Bias variabele: {{performanceMetric}}
  - Gegevenstype: {{dataType}}
+ - Interpretatie van bias variabele: $t({{higherIsBetter}}) is better
  `,
  },
  distribution: {
@@ -365,7 +368,8 @@ Er wordt een tweezijdige t-toets uitgevoerd om <i class="font-serif">H</i><sub>0
  - Aantal gevonden clusters: {{clusterCount}}
  `,
  label: 'Kies cluster om het aantal datapunten voor weer te geven',
- valueText: 'Aantal datapunten in cluster {{index}}: {{value}}',
+ valueText:
+   'Aantal datapunten in cluster {{index}}: {{value}} / {{totalRecords}}',
  },
  higherAverage: `Het meest afwijkende cluster heeft statistisch significant andere bias variabele dan de rest van de dataset.`,
  noSignificance: `Het meest afwijkende cluster heeft statistisch significant geen andere bias variabele dan de rest van de dataset.`,
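The changed `valueText` string adds a `{{totalRecords}}` placeholder next to `{{index}}` and `{{value}}`. A minimal sketch of how such placeholders could be filled in follows. The `interpolate` function is a hypothetical stand-in, not the i18n library's actual implementation, and the numbers passed in are made up for illustration.

```typescript
// Hypothetical stand-in for placeholder interpolation. Unknown placeholders
// are left untouched so that a missing value is visible in the output.
function interpolate(
  template: string,
  values: Record<string, string | number>
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    key in values ? String(values[key]) : match
  );
}

const valueText =
  'Aantal datapunten in cluster {{index}}: {{value}} / {{totalRecords}}';

// Illustrative values only.
const rendered = interpolate(valueText, {
  index: 1,
  value: 120,
  totalRecords: 1000,
});
```

A caller that still passes only `index` and `value` would render the literal text `{{totalRecords}}` under this sketch, so every call site of `valueText` needs the new parameter.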