'The CART (Classification and Regression Trees) method generates synthetic data by learning patterns from real data through a decision tree that splits data into homogeneous groups based on feature values. It predicts averages for numerical data and assigns the most common category for categorical data, using these predictions to create new synthetic points.\n \n {{samples}} synthetic data points are generated.',
- evaluationOfGeneratedDataTitle: '4. Evaluation of generated synthetic data',
+ evaluationOfGeneratedDataTitle:
+   '4. Evaluation of generated synthetic data',
  distributionsTitle: '4.1 Distributions',
  diagnosticsReportTitle: '4.2. Diagnostic report',
  diagnosticsTitle: 'Diagnostic Results',
@@ -303,6 +304,8 @@ A two-sided t-test is performed to accept or reject <i class="font-serif">H</i><
  In this example, we analyze which group is most adversely affected by the risk prediction algorithm. We do this by applying the clustering algorithm on the dataset previewed below. The column "is_recid" indicates whether a defendant reoffended or not (1: yes, 0: no). The "score_text" column indicates whether a defendant was predicted to reoffend (1: yes, 0: no). The column "false_positive" (FP) represents cases where a defendant was predicted to reoffend by the algorithm, but didn't do so (1: FP, 0: no FP). A preview of the data can be found below. The column "false_positive" is used as the bias variable.
  `,
  },
+ higherIsBetter: 'Higher value of bias variable is better',
+ lowerIsBetter: 'Lower value of bias variable is better',
  parameters: {
  heading: '2. Hyperparameters selected for clustering',
  iterations: 'Number of iterations: {{value}}',
@@ -313,6 +316,7 @@ In this example, we analyze which group is most adversely affected by the risk p
  - Minimal cluster size: {{minClusterSize}}
  - Bias variable: {{performanceMetric}}
  - Data type: {{dataType}}
+ - Bias variable interpretation: $t({{higherIsBetter}}) is better
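
The demo text in this diff defines "false_positive" in terms of "is_recid" and "score_text". As a minimal sketch of that definition (not code from this repository), the bias variable could be derived as below. The `Defendant` type and both function names are hypothetical; only the three column names come from the locale string. For this bias variable a lower value is better, which is the case the new `lowerIsBetter` string describes.

```typescript
// Hypothetical row shape: only the column names come from the demo text.
interface Defendant {
  is_recid: 0 | 1;   // 1: the defendant reoffended
  score_text: 0 | 1; // 1: the algorithm predicted a reoffence
}

// A false positive is a predicted reoffence that did not happen.
function falsePositive(row: Defendant): 0 | 1 {
  return row.score_text === 1 && row.is_recid === 0 ? 1 : 0;
}

// Mean of the bias variable over a group of rows (hypothetical helper).
// Returns 0 for an empty group to avoid dividing by zero.
function falsePositiveRate(rows: Defendant[]): number {
  if (rows.length === 0) return 0;
  let total = 0;
  for (const row of rows) total += falsePositive(row);
  return total / rows.length;
}
```

Comparing `falsePositiveRate` between a cluster and the rest of the dataset gives the kind of difference the clustering step is meant to surface.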