src/locales/en.ts
6 additions & 6 deletions
@@ -66,9 +66,9 @@ export const en = {
  demo: {
  heading: 'Information about demo dataset',
  description:
- 'A subset of the [Law School Admission Bar](https://www.kaggle.com/datasets/danofer/law-school-admissions-bar-passage)* dataset is used as a demo. Synthetic data will be generated for the following variables:\n \n \n',
+ 'A subset of the [Law School Admission Bar](https://www.kaggle.com/datasets/danofer/law-school-admissions-bar-passage)* dataset is used as a demo. Synthetic data will be generated for the following variables:',
  'post.description':
- 'The CART method is used to generate the synthetic data. CART generally produces high quality synthetic data, but might not work well on datasets with categorical variables with 20+ categories. Use Gaussian Copula in those cases.\n \n \n\n*The original paper can be found [here](https://files.eric.ed.gov/fulltext/ED469370.pdf)\n \n \n',
+ 'The CART method is used to generate the synthetic data. CART generally produces high quality synthetic data, but might not work well on datasets with categorical variables with 20+ categories. Use Gaussian Copula in those cases.\n \n*The original paper can be found [here](https://files.eric.ed.gov/fulltext/ED469370.pdf)',
  'data.column.Variable_name': 'Variable name',
  'data.sex': 'sex',
  'data.race1': 'race1',
@@ -158,24 +158,24 @@ export const en = {
  bivariateText:
  'The figures below display the differences in value frequency for a combination of variables. For comparing two categorical variables, bar charts are plotted. For comparing a numerical and a categorical variables, a so called [violin plot](https://en.wikipedia.org/wiki/Violin_plot) is shown. For comparing two numercial variables, a [LOESS plot](https://en.wikipedia.org/wiki/Local_regression) is created. For all plots holds: the synthetic data is of high quality when the shape of the distributions in the synthetic data equal the distributions in the real data.',
  moreInfo:
- ' \n \n \n \nDo you want to learn more about synthetic data?\n \n \n \n- [python-synthpop on Github](https://github.com/NGO-Algorithm-Audit/python-synthpop)\n- [local-first web app on Github](https://github.com/NGO-Algorithm-Audit/local-first-web-tool/tree/main)\n- [Synthetic Data: what, why and how?](https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies/Synthetic_Data_Survey-24.pdf)\n- [Knowledge Network Synthetic Data](https://online.rijksinnovatiecommunity.nl/groups/399-kennisnetwerk-synthetischedata/welcome) (for Dutch public organizations)\n- [Synthetic data portal of Dutch Executive Agency for Education](https://duo.nl/open_onderwijsdata/footer/synthetische-data.jsp) (DUO)\n- [CART: synthpop resources](https://synthpop.org.uk/resources.html)\n- [Gaussian Copula - Synthetic Data Vault](https://docs.sdv.dev/sdv)',
+ 'Do you want to learn more about synthetic data?\n \n \n \n- [python-synthpop on Github](https://github.com/NGO-Algorithm-Audit/python-synthpop)\n- [local-first web app on Github](https://github.com/NGO-Algorithm-Audit/local-first-web-tool/tree/main)\n- [Synthetic Data: what, why and how?](https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies/Synthetic_Data_Survey-24.pdf)\n- [Knowledge Network Synthetic Data](https://online.rijksinnovatiecommunity.nl/groups/399-kennisnetwerk-synthetischedata/welcome) (for Dutch public organizations)\n- [Synthetic data portal of Dutch Executive Agency for Education](https://duo.nl/open_onderwijsdata/footer/synthetische-data.jsp) (DUO)\n- [CART: synthpop resources](https://synthpop.org.uk/resources.html)\n- [Gaussian Copula - Synthetic Data Vault](https://docs.sdv.dev/sdv)',
  missingData: `For Missing At Random (MAR) and Missing Not At Random (MNAR) data,
  we recommend to impute the missing data. For Missing Completely At Random (MCAR), we recommend to remove the missing data. See the info box for more information. {tooltip:syntheticData.missingDataTooltip}More info about MCAR, MAR, and MNAR{/tooltip}`,
  missingDataTooltip: `MCAR, MAR, and MNAR are terms used to describe different mechanisms of missing data:

- 1. **MCAR (Missing Completely At Random)**:
+ **1. MCAR (Missing Completely At Random)**:
  - The probability of data being missing is completely independent of both observed and unobserved data.
  - There is no systematic pattern to the missingness.
  - Example: A survey respondent accidentally skips a question due to a printing error.
  - Recommendation: remove missing data.

- 2. **MAR (Missing At Random)**:
+ **2. MAR (Missing At Random)**:
  - The probability of data being missing is related to the observed data but not the missing data itself.
  - The missingness can be predicted by other variables in the dataset.
  - Example: Students' test scores are missing, but the missingness is related to their attendance records.
  - Recommendation: impute missing data.

- 3. **MNAR (Missing Not At Random)**:
+ **3. MNAR (Missing Not At Random)**:
  - The probability of data being missing is related to the missing data itself.
  - There is a systematic pattern to the missingness that is related to the unobserved data.
  - Example: Patients with more severe symptoms are less likely to report their symptoms, leading to missing data that is related to the severity of the symptoms.
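
The `missingData` and `missingDataTooltip` strings recommend removing missing data for MCAR and imputing it for MAR and MNAR. The sketch below is one possible illustration of that rule. None of it comes from `src/locales/en.ts` or the web tool: the names `recommendedAction`, `removeMissing`, and `imputeMean`, the `Row` shape, and the choice of mean imputation are all assumptions made for the example.

```typescript
// Hypothetical helpers for illustration only; not part of the web tool.
type Mechanism = 'MCAR' | 'MAR' | 'MNAR';
type Row = Record<string, number | null>;

// Recommendation per mechanism, as stated in the locale strings:
// MCAR -> remove missing data; MAR and MNAR -> impute missing data.
function recommendedAction(mechanism: Mechanism): 'remove' | 'impute' {
    return mechanism === 'MCAR' ? 'remove' : 'impute';
}

// Drop every row that has at least one missing (null) value.
function removeMissing(rows: Row[]): Row[] {
    return rows.filter(row => Object.values(row).every(v => v !== null));
}

// Fill missing values in one column with the mean of the observed values.
// Mean imputation is an assumed, deliberately simple strategy.
function imputeMean(rows: Row[], column: string): Row[] {
    const observed = rows
        .map(row => row[column])
        .filter((v): v is number => v !== null && v !== undefined);
    if (observed.length === 0) return rows; // no observed values to average
    const mean = observed.reduce((sum, v) => sum + v, 0) / observed.length;
    return rows.map(row =>
        row[column] === null ? { ...row, [column]: mean } : row
    );
}

// Made-up rows echoing the MAR example (test scores and attendance).
const rows: Row[] = [
    { score: 40, attendance: 0.9 },
    { score: null, attendance: 0.5 },
    { score: 30, attendance: 0.8 },
];

const mechanism: Mechanism = 'MAR';
const cleaned =
    recommendedAction(mechanism) === 'remove'
        ? removeMissing(rows)
        : imputeMean(rows, 'score');
```

With `mechanism` set to `'MAR'`, the missing `score` is filled with the mean of the two observed scores; with `'MCAR'`, the second row would be dropped instead.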