Commit 5397346

committed
readme update
1 parent 9efedd7 commit 5397346

1 file changed

+3
-4
lines changed

README.md

Lines changed: 3 additions & 4 deletions
@@ -4,12 +4,11 @@
44

55
Python implementation of the R package [synthpop](https://cran.r-project.org/web/packages/synthpop/index.html).
66

7-
This library produces synthetic versions of tabular data containing confidential information so that they are safe to be released to users for exploratory analysis. The key objective of generating synthetic data is to replace sensitive original values with synthetic ones causing minimal distortion of the statistical information contained in the dataset. Variables, which can be categorical or continuous, are synthesised one-by-one using sequential modelling. Replacements are generated by drawing from conditional distributions fitted to the original data using parametric (e.g., Gaussian copula) or classification and regression trees (CART) models.
7+
This library produces synthetic tabular data. Synthetic data is artificially generated data that mimics real-world data in structure and statistical properties but does not directly originate from actual events or individuals. The library handles numerical and categorical variables using sequential modelling. Synthetic values are generated by drawing from conditional distributions fitted to the original data using parametric models (e.g., Gaussian copula) or classification and regression trees (CART).
88

9-
This is a reimplementation in Python which allows synthetic data to be generated via the method .generate() after the algorithm had been fit to the original data via the method .fit(). The process can be largely automated, if default settings are used, or with methods defined by the user. Optional parameters can be used to influence the disclosure risk and the analytical quality of the synthetic data.
9+
This Python library is a reimplementation of the R package `synthpop`. Synthetic data can be generated using the `.generate()` method after fitting a synthesizer to the original data with the `.fit()` method. The process can be largely automated using default settings or customized through user-defined settings. Optional parameters can be used to influence the disclosure risk and the analytical quality of the synthetic data.
1010

11-
Development status and roadmap
12-
This project is in Alpha status and the roadmap can be found here.
11+
☁️ [Web app](https://local-first-bias-detection.s3.eu-central-1.amazonaws.com/synthetic-data.html) – a demo of synthetic data generation using `python-synthpop`
1312

1413
# Installation
1514
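
The fit-then-generate workflow described in the README text above can be illustrated with a minimal, self-contained sketch. This is not the `python-synthpop` API: the class `ToySequentialSynthesizer` and its signatures are hypothetical, and plain frequency tables stand in for the CART and Gaussian copula models the README mentions. It only handles categorical columns, and it shows the sequential idea: each column is drawn from a conditional distribution given the columns synthesised before it.

```python
import random
from collections import Counter, defaultdict


class ToySequentialSynthesizer:
    """Hypothetical stand-in for illustration; not the python-synthpop API."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._columns = []
        self._tables = []

    def fit(self, rows, columns):
        """Count, for each column, its values given all earlier columns."""
        self._columns = list(columns)
        self._tables = []
        for i, col in enumerate(self._columns):
            table = defaultdict(Counter)
            for row in rows:
                context = tuple(row[c] for c in self._columns[:i])
                table[context][row[col]] += 1
            self._tables.append(table)
        return self

    def generate(self, n):
        """Draw n rows, one column at a time, in the fitted column order."""
        synthetic = []
        for _ in range(n):
            values = []
            for table in self._tables:
                # Conditional distribution given the values drawn so far.
                counts = table[tuple(values)]
                choices, weights = zip(*counts.items())
                values.append(self._rng.choices(choices, weights=weights)[0])
            synthetic.append(dict(zip(self._columns, values)))
        return synthetic


# Made-up example data, used only to exercise the sketch.
original = [
    {"sex": "F", "smoker": "no"},
    {"sex": "F", "smoker": "no"},
    {"sex": "M", "smoker": "yes"},
    {"sex": "M", "smoker": "no"},
]
synth = ToySequentialSynthesizer(seed=0).fit(original, ["sex", "smoker"])
sample = synth.generate(5)
```

Because this toy version conditions on exact matches of all earlier columns, it can only reproduce value combinations that occur in the original data. A fitted model such as a regression tree generalises across contexts instead, which is one reason the real library relies on models rather than raw counts.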
