Update README

jfparie · jfparie · commit f3e540b21fbd · 2024-12-20T13:53:46.000+01:00
diff --git a/README.md b/README.md
@@ -1,6 +1,8 @@
-# Synthpop
+![image](https://github.com/NGO-Algorithm-Audit/python-synthpop/blob/main/images/Header.png)
 
-Python implementation of the R package synthpop.
+# python-synthpop
+
+Python implementation of the R package [synthpop](https://cran.r-project.org/web/packages/synthpop/index.html).
 
 The R implementation of synthpop is a tool for producing synthetic versions of microdata containing confidential information so that they are safe to be released to users for exploratory analysis. The key objective of generating synthetic data is to replace sensitive original values with synthetic ones causing minimal distortion of the statistical information contained in the dataset. Variables, which can be categorical or continuous, are synthesised one-by-one using sequential modelling. Replacements are generated by drawing from conditional distributions fitted to the original data using parametric or classification and regression trees models.
 
@@ -11,24 +13,24 @@ This project is in Alpha status and the roadmap can be found here.
 
 # Installation
 
-Pip
+#### Pip
 
 ```
-pip install py-synthpop
+pip install python-synthpop
 ```
 
-Source
+#### Source
 
 ```
-git clone <url>
-cd synthpop
+git clone https://github.com/NGO-Algorithm-Audit/python-synthpop.git
+cd python-synthpop
 pip install -r requirements.txt
 python setup.py install
 ```
 
 # Examples
 
-Adult dataset
+#### Adult dataset
 We will use the US adult census dataset, which is a freely available open dataset extracted from the US census bureau database. The dataset is initially designed for a binary classification problem and the task is to predict whether a person earns over $50,000 a year. The dataset is a mixture of discrete and continuous features, including age, working status (workclass), education, marital status, race, sex, relationship and hours worked per week.
 
 ```
diff --git a/images/Header.png b/images/Header.png
diff --git a/requirements.txt b/requirements.txt
@@ -1,4 +1,4 @@
 numpy>=1.20.0
 pandas>=1.3.0
 scikit-learn>=1.0.0
-pytest>=7.0.0
+pytest>=7.0.0