This example runs a Tartan–Federer membership inference attack using trained TabDDPM models. The pipeline optionally performs a data processing step to prepare population datasets for training and validating the attack, and then executes the attack using the trained classifier.
## Data Processing
#TODO: Train 30 target models with real and synthetic data in the same way as the MIDST competition. Upload them to a Google Drive and add the link here. Currently, we only have 6.
Download the folder from `https://drive.google.com/uc?export=download&id=12gzxNzFzKCF13IzJjZdk3Ba5XTaIrLjO` and store it under `data_paths.midst_data_path`. The data processing step then constructs the population datasets used to train the attacks, resembling the real data available to the attacker, from the training data corresponding to each available target model.
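
For convenience, the download can be scripted. Below is a minimal sketch using the `gdown` package; the package choice and the output path are assumptions, and the output path should match whatever is configured as `data_paths.midst_data_path`.

```python
import gdown

# Drive id taken from the link above; the output path is a placeholder for
# whatever is configured as `data_paths.midst_data_path`.
gdown.download_folder(id="12gzxNzFzKCF13IzJjZdk3Ba5XTaIrLjO", output="data/midst_data")
```

If the link resolves to a single archive rather than a Drive folder, `gdown.download(url, output)` with the full `uc?export=download` URL is the equivalent call.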
For each selected folder, both `train_with_id.csv` and `challenge_with_id.csv` are loaded. All training datasets are merged into a single dataframe, and all challenge datasets are merged into another. Any training samples that also appear in the merged challenge dataset are removed, and duplicate samples are dropped based on the configured identifier columns.
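
A minimal sketch of this merge-and-deduplicate step in pandas; the folder layout, the model indices, and the `trans_id` identifier column are assumptions for illustration (the actual identifier columns come from the configuration).

```python
from pathlib import Path

import pandas as pd

# Selected target-model folders; the layout and indices are assumptions.
base = Path("data/midst_data")
folders = [base / f"tabddpm_{i}" for i in range(1, 7)]

# Merge all training sets into one dataframe and all challenge sets into another.
train_df = pd.concat(
    [pd.read_csv(folder / "train_with_id.csv") for folder in folders], ignore_index=True
)
challenge_df = pd.concat(
    [pd.read_csv(folder / "challenge_with_id.csv") for folder in folders], ignore_index=True
)

# Hypothetical identifier column; in practice this is read from the configuration.
id_column = "trans_id"

# Remove training samples that also appear in the challenge set, then deduplicate.
train_df = train_df[~train_df[id_column].isin(challenge_df[id_column])]
train_df = train_df.drop_duplicates(subset=[id_column]).reset_index(drop=True)
```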
The model indices used to build the population datasets for training and validation are specified in the configuration file.
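
For illustration, here is a sketch of reading those indices from a YAML configuration; the file name and key names below are assumptions, not the example's actual schema.

```python
import yaml

# File name and key names are assumptions; consult the example's actual
# configuration for the real schema.
with open("attack_config.yaml") as f:
    config = yaml.safe_load(f)

# Hypothetical keys listing which target-model indices feed the training and
# validation population datasets.
train_indices = config["population"]["train_model_indices"]      # e.g. [1, 2, 3, 4]
val_indices = config["population"]["validation_model_indices"]   # e.g. [5, 6]
```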