Open
Labels
- pending review: This issue needs to be further reviewed, so work cannot be started
- question: General question about the software
Description
Environment details
- CTGAN version: 0.7.1 (latest)
- Python version: 3.10.11
- Operating System: Mac/Unix
Problem description
I want to generate data conditionally, but I don't want the generator to produce the conditioning column as part of its output.
What I already tried
Currently, I just drop this column from the output after sampling.
Intuitively, this is wasteful: the network is larger (and therefore slower), and the saved model is bigger.
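For concreteness, here is a minimal sketch of the trimming workaround described above. It assumes the model exposes CTGAN's `sample(n, condition_column=..., condition_value=...)` method. The helper name `sample_for_value` and the `StubModel` class are hypothetical; the stub only stands in for a fitted model so the snippet runs without training anything.

```python
import pandas as pd


def sample_for_value(model, n, column, value):
    """Sample n rows conditioned on `column == value`, then drop that column.

    `model` is assumed to behave like a fitted CTGAN instance, i.e. to accept
    sample(n, condition_column=..., condition_value=...) and return a DataFrame.
    """
    rows = model.sample(n, condition_column=column, condition_value=value)
    # The network still generates the conditioned column; it is only hidden here.
    return rows.drop(columns=[column])


class StubModel:
    """Hypothetical stand-in for a fitted model (not part of CTGAN)."""

    def sample(self, n, condition_column=None, condition_value=None):
        # Returns fixed placeholder rows so the helper can be exercised.
        return pd.DataFrame({"hospital": [condition_value] * n, "age": [40] * n})


trimmed = sample_for_value(StubModel(), 3, "hospital", "Hospital A")
print(list(trimmed.columns))
```

With a real model, `StubModel()` would be replaced by a `CTGAN` instance fitted with `discrete_columns=["hospital"]`. One caveat, as far as I understand the sampling API: the condition raises the probability of the requested category rather than guaranteeing it, so dropping the column also removes the ability to filter out rows that did not match.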
Example:
Consider data with two columns: hospital name and patient's age.
Let's assume that there are 100 different hospitals, and my sole use of the generative model is to generate new rows for a given hospital.
Currently, the model will work with 101 features: 100 one-hot features (for hospital names) and one continuous feature (for age).
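The arithmetic behind this example can be sketched as below. The function `encoded_width` is hypothetical and follows the simplified accounting used in this post (one feature per category, one per continuous column); the actual encoded width inside CTGAN may be larger, since continuous columns are normalized per mode.

```python
def encoded_width(discrete_cardinalities, n_continuous):
    """Row width under the simplified accounting: one-hot sizes plus continuous columns."""
    return sum(discrete_cardinalities) + n_continuous


# Hospital (100 categories) generated alongside age.
with_hospital = encoded_width([100], 1)
# Hospital used only as a condition, so only age is generated.
without_hospital = encoded_width([], 1)

print(with_hospital, without_hospital)  # 101 1
```

Under this accounting, almost the entire output width is spent on a column that is discarded afterwards.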