Add a transformer for min/max normalization #863

@npatki

Problem Description

As indicated in this issue, some users have found that applying min/max scaling significantly improved synthetic data quality.

However, the RDT library does not currently offer min/max scaling. It only offers the GaussianNormalizer (which uses the z-score) and the ClusterBasedNormalizer (which uses Bayesian GMMs).

Expected behavior

Min/max scaling will need to learn the min and max values during the fit stage. When transforming, it will map the entire distribution into the range [0, 1] using the formula: (value - min) / (max - min). Finally, the reverse transform will expand values back into the original [min, max] range, clipping any out-of-bounds values to that range.
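
The fit, transform, and reverse transform steps described above could be sketched roughly as follows. This is only an illustration in plain NumPy: the class name `MinMaxSketch` and its method names are hypothetical and are not part of the RDT API, which is still to be decided. The handling of a constant column (where max equals min) is also an assumption that the issue does not specify.

```python
import numpy as np


class MinMaxSketch:
    """Hypothetical min/max scaler for illustration; not an RDT transformer."""

    def fit(self, values):
        # Learn the bounds from the training data, ignoring NaNs.
        values = np.asarray(values, dtype=float)
        self.min_ = np.nanmin(values)
        self.max_ = np.nanmax(values)
        return self

    def transform(self, values):
        # Apply (value - min) / (max - min) to map the data into [0, 1].
        values = np.asarray(values, dtype=float)
        span = self.max_ - self.min_
        if span == 0:
            # Assumption: a constant column maps to 0 to avoid dividing by zero.
            return np.zeros_like(values)
        return (values - self.min_) / span

    def reverse_transform(self, scaled):
        # Expand back to the original scale, then clip to the learned bounds.
        scaled = np.asarray(scaled, dtype=float)
        restored = scaled * (self.max_ - self.min_) + self.min_
        return np.clip(restored, self.min_, self.max_)


scaler = MinMaxSketch().fit([10, 20, 30])
scaled = scaler.transform([10, 20, 30])
restored = scaler.reverse_transform([-0.5, 0.5, 1.5])
```

In this sketch, values outside [0, 1] passed to `reverse_transform` (as synthetic data might contain) come back clipped to the learned min and max rather than extrapolated beyond them.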

Additional context

This is a tracking issue. The exact API (including the transformer name, parameters, etc.) still needs to be figured out.
