feature request: Mask sensitive columns #409

@stephenpardy

Issue

Users of datacompy sometimes have sensitive columns in their data, such as account IDs or other join keys. The comparison report displays these columns as-is, which can leak this information if the report is not handled carefully. Today, users need to mask the sensitive values themselves, either before running datacompy or before sharing the report.

Solution

Allow users to pass in a list of column names and mask the values in those columns before outputting the comparison report, e.g.:

| ACCOUNT_ID | BALANCE |
| --- | --- |
| 123 | 100.00 |
| 456 | 200.00 |
| 789 | 50.00 |

Becomes:

| ACCOUNT_ID | BALANCE |
| --- | --- |
| ***** | 100.00 |
| ***** | 200.00 |
| ***** | 50.00 |
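
To make the request concrete, here is a minimal sketch of the masking step in plain Python. It is not datacompy code: `mask_columns`, its `mask` parameter, and the list-of-dicts row format are hypothetical names chosen for illustration, and a real implementation would presumably operate on the sample rows datacompy collects for the report.

```python
def mask_columns(rows, columns, mask="*****"):
    """Return new rows with every value in `columns` replaced by `mask`.

    Hypothetical helper for illustration only; not part of datacompy.
    `rows` is a list of dicts mapping column name to value.
    """
    sensitive = set(columns)
    return [
        {name: (mask if name in sensitive else value) for name, value in row.items()}
        for row in rows
    ]


# Sample rows taken from the tables above.
report_rows = [
    {"ACCOUNT_ID": 123, "BALANCE": 100.00},
    {"ACCOUNT_ID": 456, "BALANCE": 200.00},
    {"ACCOUNT_ID": 789, "BALANCE": 50.00},
]

masked_rows = mask_columns(report_rows, ["ACCOUNT_ID"])
```

The helper builds new rows instead of modifying the input, so the comparison itself could still run on the unmasked data and only the rendered report would see the masked copy.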

Alternatives

An alternative to masking is to hash the values with a secure hashing algorithm before performing the comparison. Values that match will hash to the same value, so matching rows can still be identified.
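
The hashing alternative could look roughly like the sketch below, which uses only the Python standard library. The names `hash_value`, `hash_columns`, and the example key are assumptions for illustration, not an existing datacompy API. The sketch uses a keyed hash (HMAC-SHA256) rather than a bare hash, because short identifiers such as account IDs can be recovered from an unkeyed hash by trying every candidate value.

```python
import hashlib
import hmac


def hash_value(value, key):
    """Return the HMAC-SHA256 hex digest of str(value) under `key`."""
    return hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()


def hash_columns(rows, columns, key):
    """Return new rows with the values in `columns` replaced by keyed hashes.

    Hypothetical helper for illustration only; not part of datacompy.
    """
    sensitive = set(columns)
    return [
        {name: (hash_value(value, key) if name in sensitive else value)
         for name, value in row.items()}
        for row in rows
    ]


# Placeholder key for the example; a real key would be kept secret.
key = b"example-secret-key"

left = [{"ACCOUNT_ID": 123, "BALANCE": 100.00}]
right = [{"ACCOUNT_ID": 123, "BALANCE": 100.00}]

# Both sides must be hashed with the same key for equal values to match.
hashed_left = hash_columns(left, ["ACCOUNT_ID"], key)
hashed_right = hash_columns(right, ["ACCOUNT_ID"], key)
```

Because values are converted with `str` before hashing, the two datasets need to represent a key the same way: `123` and `"123"` hash identically here, but `123` and `123.0` do not.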

Labels: enhancement
