Allow the CardinalityShapeSimilarity metric to work with composite keys #839

@npatki

Problem Description

In sdv-dev/SDV#2778, we are allowing composite keys to be specified in the metadata. Although SDV Community won't be able to model composite keys, SDV Enterprise will have this capability. A user might therefore have real and synthetic data that contain composite keys.

In this case, the CardinalityShapeSimilarity metric should be able to work with composite keys.

Expected behavior

This metric currently takes in the metadata and reports the cardinality shape similarity for each relationship in the breakdown, so the end-user API doesn't need to change.

from sdmetrics.multi_table import CardinalityShapeSimilarity

CardinalityShapeSimilarity.compute_breakdown(
    real_data={
      'users': real_user_table,
      'sessions': real_sessions_table,
      'transactions': real_transactions_table
    },
    synthetic_data={
      'users': synthetic_user_table,
      'sessions': synthetic_sessions_table,
      'transactions': synthetic_transactions_table
    },
    metadata=multi_table_metadata_dict
)

Internally, however, we should make sure that the metric is computed from all of the columns involved in a primary or foreign key, not just one of them.
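As a rough illustration of what "all the columns" could mean in practice, here is a minimal pandas sketch. It is not the SDMetrics implementation: the helper names (`get_cardinality`, `ks_complement`), the toy tables, and the column names are all hypothetical, and the final score is assumed to be a KS complement between the real and synthetic cardinality distributions. The only idea it tries to show is that grouping and merging on a list of key columns handles single and composite keys the same way.

```python
import numpy as np
import pandas as pd


def _as_list(key):
    """Treat a single column name and a list of column names uniformly."""
    return [key] if isinstance(key, str) else list(key)


def get_cardinality(parent, child, parent_key, child_key):
    """Count child rows per parent row, matching on every key column.

    Hypothetical helper, not part of SDMetrics.
    """
    parent_cols = _as_list(parent_key)
    child_cols = _as_list(child_key)

    # Number of child rows for each distinct foreign key combination.
    counts = child.groupby(child_cols).size().reset_index(name='cardinality')
    # Rename foreign key columns so they line up with the primary key columns.
    counts = counts.rename(columns=dict(zip(child_cols, parent_cols)))

    # A left merge keeps parents without children; they get cardinality 0.
    merged = parent[parent_cols].merge(counts, on=parent_cols, how='left')
    return merged['cardinality'].fillna(0).astype(int)


def ks_complement(real, synthetic):
    """Return 1 minus the two-sample Kolmogorov-Smirnov statistic."""
    real = np.sort(np.asarray(real))
    synthetic = np.sort(np.asarray(synthetic))
    grid = np.concatenate([real, synthetic])
    real_cdf = np.searchsorted(real, grid, side='right') / len(real)
    synthetic_cdf = np.searchsorted(synthetic, grid, side='right') / len(synthetic)
    return 1 - np.max(np.abs(real_cdf - synthetic_cdf))


# Toy data (made up): users are identified by (region, user_id) together.
users = pd.DataFrame({'region': ['EU', 'EU', 'US'], 'user_id': [1, 2, 1]})
real_sessions = pd.DataFrame({
    'user_region': ['EU', 'EU', 'US'],
    'user_id': [1, 1, 1],
})
synthetic_sessions = pd.DataFrame({
    'user_region': ['EU', 'EU', 'US'],
    'user_id': [1, 2, 1],
})

parent_key = ['region', 'user_id']
child_key = ['user_region', 'user_id']

real_cardinality = get_cardinality(users, real_sessions, parent_key, child_key)
synthetic_cardinality = get_cardinality(users, synthetic_sessions, parent_key, child_key)
score = ks_complement(real_cardinality, synthetic_cardinality)
```

In this toy data, matching on `user_id` alone would credit all three real sessions to both the EU and the US user with `user_id` 1. Matching on both columns assigns each session to exactly one parent row, which is why every column of the key needs to take part in the computation.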

Additional context

This is a blocker for supporting composite keys in the reports. See #835.
