
Possible performance degradation for high cardinality columns in Contingency Similarity (affecting Quality Report) #589

@npatki

Description


Environment Details

  • SDMetrics version: 0.14.1

Error Description

In the Quality Report, the Column Pair Trends and Intertable Trends properties both use the ContingencySimilarity metric to compute a score.

The underlying metric may perform poorly when a column has extremely high cardinality. When computing the metric between two columns A and B, it builds the cross-tabulation (contingency table) of the two columns. E.g. if column A is categorical with cardinality a and column B is categorical with cardinality b, then the contingency table contains a × b cells. This can become slow when a or b is very large.
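To illustrate the size argument (this is a hypothetical pandas sketch, not the SDMetrics internals): the cross-tabulation has one cell per (category of A, category of B) pair, so its size is the product of the two cardinalities.

```python
import pandas as pd

# Two categorical columns: A has 3 distinct values, B has 2.
real = pd.DataFrame({
    'A': ['x', 'x', 'y', 'z'],
    'B': ['p', 'q', 'p', 'q'],
})

# The contingency table has cardinality(A) x cardinality(B) cells,
# regardless of how many of those pairs actually occur in the data.
table = pd.crosstab(real['A'], real['B'], normalize=True)
print(table.shape)  # (3, 2)
```

With, say, a = 100,000 and b = 1,000, the table already holds 100 million cells, which is where the slowdown comes from.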

Additional Context

We are not interested in replacing ContingencySimilarity with another metric. Rather, we should optimize its performance. Some ideas include:

  • examining the base operations used for cross-tabulation and determining whether faster alternatives exist
  • taking a random subset of the rows
  • considering only the top n most frequently occurring categories for the cross-tabulation (where the top n is computed from the real data alone, and the same exact set of n categories is applied to the synthetic data)
  • etc.
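The "top n" idea above could look something like the following sketch. The helper name `limit_categories` and the `'__other__'` bucket are illustrative choices, not part of SDMetrics; the key property is that the top n categories are chosen from the real data only and the same set is applied to the synthetic data.

```python
import pandas as pd

def limit_categories(real_col, synthetic_col, n):
    """Keep the n most frequent categories of the REAL column; bucket the
    rest (in both columns) into a single '__other__' category."""
    top = real_col.value_counts().nlargest(n).index
    real_limited = real_col.where(real_col.isin(top), '__other__')
    synth_limited = synthetic_col.where(synthetic_col.isin(top), '__other__')
    return real_limited, synth_limited

real = pd.Series(['a'] * 5 + ['b'] * 3 + ['c', 'd'])
synth = pd.Series(['a', 'b', 'c', 'd', 'e'] * 2)

real_l, synth_l = limit_categories(real, synth, n=2)
print(sorted(real_l.unique()))   # ['__other__', 'a', 'b']
print(sorted(synth_l.unique()))  # ['__other__', 'a', 'b']
```

This caps the contingency table at (n + 1) × (n + 1) cells no matter how high the original cardinality is.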

Any solution will have to be vetted to ensure that the overall quality score being returned does not differ too much from the status quo.
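A minimal way to vet a candidate optimization is to compute the score both ways and compare. The sketch below assumes the score is 1 minus half the total variation distance between the normalized real and synthetic contingency tables, which is the formula the SDMetrics docs describe for ContingencySimilarity; the function here is a simplified stand-in, not the library implementation.

```python
import pandas as pd

def contingency_similarity(real, synthetic, col_a, col_b):
    """Simplified stand-in for ContingencySimilarity:
    1 - 0.5 * sum(|real_freq - synth_freq|) over all (A, B) cells."""
    r = pd.crosstab(real[col_a], real[col_b], normalize=True)
    s = pd.crosstab(synthetic[col_a], synthetic[col_b], normalize=True)
    # Align so (A, B) pairs missing from one table count as frequency 0.
    r, s = r.align(s, fill_value=0)
    return 1 - 0.5 * (r - s).abs().to_numpy().sum()

real = pd.DataFrame({'A': ['x', 'x', 'y'], 'B': ['p', 'q', 'p']})
synth = pd.DataFrame({'A': ['x', 'y', 'y'], 'B': ['p', 'p', 'p']})

baseline = contingency_similarity(real, synth, 'A', 'B')
# An optimized variant would be run on the same inputs, and the two
# scores compared against a tolerance agreed on in advance.
```

Running both code paths over a corpus of representative datasets and asserting the per-pair score difference stays within a small tolerance would give concrete evidence that the overall quality score is preserved.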

Metadata

Labels

    bug (Something isn't working) · feature:metrics (Related to any of the individual metrics) · feature:reports (Related to any of the generated reports)
