Commit c17914a

neuralsorcerer authored and meta-codesync[bot] committed
Add distribution diagnostics for BalanceDF (#265)
Summary:
- Added weighted EMD/CVMD/KS computation helpers and comparison functions in the weighted stats module.
- Exposed EMD/CVMD/KS BalanceDF helper methods and public comparison APIs for linked samples and direct targets.
- Added appropriate tests for EMD/CVMD/KS covering identical distributions, weighted effects, expected discrete/numeric values, validation errors, and NA-indicator skipping.

Pull Request resolved: #265
Differential Revision: D90854392
Pulled By: talgalili
fbshipit-source-id: 6a29d41d960131b58ade6ee139de0466fc8d0b0d
1 parent cd507a1 commit c17914a

7 files changed: +2322 -1027 lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -27,6 +27,9 @@
 - `descriptive_stats()` now accepts a `formula` argument that is always
   applied to the data (including numeric-only frames), letting callers
   control which terms and dummy variables are included in summary statistics.
+- **Added EMD/CVMD/KS distribution diagnostics**
+  - `BalanceDF` now exposes Earth Mover's Distance (EMD), Cramér-von Mises distance (CVMD), and Kolmogorov-Smirnov (KS) statistics for comparing adjusted samples to targets.
+  - These diagnostics support weighted or unweighted comparisons, apply discrete/continuous formulations, and respect `aggregate_by_main_covar` for one-hot categorical aggregation.

 ## Bug Fixes

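For readers unfamiliar with the three statistics named in this changelog entry, the following is a minimal sketch of how weighted EMD, CVMD, and KS can be computed for a single numeric covariate from weighted empirical CDFs. It is not the code added by this commit: the function names `weighted_ecdf` and `distribution_distances` are hypothetical, and the CVMD convention used here (squared CDF gap integrated against the equally pooled sample/target mass) is one common choice and an assumption, not necessarily the formulation used in the library.

```python
from bisect import bisect_right
from itertools import accumulate


def weighted_ecdf(values, weights, grid):
    """Evaluate the weighted empirical CDF of `values` at each point of `grid`."""
    pairs = sorted(zip(values, weights))
    sorted_values = [v for v, _ in pairs]
    total = sum(w for _, w in pairs)
    # cumulative[k] is the normalized weight of the k smallest observations.
    cumulative = [0.0] + [c / total for c in accumulate(w for _, w in pairs)]
    return [cumulative[bisect_right(sorted_values, g)] for g in grid]


def distribution_distances(x, y, wx=None, wy=None):
    """Return EMD, CVMD, and KS between two weighted numeric samples.

    Hypothetical helper for illustration only; the CVMD definition is an
    assumed convention (pooled mass = average of the two weighted masses).
    """
    wx = [1.0] * len(x) if wx is None else list(wx)
    wy = [1.0] * len(y) if wy is None else list(wy)

    # Both CDFs are step functions that only change at observed values.
    grid = sorted(set(x) | set(y))
    fx = weighted_ecdf(x, wx, grid)
    fy = weighted_ecdf(y, wy, grid)
    gaps = [abs(a - b) for a, b in zip(fx, fy)]

    # KS: largest vertical gap between the two CDFs.
    ks = max(gaps)

    # EMD (1-Wasserstein in one dimension): area between the two CDFs.
    emd = sum(gap * (hi - lo) for gap, lo, hi in zip(gaps, grid, grid[1:]))

    # CVMD: squared gap weighted by the pooled probability mass at each point.
    mass_x = [b - a for a, b in zip([0.0] + fx, fx)]
    mass_y = [b - a for a, b in zip([0.0] + fy, fy)]
    cvmd = sum(
        gap * gap * 0.5 * (mx + my)
        for gap, mx, my in zip(gaps, mass_x, mass_y)
    )

    return {"emd": emd, "cvmd": cvmd, "ks": ks}


if __name__ == "__main__":
    sample = [0.0, 1.0]
    target = [0.0, 1.0]
    # Up-weighting the larger sample value shifts mass away from the target.
    print(distribution_distances(sample, target, wx=[1.0, 3.0]))
```

All three statistics are zero when the weighted sample and target distributions coincide, which matches the "identical distributions" case the commit summary says the tests cover. Changing the weights changes the CDF and therefore each distance, which is the "weighted effects" case.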