Commit c7fcb79
Extract common pattern in comparison stat functions to reduce CCN (#267)
Summary:
- [x] Create a generic helper function `_apply_comparison_stat_to_BalanceDF` to reduce code duplication
- [x] Update `_asmd_BalanceDF` to use the new helper function
- [x] Update `_kld_BalanceDF` to use the new helper function
- [x] Update `_emd_BalanceDF` to use the new helper function
- [x] Update `_cvmd_BalanceDF` to use the new helper function
- [x] Update `_ks_BalanceDF` to use the new helper function
- [x] Run tests to ensure all functionality remains unchanged (all 83 tests in test_balancedf.py pass)
- [x] Run code review and security checks (no issues found)
- [x] Add direct test coverage for _kld_BalanceDF, _emd_BalanceDF, _cvmd_BalanceDF, and _ks_BalanceDF (5 new tests added)
- [x] Fix flake8 linting errors (removed trailing whitespace from blank lines)
- [x] Fix ufmt formatting errors (formatted test file with ufmt)
Successfully extracted the common pattern from five similar functions (`_asmd_BalanceDF`, `_kld_BalanceDF`, `_emd_BalanceDF`, `_cvmd_BalanceDF`, `_ks_BalanceDF`) into a single generic helper function `_apply_comparison_stat_to_BalanceDF`.
### Changes Made
- Added `Callable` to the imports from `typing` module
- Created `_apply_comparison_stat_to_BalanceDF` helper function that:
1. Validates inputs are BalanceDF objects
2. Extracts df and weights from both objects
3. Calls the comparison function with the extracted data
- Refactored all five comparison functions to use the helper (reduced from ~30 lines to ~5 lines each)
- Maintains special handling for `_asmd_BalanceDF` which passes `std_type="target"` via kwargs
- **Added comprehensive test coverage** for all comparison methods:
- `test_BalanceDF__kld_BalanceDF`: Direct test of _kld_BalanceDF method
- `test_BalanceDF__emd_BalanceDF`: Direct test of _emd_BalanceDF method
- `test_BalanceDF__cvmd_BalanceDF`: Direct test of _cvmd_BalanceDF method
- `test_BalanceDF__ks_BalanceDF`: Direct test of _ks_BalanceDF method
- `test_BalanceDF_comparison_functions_invalid_input`: Tests input validation for all methods
- **Fixed flake8 linting errors**: Removed trailing whitespace from blank lines in test file
- **Fixed ufmt formatting**: Formatted test file according to project standards (black + usort)
### Test Coverage
- All tests now directly exercise the helper function through the four comparison methods
- Tests verify correct Series output with expected keys (a, b, mean(metric))
- Tests verify mathematical properties (non-negativity, bounded ranges)
- Tests verify aggregate_by_main_covar parameter works
- Tests verify proper input validation with clear error messages
- Total tests: 88 (83 original + 5 new), all passing
- **Code quality compliance**: All linting (flake8) and formatting (ufmt) checks pass
### Benefits
- **Reduces Cyclomatic Complexity Number (CCN)** - the original goal of the issue
- **Eliminates code duplication** - DRY principle applied
- **Easier maintenance** - future changes only need to be made in one place
- **Type safety** - added proper type hint for the callable parameter
- **No behavioral changes** - all 83 existing tests pass without modification
- **Comprehensive test coverage** - direct tests for all comparison methods and edge cases
- **Code quality** - passes all linting and formatting checks
<details>
<summary>Original prompt</summary>
>
> ----
>
> *This section details on the original issue you should resolve*
>
> <issue_title>[FEATURE] generalize functions in balance/balancedf_class.py</issue_title>
> <issue_description>a bunch of functions in
> balance/balancedf_class.py
>
> Follow the exact same pattern (other than one word change:
> staticmethod
> def _emd_BalanceDF(
> sample_BalanceDF: "BalanceDF",
> target_BalanceDF: "BalanceDF",
> aggregate_by_main_covar: bool = False,
> ) -> pd.Series:
> """Run EMD on two BalanceDF objects.
>
> Prepares the BalanceDF objects by passing them through :func:`_get_df_and_weights`, and
> then passes the df and weights into :func:`weighted_comparisons_stats.emd`.
>
> Args:
> sample_BalanceDF (BalanceDF): Object.
> target_BalanceDF (BalanceDF): Object.
> aggregate_by_main_covar (bool, optional): See :func:`weighted_comparisons_stats.emd`. Defaults to False.
>
> Returns:
> pd.Series: See :func:`weighted_comparisons_stats.emd`.
> """
> BalanceDF._check_if_not_BalanceDF(sample_BalanceDF, "sample_BalanceDF")
> BalanceDF._check_if_not_BalanceDF(target_BalanceDF, "target_BalanceDF")
>
> sample_df_values, sample_weights = sample_BalanceDF._get_df_and_weights()
> target_df_values, target_weights = target_BalanceDF._get_df_and_weights()
>
> return weighted_comparisons_stats.emd(
> sample_df_values,
> target_df_values,
> sample_weights,
> target_weights,
> aggregate_by_main_covar=aggregate_by_main_covar,
> )
>
>
>
> Extract this pattern to a helper function to reduce CCN.
>
> </issue_description>
>
> ## Comments on the Issue (you are copilot in this section)
>
> <comments>
> </comments>
>
</details>
- Fixes #266
---
💬 We'd love your input! Share your thoughts on Copilot coding agent in our [2 minute survey](https://gh.io/copilot-coding-agent-survey).
Pull Request resolved: #267
Differential Revision: D90870323
Pulled By: talgalili
fbshipit-source-id: 064e3878506e1f65726ce1302fc692e9ec7946761 parent c17914a commit c7fcb79
2 files changed
+200
-61
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
11 | | - | |
| 11 | + | |
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
| |||
1110 | 1110 | | |
1111 | 1111 | | |
1112 | 1112 | | |
| 1113 | + | |
| 1114 | + | |
| 1115 | + | |
| 1116 | + | |
| 1117 | + | |
| 1118 | + | |
| 1119 | + | |
| 1120 | + | |
| 1121 | + | |
| 1122 | + | |
| 1123 | + | |
| 1124 | + | |
| 1125 | + | |
| 1126 | + | |
| 1127 | + | |
| 1128 | + | |
| 1129 | + | |
| 1130 | + | |
| 1131 | + | |
| 1132 | + | |
| 1133 | + | |
| 1134 | + | |
| 1135 | + | |
| 1136 | + | |
| 1137 | + | |
| 1138 | + | |
| 1139 | + | |
| 1140 | + | |
| 1141 | + | |
| 1142 | + | |
| 1143 | + | |
| 1144 | + | |
| 1145 | + | |
| 1146 | + | |
| 1147 | + | |
| 1148 | + | |
| 1149 | + | |
| 1150 | + | |
| 1151 | + | |
| 1152 | + | |
| 1153 | + | |
| 1154 | + | |
| 1155 | + | |
| 1156 | + | |
1113 | 1157 | | |
1114 | 1158 | | |
1115 | 1159 | | |
| |||
1156 | 1200 | | |
1157 | 1201 | | |
1158 | 1202 | | |
1159 | | - | |
1160 | | - | |
1161 | | - | |
1162 | | - | |
1163 | | - | |
1164 | | - | |
1165 | | - | |
1166 | | - | |
1167 | | - | |
1168 | | - | |
1169 | | - | |
| 1203 | + | |
| 1204 | + | |
| 1205 | + | |
| 1206 | + | |
| 1207 | + | |
1170 | 1208 | | |
1171 | | - | |
1172 | 1209 | | |
1173 | 1210 | | |
1174 | 1211 | | |
| |||
1190 | 1227 | | |
1191 | 1228 | | |
1192 | 1229 | | |
1193 | | - | |
1194 | | - | |
1195 | | - | |
1196 | | - | |
1197 | | - | |
1198 | | - | |
1199 | | - | |
1200 | | - | |
1201 | | - | |
1202 | | - | |
1203 | | - | |
1204 | | - | |
| 1230 | + | |
| 1231 | + | |
| 1232 | + | |
| 1233 | + | |
| 1234 | + | |
1205 | 1235 | | |
1206 | 1236 | | |
1207 | 1237 | | |
| |||
1223 | 1253 | | |
1224 | 1254 | | |
1225 | 1255 | | |
1226 | | - | |
1227 | | - | |
1228 | | - | |
1229 | | - | |
1230 | | - | |
1231 | | - | |
1232 | | - | |
1233 | | - | |
1234 | | - | |
1235 | | - | |
1236 | | - | |
1237 | | - | |
| 1256 | + | |
| 1257 | + | |
| 1258 | + | |
| 1259 | + | |
| 1260 | + | |
1238 | 1261 | | |
1239 | 1262 | | |
1240 | 1263 | | |
| |||
1256 | 1279 | | |
1257 | 1280 | | |
1258 | 1281 | | |
1259 | | - | |
1260 | | - | |
1261 | | - | |
1262 | | - | |
1263 | | - | |
1264 | | - | |
1265 | | - | |
1266 | | - | |
1267 | | - | |
1268 | | - | |
1269 | | - | |
1270 | | - | |
| 1282 | + | |
| 1283 | + | |
| 1284 | + | |
| 1285 | + | |
| 1286 | + | |
1271 | 1287 | | |
1272 | 1288 | | |
1273 | 1289 | | |
| |||
1289 | 1305 | | |
1290 | 1306 | | |
1291 | 1307 | | |
1292 | | - | |
1293 | | - | |
1294 | | - | |
1295 | | - | |
1296 | | - | |
1297 | | - | |
1298 | | - | |
1299 | | - | |
1300 | | - | |
1301 | | - | |
1302 | | - | |
1303 | | - | |
| 1308 | + | |
| 1309 | + | |
| 1310 | + | |
| 1311 | + | |
| 1312 | + | |
1304 | 1313 | | |
1305 | 1314 | | |
1306 | 1315 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1301 | 1301 | | |
1302 | 1302 | | |
1303 | 1303 | | |
| 1304 | + | |
| 1305 | + | |
| 1306 | + | |
| 1307 | + | |
| 1308 | + | |
| 1309 | + | |
| 1310 | + | |
| 1311 | + | |
| 1312 | + | |
| 1313 | + | |
| 1314 | + | |
| 1315 | + | |
| 1316 | + | |
| 1317 | + | |
| 1318 | + | |
| 1319 | + | |
| 1320 | + | |
| 1321 | + | |
| 1322 | + | |
| 1323 | + | |
| 1324 | + | |
| 1325 | + | |
| 1326 | + | |
| 1327 | + | |
| 1328 | + | |
| 1329 | + | |
| 1330 | + | |
| 1331 | + | |
| 1332 | + | |
| 1333 | + | |
| 1334 | + | |
| 1335 | + | |
| 1336 | + | |
| 1337 | + | |
| 1338 | + | |
| 1339 | + | |
| 1340 | + | |
| 1341 | + | |
| 1342 | + | |
| 1343 | + | |
| 1344 | + | |
| 1345 | + | |
| 1346 | + | |
| 1347 | + | |
| 1348 | + | |
| 1349 | + | |
| 1350 | + | |
| 1351 | + | |
| 1352 | + | |
| 1353 | + | |
| 1354 | + | |
| 1355 | + | |
| 1356 | + | |
| 1357 | + | |
| 1358 | + | |
| 1359 | + | |
| 1360 | + | |
| 1361 | + | |
| 1362 | + | |
| 1363 | + | |
| 1364 | + | |
| 1365 | + | |
| 1366 | + | |
| 1367 | + | |
| 1368 | + | |
| 1369 | + | |
| 1370 | + | |
| 1371 | + | |
| 1372 | + | |
| 1373 | + | |
| 1374 | + | |
| 1375 | + | |
| 1376 | + | |
| 1377 | + | |
| 1378 | + | |
| 1379 | + | |
| 1380 | + | |
| 1381 | + | |
| 1382 | + | |
| 1383 | + | |
| 1384 | + | |
| 1385 | + | |
| 1386 | + | |
| 1387 | + | |
| 1388 | + | |
| 1389 | + | |
| 1390 | + | |
| 1391 | + | |
| 1392 | + | |
| 1393 | + | |
| 1394 | + | |
| 1395 | + | |
| 1396 | + | |
| 1397 | + | |
| 1398 | + | |
| 1399 | + | |
| 1400 | + | |
| 1401 | + | |
| 1402 | + | |
| 1403 | + | |
| 1404 | + | |
| 1405 | + | |
| 1406 | + | |
| 1407 | + | |
| 1408 | + | |
| 1409 | + | |
| 1410 | + | |
| 1411 | + | |
| 1412 | + | |
| 1413 | + | |
| 1414 | + | |
| 1415 | + | |
| 1416 | + | |
| 1417 | + | |
| 1418 | + | |
| 1419 | + | |
| 1420 | + | |
| 1421 | + | |
| 1422 | + | |
| 1423 | + | |
| 1424 | + | |
| 1425 | + | |
| 1426 | + | |
| 1427 | + | |
| 1428 | + | |
| 1429 | + | |
| 1430 | + | |
| 1431 | + | |
| 1432 | + | |
| 1433 | + | |
1304 | 1434 | | |
1305 | 1435 | | |
1306 | 1436 | | |
| |||
0 commit comments