Commit 7a0b863

lkirk authored and mergify[bot] committed
Add the outer ld_matrix python code
Overall, the goal was to mirror other sample set stat design patterns, with the exception that we dispatch each low level stat matrix from the ld_matrix function. We also fix some behavior uncovered in testing:

* Ensure the D_prime statistic does not divide by zero (opened #2907 to investigate if this is something we actually want to do).
* Look up node from sample_index_map when finding samples under mutation nodes.
* Do not error out if no sites are specified, instead return an array with (0, 0) dimensions. Remove associated test, add one for early return.

Exhaustive test coverage was added, but more could be done to validate the results from these tests:

Add LD matrix tests, implementing the rest of the summary functions. Mirror the ld_matrix behavior. More thorough testing of various data partitions. Compare to LD calculator, and use all of the example tree sequences from test_highlevel.
1 parent 04e04aa commit 7a0b863

4 files changed: +398 -54 lines changed

c/tests/test_stats.c

Lines changed: 2 additions & 1 deletion
@@ -2583,9 +2583,10 @@ test_two_locus_stat_input_errors(void)
     row_sites[0] = 0;
     row_sites[1] = 1;

+    // Not an error condition, but we want to record this behavior
     ret = tsk_treeseq_r2(&ts, num_sample_sets, sample_set_sizes, sample_sets, 0, NULL, 0,
         NULL, 0, result);
-    CU_ASSERT_EQUAL_FATAL(ret, TSK_ERR_BAD_SITE_POSITION);
+    CU_ASSERT_EQUAL_FATAL(ret, 0);

     tsk_treeseq_free(&ts);
     tsk_safe_free(row_sites);

c/tskit/trees.c

Lines changed: 7 additions & 7 deletions
@@ -2471,8 +2471,8 @@ get_mutation_samples(const tsk_treeseq_t *ts, const tsk_id_t *sites, tsk_size_t
             for (n = 0; n < num_nodes; n++) {
                 node = nodes[n];
                 if (flags[node] & TSK_NODE_IS_SAMPLE) {
-                    tsk_bit_array_add_bit(
-                        &mut_samples_row, (tsk_bit_array_value_t) node);
+                    tsk_bit_array_add_bit(&mut_samples_row,
+                        (tsk_bit_array_value_t) ts->sample_index_map[node]);
                 }
             }
         }
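
The hunk above switches from setting the bit for a node ID to setting the bit for that node's sample index. A minimal, self-contained sketch of why that matters follows. It assumes the bit array has one bit per sample and that sample nodes need not be numbered from zero. All `example_*` names and `EXAMPLE_NULL` are hypothetical and are not part of tskit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_NULL (-1) /* hypothetical marker for "node is not a sample" */

/* Fill index_map so that index_map[node] is the position of node in
 * samples, or EXAMPLE_NULL when the node is not a sample. */
static void
example_build_sample_index_map(const int32_t *samples, size_t num_samples,
    int32_t *index_map, size_t num_nodes)
{
    size_t j;

    for (j = 0; j < num_nodes; j++) {
        index_map[j] = EXAMPLE_NULL;
    }
    for (j = 0; j < num_samples; j++) {
        assert(samples[j] >= 0 && (size_t) samples[j] < num_nodes);
        index_map[samples[j]] = (int32_t) j;
    }
}

/* Set the bit for a sample index in an array of 32-bit words. */
static void
example_add_bit(uint32_t *bits, uint32_t sample_index)
{
    bits[sample_index / 32] |= (uint32_t) 1 << (sample_index % 32);
}

/* Return 1 if the bit for sample_index is set, 0 otherwise. */
static int
example_has_bit(const uint32_t *bits, uint32_t sample_index)
{
    return (int) ((bits[sample_index / 32] >> (sample_index % 32)) & 1u);
}
```

With three samples stored at node IDs 40, 41 and 45, a single 32-bit word holds the whole row. Using the node ID 45 directly would address a second word that does not exist, whereas the mapped index 2 stays in range.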
@@ -2612,9 +2612,8 @@ check_sites(const tsk_id_t *sites, tsk_size_t num_sites, tsk_size_t num_site_row
     int ret = 0;
     tsk_size_t i;

-    if (sites == NULL || num_sites == 0) {
-        ret = TSK_ERR_BAD_SITE_POSITION; // TODO: error should be no sites?
-        goto out;
+    if (num_sites == 0) {
+        return ret; // No need to verify sites if there aren't any
     }

     for (i = 0; i < num_sites - 1; i++) {
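
The early return in this hunk also protects the loop bound: with an unsigned count, `num_sites - 1` wraps to a very large value when `num_sites` is zero. The sketch below illustrates the pattern in isolation. The ordering check in the loop body is an assumption made for illustration, since the hunk does not show what the real loop verifies, and `example_check_sites` and `EXAMPLE_ERR_UNSORTED` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_ERR_UNSORTED (-1) /* hypothetical error code */

/* Return 0 if sites is empty or strictly increasing, otherwise an error.
 * The strictly-increasing rule is an assumed stand-in for the real check. */
static int
example_check_sites(const int32_t *sites, size_t num_sites)
{
    size_t i;

    if (num_sites == 0) {
        /* Nothing to verify. Returning here also keeps the unsigned
         * expression num_sites - 1 below from wrapping around. */
        return 0;
    }
    for (i = 0; i < num_sites - 1; i++) {
        if (sites[i] >= sites[i + 1]) {
            return EXAMPLE_ERR_UNSORTED;
        }
    }
    return 0;
}
```

Because the guard tests only the count, a `NULL` pointer with a count of zero is accepted, which matches the `0, NULL, 0, NULL` arguments used in the updated test.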
@@ -3638,9 +3637,10 @@ D_prime_summary_func(tsk_size_t state_dim, const double *state,
         double p_B = p_AB + p_aB;

         double D = p_AB - (p_A * p_B);
-        if (D >= 0) {
+        result[j] = 0;
+        if (D > 0) {
             result[j] = D / TSK_MIN(p_A * (1 - p_B), (1 - p_A) * p_B);
-        } else {
+        } else if (D < 0) {
             result[j] = D / TSK_MIN(p_A * p_B, (1 - p_A) * (1 - p_B));
         }
     }
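
To see what the D_prime change guards against, here is a standalone sketch of the same branching on plain haplotype frequencies. It assumes `p_A = p_AB + p_Ab`, by analogy with the `p_B` line shown in the hunk. `example_d_prime` and `EXAMPLE_MIN` are hypothetical names, not tskit functions.

```c
#define EXAMPLE_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Compute D' from the frequencies of haplotypes AB, Ab and aB.
 * The result defaults to 0 and is only rescaled when D is nonzero. */
static double
example_d_prime(double p_AB, double p_Ab, double p_aB)
{
    double p_A = p_AB + p_Ab; /* assumed definition, see lead-in */
    double p_B = p_AB + p_aB;
    double D = p_AB - (p_A * p_B);
    double result = 0;

    if (D > 0) {
        result = D / EXAMPLE_MIN(p_A * (1 - p_B), (1 - p_A) * p_B);
    } else if (D < 0) {
        result = D / EXAMPLE_MIN(p_A * p_B, (1 - p_A) * (1 - p_B));
    }
    return result;
}
```

When every sample carries AB, both marginal frequencies are 1 and D is 0, so the old `D >= 0` branch would have evaluated 0 / 0. With valid frequencies, a strictly positive or strictly negative D implies the matching denominator is positive, so the two remaining branches do not divide by zero.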
