-
Notifications
You must be signed in to change notification settings - Fork 9
Generalize recombination map support to all scripts #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
singer_master already supports it.
f50ba81 to
31a0f2a
Compare
| arg = ARG(Ne, sequence_length); | ||
| arg.discretize(bin_size); | ||
| arg.build_singleton_arg(n); | ||
| if (mut_rate > 0 and recomb_rate > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was always true, because of lines 20 and 21 above.
SINGER takes -start which allows you to run on a segment of the genome other than starting at basepair=0. Previously the Rate_map ignored this, meaning it was emitting recombination/mutation rates for the wrong region. It could also end up producing queries for coordinates that were prior to the start of the rate map, due to mutations being offset incorrectly.
| double p = (actualCoordinate - coordinates[index])/(coordinates[index+1] - coordinates[index]); | ||
| double dist = (1-p)*prev_dist + p*next_dist; | ||
| return dist; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likely Rate_map::mean_rate() also needs to be adjusted based on the region between start_pos and end_pos instead of taking the mean of the entire map. hopefully it is not a big deal, given that parallel_singer uses regions of 1Mbp that hopefully have means close to the full mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix makes sense (to me) because the alternative is that user has to adjust the recombination map many times, which could be cumbersome.
AprilWei001
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix makes sense (to me) because the alternative is that a user has to adjust the recombination map many times, which could be cumbersome.
| double p = (actualCoordinate - coordinates[index])/(coordinates[index+1] - coordinates[index]); | ||
| double dist = (1-p)*prev_dist + p*next_dist; | ||
| return dist; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix makes sense (to me) because the alternative is that user has to adjust the recombination map many times, which could be cumbersome.
Having recombination rate of 0 causes NaNs in downstream probabilities, and forbidding SINGER from allowing recombination is likely difficult to do. Instead, just impose a very small recombination rate in any region that has too low of a rate.
AprilWei001
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me that this needs to be fixed.
A design choice for Yun to decide is perhaps whether it's best to fix it here or at the place where recombination events are being sampled.
There were some conflicting argument checks that made it challenging (impossible?) to use recombination map support. Also there was an early return in
singer_masterthat prevented the recombination map code from being reachable.This makes it so that you have to pass both
-recomb_mapand-mut_maptogther, or neither of them in which case-mis required. This seemed to be the expected existing behavior. Updatedparallel_singerto also support rate map arguments.There is still a bug that I have not fixed, which is the RateMap read will silently fail if you pass in a incorrectly formatted rate map file.