Replies: 6 comments 18 replies
-
Interesting... I wonder if it's worth trying this again with a more trustworthy set of recombinants? I think there's a large set of correlated recombinants around the same time that are very likely due to bioinformatics issues, and given that the numbers are quite small, these could be skewing the signal significantly. I think a mixture of max_run_length and averted_mutations is the best way to get at a reliable set, but I don't really know what values to suggest.
-
Filtering out the recombinants where the breakpoint is near large chunks of missing data, and plotting the result next to the diversity, gives a slightly different perspective, I think. Here's what we get when we focus on the subset of 576 recombinants where [...]
I've not done any stats here, but it looks to me like the peak in recombinants detected occurs roughly where BA.1 and BA.2 are maximally coexisting?
In contrast, here's what we get when we look at the whole thing:
The data for the sample composition is here. The columns of interest are date, scorpio and total (the total number of samples processed).
-
I've updated the plots in #423, and @hyanwong was right about the nice peak at BA.1 and BA.2 being artefactual. Here's the updated plot for all(ish) 386 high-quality recombinants:
Here's the follow-up, zoomed in and with the relative fractions of the Scorpio lineages:
-
The graphs have now been updated in the manuscript, and supporting text has been added to both the Results and the STAR Methods.
-
@hyanwong - Here is the regression against het*cases:
![]()
for comparison to the current figure:
![]()
Let's wait until you've read it over to decide whether to add a third regression figure. I did add that the p-values were also assessed by randomly permuting the predictor, to avoid concerns about outliers.
@jeromekelleher - let me know when you have a minute to add the estimated date of the recombination event to the csv, and I'll rerun. Thanks!
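For anyone wanting to reproduce the permutation check, here is a minimal sketch of the general idea, not the exact code used for the manuscript. It assumes a simple least-squares regression with one predictor; the function names, the number of permutations, and the add-one correction are my own choices for illustration.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the slope of y on x.

    The predictor x is shuffled repeatedly, which breaks any real
    association with y while keeping both marginal distributions
    (including outliers) unchanged.
    """
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(slope(shuffled, y)) >= observed:
            hits += 1
    # Add-one correction (an assumption of this sketch) so that the
    # estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because the null distribution is built from the data themselves, a single extreme point cannot produce a spuriously small p-value the way it can under a normality assumption.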
-
I explored the timing of recombination events (929 events by date_added). The recombinants (red) generally occur before the major rise in global cases (blue):
The shift to earlier time points is easier to see on a cumulative plot:
Squaring the number of cases doesn't help the fit (purple), and assuming that the first case happened earlier makes it worse (not shown). [The maximum likelihood fit is actually to cases^0.6, i.e., to a slightly more uniform distribution than the cases themselves.]
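In case it helps to see what a fit of this kind involves, here is a rough sketch, not the code I actually ran. It assumes each recombinant's date is drawn independently with probability proportional to cases^alpha on that day, and finds alpha by a simple grid search; the function names and the grid are made up for illustration.

```python
import math


def log_likelihood(alpha, cases, event_counts):
    """Log-likelihood when the probability that an event falls on a
    given day is proportional to cases**alpha for that day.

    cases[i] is the case count on day i and event_counts[i] is the
    number of recombinants dated to day i. Days with events must have
    a positive case count.
    """
    log_norm = math.log(sum(c ** alpha for c in cases))
    return sum(
        k * (alpha * math.log(c) - log_norm)
        for c, k in zip(cases, event_counts)
        if k > 0
    )


def fit_exponent(cases, event_counts, grid=None):
    """Return the grid value of alpha with the highest log-likelihood."""
    if grid is None:
        grid = [i / 100 for i in range(0, 301)]  # 0.00 to 3.00
    return max(grid, key=lambda a: log_likelihood(a, cases, event_counts))
```

Under this model alpha = 1 means events track cases, alpha = 2 corresponds to squaring the cases, and alpha = 0 is a uniform distribution over days, so an estimate below 1 indicates events spread more evenly than the cases.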
The peak in cases (red) happens during the Delta wave, which is well known to have been undercounted, with massive case numbers in India going unreported. Indeed, looking at the deaths (black) paints an entirely different picture, with the recombinants now coming after the majority of deaths.
Death data is problematic though -- deaths per case rose with Alpha and Delta and then fell with Omicron and vaccines, making it a very imperfect measure of cases.
I'm thus unsure how much weight to put on any of this, lacking solid case data.
One remaining possibility that might be worth exploring is comparing the timing of recombinants with diversity (e.g., phylogenetic diversity among the sequences added each day). This is less about when we expect recombination to happen, and more about the ability to detect those recombinants happening.
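As a starting point, one crude stand-in for diversity would be lineage heterozygosity per day, i.e., the chance that two samples drawn at random from the same day carry different lineage labels. The sketch below assumes the input is a list of (date, lineage) pairs, one per sample; that layout and the function name are hypothetical, and this is not phylogenetic diversity proper.

```python
from collections import Counter, defaultdict


def daily_heterozygosity(samples):
    """Map each date to 1 - sum(p_i**2), where p_i is the fraction of
    that day's samples assigned to lineage i.

    samples is an iterable of (date, lineage) pairs, one per sample.
    """
    by_date = defaultdict(Counter)
    for date, lineage in samples:
        by_date[date][lineage] += 1

    result = {}
    for date, counts in by_date.items():
        n = sum(counts.values())
        result[date] = 1.0 - sum((k / n) ** 2 for k in counts.values())
    return result
```

A day dominated by a single lineage scores near 0, while a day with two lineages at equal frequency scores 0.5, which is when a recombinant between them would be easiest to detect.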