ARGs and SPRs #77

jeromekelleher · 2022-02-08T16:38:18Z

I was interested to see how ARGs in their various states of simplification relate to SPRs. I think the result is interesting.

Full/implicit ARG

Consider the "full" ARG first:

import argutils

ts = argutils.sim_coalescent(4, L=50, rho=0.01, seed=2, resolved=False)
breakpoints = [node.metadata["breakpoint"] for node in ts.nodes() if
    (node.flags & argutils.NODE_IS_RECOMB) != 0]
# Make sure we have no multiple recombinations for simplicity
assert len(breakpoints) == len(set(breakpoints))

print(ts.draw_text())

for interval, edges_out, edges_in in ts.edge_diffs():
    print(interval, len(edges_out), len(edges_in))

which gives us:

1.81┊      15        ┊         15     ┊         15     ┊          15    ┊ 
    ┊   ┏━━━┻━━━┓    ┊      ┏━━━┻━━━┓ ┊      ┏━━━┻━━━┓ ┊       ┏━━━┻━━┓ ┊ 
1.78┊   ┃      14    ┊     14       ┃ ┊     14       ┃ ┊      14      ┃ ┊ 
    ┊   ┃     ┏━┻━━┓ ┊   ┏━━┻━━┓    ┃ ┊   ┏━━┻━━┓    ┃ ┊     ┏━┻━━┓   ┃ ┊ 
1.54┊  12     ┃   13 ┊  13     ┃   12 ┊  13     ┃   12 ┊     ┃   13  12 ┊ 
    ┊   ┃     ┃      ┊   ┃     ┃      ┊   ┃     ┃      ┊     ┃    ┃     ┊ 
1.07┊  11     ┃      ┊  11     ┃      ┊  11     ┃      ┊     ┃   11     ┊ 
    ┊  ┏┻━┓   ┃      ┊  ┏┻━┓   ┃      ┊  ┏┻━┓   ┃      ┊     ┃   ┏┻┓    ┊ 
0.84┊  ┃  ┃  10      ┊  ┃  ┃  10      ┊  ┃  ┃  10      ┊    10   ┃ ┃    ┊ 
    ┊  ┃  ┃  ┏┻━┓    ┊  ┃  ┃  ┏┻━┓    ┊  ┃  ┃  ┏┻━┓    ┊   ┏━┻━┓ ┃ ┃    ┊ 
0.63┊  ┃  ┃  8  9    ┊  ┃  ┃  8  9    ┊  ┃  ┃  9  8    ┊   9   8 ┃ ┃    ┊ 
    ┊  ┃  ┃  ┃       ┊  ┃  ┃  ┃       ┊  ┃  ┃  ┃       ┊   ┃     ┃ ┃    ┊ 
0.60┊  ┃  ┃  7       ┊  ┃  ┃  7       ┊  ┃  ┃  7       ┊   7     ┃ ┃    ┊ 
    ┊  ┃  ┃ ┏┻┓      ┊  ┃  ┃ ┏┻┓      ┊  ┃  ┃ ┏┻┓      ┊  ┏┻━┓   ┃ ┃    ┊ 
0.55┊  5  ┃ ┃ 6      ┊  5  ┃ ┃ 6      ┊  5  ┃ ┃ 6      ┊  6  ┃   ┃ 5    ┊ 
    ┊  ┃  ┃ ┃        ┊  ┃  ┃ ┃        ┊  ┃  ┃ ┃        ┊  ┃  ┃   ┃      ┊ 
0.49┊  4  ┃ ┃        ┊  4  ┃ ┃        ┊  4  ┃ ┃        ┊  4  ┃   ┃      ┊ 
    ┊ ┏┻┓ ┃ ┃        ┊ ┏┻┓ ┃ ┃        ┊ ┏┻┓ ┃ ┃        ┊ ┏┻┓ ┃   ┃      ┊ 
0.00┊ 0 1 2 3        ┊ 0 1 2 3        ┊ 0 1 2 3        ┊ 0 1 3   2      ┊ 
    0               28               41               43               50 

Interval(left=0.0, right=28.0) 0 15
Interval(left=28.0, right=41.0) 1 1
Interval(left=41.0, right=43.0) 1 1
Interval(left=43.0, right=50.0) 1 1

So, it takes 15 edge insertions to build the first tree, and after that each tree transition (SPR) removes exactly one edge and inserts exactly one. For example, in the first transition we remove the edge (11, 12) and insert (11, 13), writing edges as (child, parent). This is a direct consequence of the recombination that occurred at node 11, with left_parent 12 and right_parent 13.

So, I think it's probably true that every tree transition requires exactly one edge change (one edge out, one edge in), but the first tree is very large because of all the non-ancestral topology that we're maintaining.
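As a rough check of the counts above, here is a minimal pure-Python sketch that does not use argutils or tskit. The child-to-parent maps are transcribed by hand from the drawing, and `edge_diff` is a hypothetical helper written for this illustration, not part of any library.

```python
# Child -> parent maps for the four local trees, transcribed by hand from
# the drawing above (an illustration; not produced by argutils or tskit).
tree_0_28 = {
    0: 4, 1: 4, 4: 5, 5: 11, 2: 11, 11: 12, 12: 15,
    3: 7, 6: 7, 7: 8, 8: 10, 9: 10, 10: 14, 13: 14, 14: 15,
}
tree_28_41 = {**tree_0_28, 11: 13}  # node 11 moves from parent 12 to 13
tree_41_43 = {**tree_28_41, 7: 9}   # node 7 moves from parent 8 to 9
tree_43_50 = {**tree_41_43, 4: 6}   # node 4 moves from parent 5 to 6


def edge_diff(prev, nxt):
    """Return (edges_out, edges_in) as sorted lists of (child, parent)."""
    before, after = set(prev.items()), set(nxt.items())
    return sorted(before - after), sorted(after - before)


# Start from an empty tree so the first diff counts the initial insertions.
trees = [{}, tree_0_28, tree_28_41, tree_41_43, tree_43_50]
diffs = [edge_diff(a, b) for a, b in zip(trees, trees[1:])]
for edges_out, edges_in in diffs:
    print(len(edges_out), len(edges_in))
```

The printed counts should match the `edge_diffs()` output above (0 and 15 for the first tree, then 1 and 1 for each transition), and `diffs[1]` holds the (11, 12) and (11, 13) edges from the first transition.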

Reduced/explicit ARG

Consider a different example in which we get rid of the non-ancestral topology:

3.32┊      26   ┊      26   ┊      26   ┊      26   ┊      26   ┊      26   ┊      26   ┊ 
    ┊     ┏━┻━┓ ┊       ┃   ┊       ┃   ┊     ┏━┻━┓ ┊       ┃   ┊       ┃   ┊       ┃   ┊ 
1.69┊     ┃  25 ┊      25   ┊      25   ┊     ┃  25 ┊      25   ┊      25   ┊      25   ┊ 
    ┊     ┃   ┃ ┊       ┃   ┊       ┃   ┊     ┃   ┃ ┊     ┏━┻━┓ ┊     ┏━┻━┓ ┊     ┏━┻━┓ ┊ 
1.66┊    23   ┃ ┊       ┃   ┊       ┃   ┊    23   ┃ ┊    24   ┃ ┊    24   ┃ ┊    24   ┃ ┊ 
    ┊     ┃   ┃ ┊       ┃   ┊       ┃   ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
1.62┊     ┃  22 ┊      22   ┊      22   ┊     ┃  22 ┊     ┃  22 ┊     ┃  22 ┊     ┃  22 ┊ 
    ┊     ┃   ┃ ┊     ┏━┻━┓ ┊     ┏━┻━┓ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
1.52┊    21   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊    21   ┃ ┊    21   ┃ ┊    21   ┃ ┊    21   ┃ ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
1.43┊    20   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊    20   ┃ ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
1.34┊    18   ┃ ┊    19   ┃ ┊    19   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
1.18┊     ┃  17 ┊     ┃  17 ┊     ┃  17 ┊     ┃  17 ┊     ┃  17 ┊     ┃  17 ┊     ┃  17 ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
1.08┊     ┃  16 ┊     ┃  16 ┊     ┃  16 ┊     ┃  16 ┊     ┃  16 ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
1.00┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊    14   ┃ ┊    14   ┃ ┊    14   ┃ ┊    15   ┃ ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
0.99┊     ┃  12 ┊     ┃  12 ┊     ┃  13 ┊     ┃  13 ┊     ┃  13 ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
0.93┊     ┃  10 ┊     ┃  10 ┊     ┃  10 ┊     ┃  10 ┊     ┃  10 ┊     ┃  11 ┊     ┃  11 ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
0.89┊     8   ┃ ┊     8   ┃ ┊     8   ┃ ┊     9   ┃ ┊     9   ┃ ┊     9   ┃ ┊     9   ┃ ┊ 
    ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊     ┃   ┃ ┊ 
0.70┊     7   ┃ ┊     7   ┃ ┊     7   ┃ ┊     7   ┃ ┊     7   ┃ ┊     7   ┃ ┊     7   ┃ ┊ 
    ┊   ┏━┻━┓ ┃ ┊   ┏━┻━┓ ┃ ┊   ┏━┻━┓ ┃ ┊   ┏━┻━┓ ┃ ┊   ┏━┻━┓ ┃ ┊   ┏━┻━┓ ┃ ┊   ┏━┻━┓ ┃ ┊ 
0.33┊   6   ┃ ┃ ┊   6   ┃ ┃ ┊   6   ┃ ┃ ┊   6   ┃ ┃ ┊   6   ┃ ┃ ┊   6   ┃ ┃ ┊   6   ┃ ┃ ┊ 
    ┊  ┏┻━┓ ┃ ┃ ┊  ┏┻━┓ ┃ ┃ ┊  ┏┻━┓ ┃ ┃ ┊  ┏┻━┓ ┃ ┃ ┊  ┏┻━┓ ┃ ┃ ┊  ┏┻━┓ ┃ ┃ ┊  ┏┻━┓ ┃ ┃ ┊ 
0.28┊  5  ┃ ┃ ┃ ┊  5  ┃ ┃ ┃ ┊  5  ┃ ┃ ┃ ┊  5  ┃ ┃ ┃ ┊  5  ┃ ┃ ┃ ┊  5  ┃ ┃ ┃ ┊  5  ┃ ┃ ┃ ┊ 
    ┊ ┏┻┓ ┃ ┃ ┃ ┊ ┏┻┓ ┃ ┃ ┃ ┊ ┏┻┓ ┃ ┃ ┃ ┊ ┏┻┓ ┃ ┃ ┃ ┊ ┏┻┓ ┃ ┃ ┃ ┊ ┏┻┓ ┃ ┃ ┃ ┊ ┏┻┓ ┃ ┃ ┃ ┊ 
0.00┊ 0 4 2 3 1 ┊ 0 4 2 3 1 ┊ 0 4 2 3 1 ┊ 0 4 2 3 1 ┊ 0 4 2 3 1 ┊ 0 4 2 3 1 ┊ 0 4 2 3 1 ┊ 
    0           4           7           8          11          22          33          50

Consider the first transition, in which node 8 switches from left parent 18 to right parent 19 (which is a child of 22). This transition removes five edges and inserts two. The five removals get rid of the unary chain from 8 through 18 (which is no longer ancestral) up to 26: (8, 18), (18, 20), (20, 21), (21, 23) and (23, 26). The two insertions are (8, 19) and (19, 22).

I think this means that we can have arbitrarily many diffs between neighbouring trees, if we're working with this resolved graph.
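To make the unary-chain argument concrete, here is a small pure-Python sketch for the first transition only. The child-to-parent maps are transcribed by hand from the drawing above, and `dead_path` is a hypothetical helper for this illustration; it assumes the removed edges form a single path above the node that switches parent.

```python
# Child -> parent map for the first local tree [0, 4), transcribed by hand
# from the drawing above (illustrative; not produced by argutils or tskit).
tree_0_4 = {
    0: 5, 4: 5, 5: 6, 2: 6, 6: 7, 3: 7, 7: 8,
    8: 18, 18: 20, 20: 21, 21: 23, 23: 26,  # left-hand unary chain
    1: 10, 10: 12, 12: 16, 16: 17, 17: 22, 22: 25, 25: 26,
}
# Second local tree [4, 7): node 8 now reaches 22 via 19, and the chain
# 18 -> 20 -> 21 -> 23 is no longer part of the tree.
tree_4_7 = {c: p for c, p in tree_0_4.items() if c not in (18, 20, 21, 23)}
tree_4_7[8] = 19
tree_4_7[19] = 22


def dead_path(prev, nxt, node):
    """Walk up from node in prev, collecting edges that are absent from nxt."""
    removed = []
    while node in prev and nxt.get(node) != prev[node]:
        removed.append((node, prev[node]))
        node = prev[node]
    return removed


edges_out = dead_path(tree_0_4, tree_4_7, 8)
edges_in = sorted(set(tree_4_7.items()) - set(tree_0_4.items()))
print(len(edges_out), edges_out)
print(len(edges_in), edges_in)
```

The walk stops only when it reaches the root, so the number of removed edges grows with the length of the unary chain above the recombination node, which is why the diff size is not bounded in this representation.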

Fully simplified

We know from Kelleher et al. 2016 (Fig. 5) that the number of edges changing (in and out) between two adjacent trees is 2 (modulo the change from coalescence-record format to edge format).

jeromekelleher · 2022-02-08T16:40:21Z

I'm not sure what we should do with this in terms of the paper, but I thought it was interesting and worth recording here.

jeromekelleher · 2022-02-08T17:27:05Z

> I think this means that we can have arbitrarily many diffs between neighbouring trees, if we're working with this resolved graph.

Thinking about this a bit more, I think this corresponds to having to backtrack through an arbitrary number of nodes in the graph, if you're following the Griffiths traversal rules for generating trees. So it's surely an unavoidable property of working with this graph.

hyanwong Feb 9, 2022
Maintainer

I'm not sure I quite understand here. The edges through 18, 20, 21 and 23 that are removed when moving to the 2nd tree would not be traversed by the Griffiths algorithm once the path through 18 had been cut, would they?

jeromekelleher Feb 9, 2022
Maintainer Author

I'm not sure what I meant here tbh. Need to think about that a bit more...
