@@ -565,6 +565,17 @@ Because the recombination rate from position 10 to 20 is so much
565565higher than the rate from 0 to 10, we can see that all the recombinations
566566have fallen in the high recombination region.
567567
568+ :::{note}
569+ The units of recombination and other rates in msprime are
570+ **per unit sequence length, per unit time**. If you have set
571+ effective population sizes in units of individuals and
572+ are measuring sequence lengths in base pairs, then the units
573+ for recombination rate will be per base pair, per generation.
574+ Other choices are possible (e.g., time in "coalescent units" and
575+ recombination breakpoints on a
576+ {ref}` continuous unit interval<sec_ancestry_discrete_genome> ` .)
577+ :::
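
As a rough illustration of what "per unit sequence length, per unit time" implies, the sketch below uses made-up numbers (they are not taken from this documentation) to show that changing the unit of sequence length requires rescaling the rate so that the product stays the same. It uses plain Python arithmetic and does not call msprime.

```python
import math

# Hypothetical values, chosen only to illustrate the units.
rate_per_bp = 1e-8       # per base pair, per generation
length_bp = 1_000_000    # sequence length in base pairs

# Expected number of crossovers along one lineage in one generation.
expected_bp = rate_per_bp * length_bp

# The same genome measured in kilobases: the rate must be rescaled to match.
rate_per_kb = rate_per_bp * 1_000    # per kilobase, per generation
length_kb = length_bp / 1_000
expected_kb = rate_per_kb * length_kb

# Both parameterisations describe the same process.
print(expected_bp, expected_kb)
print(math.isclose(expected_bp, expected_kb))
```

The same reasoning applies to the time unit: if time is not measured in generations, the rate has to be expressed per that time unit instead.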
578+
568579:::{seealso}
569580- See the {ref}` sec_rate_maps_creating ` section for more examples of
570581creating {class}` .RateMap ` instances.
@@ -700,7 +711,7 @@ simulations may be preferable as they will be much more efficient.
700711
701712That being said, this section describes how to simulate data over
702713multiple chromosomes, or more generally, over multiple regions
703- with free recombination between them.
714+ with free recombination between them.
704715
705716#### Using a fixed pedigree
706717
@@ -737,7 +748,7 @@ sequence length.
737748#### Simulations without a fixed pedigree
738749
739750Without a pedigree, we can use other simulation models in msprime such as
740- {ref}` DTWF <sec_ancestry_models_dtwf> ` and the
751+ {ref}` DTWF <sec_ancestry_models_dtwf> ` and the
741752{ref}` multiple merger coalescent <sec_ancestry_models_multiple_mergers> ` .
742753These models do not directly support simulating multiple chromosomes
743754simultaneously, but we can emulate it using a single linear genome split into
@@ -956,26 +967,26 @@ In `msprime` we usually want to simulate the coalescent with recombination
956967and represent the output as efficiently as possible. As a result, we don't
957968store individual recombination events, but rather their effects on the output
958969tree sequence. We also do not explicitly store common ancestor events that
959- do not result in marginal coalescences. For some purposes, however,
960- we want to record information on other events of interest, not just the minimal
970+ do not result in marginal coalescences. For some purposes, however,
971+ we want to record information on other events of interest, not just the minimal
961972representation of its outcome.
962973
963- The ` additional_nodes ` and
964- {ref}` coalescing_segments_only <sec_additional_nodes_cso> ` options serve
965- this exact purpose. These options allow us to record the nodes associated with
966- a custom subset of all events we might observe in the history of a sample.
967- Besides samples and coalescence events, nodes can now also represent
968- {ref}` common ancestor events <sec_additional_nodes_ca> ` ,
969- {ref}` recombination <sec_additional_nodes_re> ` ,
970- {ref}` gene conversion <sec_additional_nodes_re> ` ,
971- {ref}` migration <sec_ancestry_record_migrations> ` ,
972- and {ref}` pass through <sec_additional_nodes_re> ` events. The example below
973- and the next few paragraphs provide a guide on how
974+ The ` additional_nodes ` and
975+ {ref}` coalescing_segments_only <sec_additional_nodes_cso> ` options serve
976+ this exact purpose. These options allow us to record the nodes associated with
977+ a custom subset of all events we might observe in the history of a sample.
978+ Besides samples and coalescence events, nodes can now also represent
979+ {ref}` common ancestor events <sec_additional_nodes_ca> ` ,
980+ {ref}` recombination <sec_additional_nodes_re> ` ,
981+ {ref}` gene conversion <sec_additional_nodes_re> ` ,
982+ {ref}` migration <sec_ancestry_record_migrations> ` ,
983+ and {ref}` pass through <sec_additional_nodes_re> ` events. The example below
984+ and the next few paragraphs provide a guide on how
974985to record additional nodes and interpret the resulting node tables.
975986
976987We first set up a simple pedigree simulation. Pedigrees are an
977- interesting starting point because, in contrast to the coalescent models, a single
978- node might be associated with more than one type of event. Conversely,
988+ interesting starting point because, in contrast to the coalescent models, a single
989+ node might be associated with more than one type of event. Conversely,
979990for the continuous-time coalescent model simulated by default by msprime,
980991each event is registered as a new node.
981992
@@ -995,16 +1006,16 @@ pedigree_tables = pb.finalise(sequence_length=5)
9951006draw_pedigree(pedigree_tables.tree_sequence())
9961007```
9971008
998- To specify the particular node types we want to store information on, pass a
999- {class}` .NodeType ` object to the ` additional_nodes ` option of
1000- {func}` .sim_ancestry ` . This class extends {class}`` python:enum.Flag `` .
1001- Each node type in the enumeration is associated with a numerical constant.
1002- Different node types can be combined and compared using
1009+ To specify the particular node types we want to store information on, pass a
1010+ {class}` .NodeType ` object to the ` additional_nodes ` option of
1011+ {func}` .sim_ancestry ` . This class extends {class}`` python:enum.Flag `` .
1012+ Each node type in the enumeration is associated with a numerical constant.
1013+ Different node types can be combined and compared using
10031014[bitwise operations](https://docs.python.org/3/library/stdtypes.html?highlight=bitwise).
1004- At the same time, these constants also function as the flags identifying
1005- the type of each node in the nodes table. Here, we take the bitwise union
1006- of three members of the enumeration:
1007- {class}` msprime.NodeType.RECOMBINANT ` , {class}` msprime.NodeType.PASS_THROUGH `
1015+ At the same time, these constants also function as the flags identifying
1016+ the type of each node in the nodes table. Here, we take the bitwise union
1017+ of three members of the enumeration:
1018+ {class}` msprime.NodeType.RECOMBINANT ` , {class}` msprime.NodeType.PASS_THROUGH `
10081019and {class}` msprime.NodeType.COMMON_ANCESTOR ` .
10091020
10101021``` {code-cell}
@@ -1015,7 +1026,7 @@ ts = msprime.sim_ancestry(
10151026 random_seed=45,
10161027 additional_nodes=(
10171028 msprime.NodeType.RECOMBINANT |
1018- msprime.NodeType.PASS_THROUGH |
1029+ msprime.NodeType.PASS_THROUGH |
10191030 msprime.NodeType.COMMON_ANCESTOR
10201031 ),
10211032 coalescing_segments_only=False
@@ -1032,20 +1043,20 @@ def count_flags(ts):
10321043print(count_flags(ts))
10331044```
10341045The ` count_flags ` function demonstrates the use of
1035- bitwise operations to determine the type of a node. This function iterates
1036- across all node types and checks, for each row in the nodes table, whether
1037- the bitwise intersection of the node type and that row's flags
1038- is different from 0. It returns a ` dict ` containing the counts of all possible
1046+ bitwise operations to determine the type of a node. This function iterates
1047+ across all node types and checks, for each row in the nodes table, whether
1048+ the bitwise intersection of the node type and that row's flags
1049+ is different from 0. It returns a ` dict ` containing the counts of all possible
10391050node types.
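
The counting logic described above can be sketched without msprime. In the snippet below, `DemoNodeType` is a hypothetical stand-in for the real enumeration with made-up numerical values, `count_demo_flags` is an illustrative helper rather than the `count_flags` function from this page, and `demo_flags` plays the role of the flags column of a nodes table.

```python
import enum


class DemoNodeType(enum.Flag):
    # Hypothetical stand-in for msprime.NodeType; the values are made up.
    RECOMBINANT = 1 << 0
    COMMON_ANCESTOR = 1 << 1
    PASS_THROUGH = 1 << 2


def count_demo_flags(flags):
    """For each basic type, count the flag values that have its bit set."""
    return {
        node_type.name: sum(1 for f in flags if (f & node_type.value) != 0)
        for node_type in DemoNodeType
    }


# Made-up flags column: 0 has no bits set, 5 has two bits set, and so on.
demo_flags = [0, 1, 5, 4, 3]
counts = count_demo_flags(demo_flags)
print(counts)
```

Because a single flag value can have several bits set, the counts may sum to more than the number of rows.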
10401051
1041- Note that {ref}` recombination events <sec_additional_nodes_re> ` are
1042- associated with two nodes,
1043- one for each of the two ends created by recombination backwards in time. The
1044- number of recombination events is thus half the number of recombination nodes.
1045- ` PASS_THROUGH ` events occur when the ancestral material of only a single
1046- lineage passes through the genome of an ancestor, making coalescence
1052+ Note that {ref}` recombination events <sec_additional_nodes_re> ` are
1053+ associated with two nodes,
1054+ one for each of the two ends created by recombination backwards in time. The
1055+ number of recombination events is thus half the number of recombination nodes.
1056+ ` PASS_THROUGH ` events occur when the ancestral material of only a single
1057+ lineage passes through the genome of an ancestor, making coalescence
10471058impossible. This event can only be observed in models that track individuals:
1048- {class}`` msprime.DiscreteTimeWrightFisher `` and
1059+ {class}`` msprime.DiscreteTimeWrightFisher `` and
10491060{class}`` msprime.FixedPedigree `` models.
10501061
10511062``` {code-cell}
@@ -1064,48 +1075,48 @@ for node in ts.nodes():
10641075wide_fmt = (800, 250)
10651076ts.draw_svg(style=css_string, size=wide_fmt)
10661077```
1067- Similarly, the bitwise operations logic can be used to color the different nodes
1068- in the tree sequence according to their {class}` .NodeType ` . Node 5 is both
1069- {class}` msprime.NodeType.RECOMBINANT ` and
1070- {class}` msprime.NodeType.PASS_THROUGH ` (green). Nodes 0, 1, 4 are
1071- recombinant nodes (yellow). All other nodes are only associated with a
1072- ` PASS_THROUGH ` event (blue). Note that we do not observe any ` COMMON_ANCESTOR `
1073- events. This means that all yellow nodes are also associated with a
1074- coalescence event (see {ref}` common ancestor events <sec_additional_nodes_ca> ` ).
1075- Note that in order to verify whether a node is ` RECOMBINANT ` **and** ` PASS_THROUGH ` ,
1078+ Similarly, the bitwise operations logic can be used to color the different nodes
1079+ in the tree sequence according to their {class}` .NodeType ` . Node 5 is both
1080+ {class}` msprime.NodeType.RECOMBINANT ` and
1081+ {class}` msprime.NodeType.PASS_THROUGH ` (green). Nodes 0, 1, 4 are
1082+ recombinant nodes (yellow). All other nodes are only associated with a
1083+ ` PASS_THROUGH ` event (blue). Note that we do not observe any ` COMMON_ANCESTOR `
1084+ events. This means that all yellow nodes are also associated with a
1085+ coalescence event (see {ref}` common ancestor events <sec_additional_nodes_ca> ` ).
1086+ Note that in order to verify whether a node is ` RECOMBINANT ` **and** ` PASS_THROUGH ` ,
10761087we use the bitwise **OR** to define its associated ` NodeType ` constant.
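
The difference between testing for both types and testing for either type can be sketched as follows. `DemoNodeType`, `has_all` and `has_any` are hypothetical names with made-up flag values, used here only so that the snippet runs with the standard library alone.

```python
import enum


class DemoNodeType(enum.Flag):
    # Hypothetical stand-in for msprime.NodeType; the values are made up.
    RECOMBINANT = 1 << 0
    PASS_THROUGH = 1 << 2


# The combined constant is built with bitwise OR.
both = DemoNodeType.RECOMBINANT | DemoNodeType.PASS_THROUGH


def has_all(flags, mask):
    """True if every bit in the mask is set in the integer flags."""
    return (flags & mask.value) == mask.value


def has_any(flags, mask):
    """True if at least one bit in the mask is set in the integer flags."""
    return (flags & mask.value) != 0


print(has_all(5, both), has_all(1, both), has_any(1, both))
```

A plain `!= 0` test answers "either", so checking for both types needs the comparison against the combined constant.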
10771088
1078- To summarize: ` additional_nodes ` are specified using bitwise flags. Multiple
1079- basic node types can be combined by taking the bitwise ` OR ` (|) and then
1080- passed to the ` additional_nodes ` option. This enables us to track the specified
1081- nodes across the genealogical history of the sample. By means of bitwise ` AND `
1082- (&) we can then query the nodes table and check the type of each of the recorded
1089+ To summarize: ` additional_nodes ` are specified using bitwise flags. Multiple
1090+ basic node types can be combined by taking the bitwise ` OR ` (|) and then
1091+ passed to the ` additional_nodes ` option. This enables us to track the specified
1092+ nodes across the genealogical history of the sample. By means of bitwise ` AND `
1093+ (&) we can then query the nodes table and check the type of each of the recorded
10831094nodes.
10841095
1085- If we wish to reduce these trees down to the minimal representation, we can use
1086- {meth}` tskit.TreeSequence.simplify ` . The resulting tree sequence will have
1087- all of these unary nodes removed and will be equivalent to (but not identical,
1088- due to stochastic effects) calling {func}` .sim_ancestry ` without the
1096+ If we wish to reduce these trees down to the minimal representation, we can use
1097+ {meth}` tskit.TreeSequence.simplify ` . The resulting tree sequence will have
1098+ all of these unary nodes removed and will be equivalent to (but not identical,
1099+ due to stochastic effects) calling {func}` .sim_ancestry ` without the
10891100` additional_nodes ` or ` coalescing_segments_only ` argument(s).
10901101
10911102
10921103(sec_additional_nodes_cso)=
10931104
10941105### Coalescing segments only
10951106
1096- Note that the ` coalescing_segments_only ` option should be set to ` False ` when
1097- recording additional nodes. Setting ` coalescing_segments_only ` to ` False `
1098- allows us to switch off the default simulator behaviour of only recording the
1099- relationships between overlapping ancestry segments. Instead, you can now
1100- record edges along the full extent of any of the ancestral lineages
1101- that was involved in any of the events as specified by the ` additional_nodes ` flag.
1102- This option can also be set to ` False ` without specifying any additional nodes.
1107+ Note that the ` coalescing_segments_only ` option should be set to ` False ` when
1108+ recording additional nodes. Setting ` coalescing_segments_only ` to ` False `
1109+ allows us to switch off the default simulator behaviour of only recording the
1110+ relationships between overlapping ancestry segments. Instead, you can now
1111+ record edges along the full extent of any of the ancestral lineages
1112+ that was involved in any of the events as specified by the ` additional_nodes ` flag.
1113+ This option can also be set to ` False ` without specifying any additional nodes.
11031114When ` coalescing_segments_only ` is set to ` False ` ,
11041115edges that are a result of a coalescent event will record
11051116the full length of the tracked ancestral material,
11061117not just the overlapping segments of genome (as is the default behavior).
1107- As a result, this option will produce
1108- nodes with only one child (unary nodes) along parts (unary regions) of
1118+ As a result, this option will produce
1119+ nodes with only one child (unary nodes) along parts (unary regions) of
11091120the genome.
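
To make the distinction between overlapping and unary regions concrete, the sketch below splits two half-open ancestral segments into the part where they overlap and the parts covered by only one of them. The helper `split_regions` and the coordinates are hypothetical. The snippet only illustrates the interval arithmetic and does not reproduce how msprime stores edges internally.

```python
def split_regions(seg_m, seg_n):
    """Return (overlap, unary_regions) for two half-open intervals [left, right)."""
    (a, b), (c, d) = seg_m, seg_n
    left, right = max(a, c), min(b, d)
    if left >= right:
        # No overlap: both segments are covered by a single lineage only.
        return None, sorted([seg_m, seg_n])
    # Parts of the combined span that lie outside the overlap.
    candidates = [(min(a, c), left), (right, max(b, d))]
    unary = [(lo, hi) for lo, hi in candidates if lo < hi]
    return (left, right), unary


# Made-up coordinates for two partially overlapping segments.
overlap, unary = split_regions((0, 6), (4, 10))
print(overlap, unary)
```

With these coordinates the two segments overlap on `[4, 6)`, while `[0, 4)` and `[6, 10)` are each carried by one lineage only, which is where unary regions would arise.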
11101121
11111122For instance: suppose a segment ancestral to node ` m ` spans from ` a ` to ` b ` along the genome,
@@ -1122,30 +1133,30 @@ The nodes `m` and `n` coalesce (in `p`) on only the overlapping segment `[c,b)`,
11221133
11231134### Common ancestor events
11241135
1125- We distinguish two types of common ancestor events: coalescence and common
1136+ We distinguish two types of common ancestor events: coalescence and common
11261137ancestor (in the strict sense) events.
11271138
1128- Coalescence events are the default node type (associated with value 0) and are
1129- therefore only implicitly part of the {class}` .NodeType ` enumeration.
1139+ Coalescence events are the default node type (associated with value 0) and are
1140+ therefore only implicitly part of the {class}` .NodeType ` enumeration.
11301141In contrast, whenever two nonoverlapping ancestral segments
11311142are found to have inherited from the same ancestral genome,
1132- there is no coalescence event, and so we register this in the node table as
1143+ there is no coalescence event, and so we register this in the node table as
11331144{class}` msprime.NodeType.COMMON_ANCESTOR ` .
1134- Finally, when keeping track of individuals
1135- ({class}`` msprime.DiscreteTimeWrightFisher `` ,
1136- {class}`` msprime.FixedPedigree `` ), we might observe genomes (ploids) through
1137- which the ancestral material of only a single lineage passes.
1138- Such genomes can be registered in the table as
1145+ Finally, when keeping track of individuals
1146+ ({class}`` msprime.DiscreteTimeWrightFisher `` ,
1147+ {class}`` msprime.FixedPedigree `` ), we might observe genomes (ploids) through
1148+ which the ancestral material of only a single lineage passes.
1149+ Such genomes can be registered in the table as
11391150{class}` msprime.NodeType.PASS_THROUGH ` nodes.
1140- Although this is obviously not a common ancestor event, note that for both
1151+ Although this is obviously not a common ancestor event, note that for both
11411152of these models each node has to carry at least one of these three flags.
11421153
11431154
11441155(sec_additional_nodes_re)=
11451156
11461157### Recombination events
11471158
1148- Additional nodes can also be used to mark recombination events (crossovers) as well
1159+ Additional nodes can also be used to mark recombination events (crossovers) as well
11491160as gene conversion. A separate {class}` msprime.NodeType ` exists for each:
11501161{class}` msprime.NodeType.RECOMBINANT ` and {class}` msprime.NodeType.GENE_CONVERSION ` .
11511162
@@ -1190,7 +1201,7 @@ migrating segment. For more details on migration records, see the
11901201Here, we provide a simple example of the effect of setting ` record_migrations=True `
11911202using the {meth}` stepping stone model<msprime.Demography.stepping_stone_model> `
11921203where migration is permitted between adjacent populations. Additionally, we
1193- set ` additional_nodes=msprime.NodeType.MIGRANT `
1204+ set ` additional_nodes=msprime.NodeType.MIGRANT `
11941205(see {ref}` previous section <sec_ancestry_additional_nodes> ` )
11951206to record nodes corresponding to migration events. This is not necessary, but will be
11961207helpful to visualise the result.
@@ -1349,7 +1360,7 @@ two of these haplotypes (nodes 12 and 13) were in population 1 at this time.
13491360
13501361``` {code-cell}
13511362print(ts.tables.nodes)
1352- census_nodes = np.bitwise_and(ts.nodes_flags, msprime.NodeType.CENSUS.value) > 0
1363+ census_nodes = np.bitwise_and(ts.nodes_flags, msprime.NodeType.CENSUS.value) > 0
13531364print(ts.tables.nodes[census_nodes])
13541365```
13551366
@@ -1417,7 +1428,7 @@ SVG(ts.draw_svg(y_axis=True, time_scale="log_time"))
14171428### Ancestral recombination graph
14181429
14191430This is a legacy option and is identical to setting the ` additional_nodes ` option to store
1420- common ancestor events, recombinants (and migrants if applicable). This is illustrated
1431+ common ancestor events, recombinants (and migrants if applicable). This is illustrated
14211432in the following example:
14221433
14231434``` {code-cell}