9393{class}` .StandardCoalescent `
9494: Coalescent with recombination ("hudson")
9595
96- {class}` .SmcApproxCoalescent `
97- : Sequentially Markov Coalescent ("smc")
98-
99- {class}` .SmcPrimeApproxCoalescent `
100- : SMC'("smc_prime")
96+ {class}` .SmcKApproxCoalescent `
97+ : General Sequentially Markov Coalescent
10198
10299{class}` .DiscreteTimeWrightFisher `
103100: Generation-by-generation Wright-Fisher
@@ -1975,15 +1972,16 @@ ancestry model. By default, we run simulations under the
19751972{class}` .StandardCoalescent ` model. If we wish to run
19761973under a different model, we use the `` model `` argument to
19771974{func}` .sim_ancestry ` . For example, here we use the
1978- {class}` SMC<.SmcApproxCoalescent >` model instead of the
1975+ {class}` dtwf<.DiscreteTimeWrightFisher >` model instead of the
19791976standard coalescent:
19801977
19811978``` {code-cell}
19821979ts1 = msprime.sim_ancestry(
19831980 10,
19841981 sequence_length=10,
1982+ population_size=100,
19851983 recombination_rate=0.1,
1986- model=msprime.SmcApproxCoalescent (),
1984+ model=msprime.DiscreteTimeWrightFisher (),
19871985 random_seed=1234)
19881986```
19891987
@@ -1996,7 +1994,8 @@ ts2 = msprime.sim_ancestry(
19961994 10,
19971995 sequence_length=10,
19981996 recombination_rate=0.1,
1999- model="smc",
1997+ population_size=100,
1998+ model="dtwf",
20001999 random_seed=1234)
20012000assert ts1.equals(ts2, ignore_provenance=True)
20022001```
@@ -2231,21 +2230,45 @@ in units of 4N generations.
22312230
22322231### SMC approximations
22332232
2234- The {class} ` SMC <.SmcApproxCoalescent> ` and {class} ` SMC' <.SmcPrimeApproxCoalescent> `
2233+ The ** SMC** and ** SMC′ **
22352234are approximations of the continuous time
22362235{ref}` Hudson coalescent<sec_ancestry_models_hudson> ` model. These were originally
22372236motivated largely by the need to simulate coalescent processes more efficiently
22382237than was possible using the software available at the time; however,
22392238[ improved algorithms] ( https://doi.org/10.1371/journal.pcbi.1004842 )
2240- mean that such approximations are now mostly unnecessary for simulations.
2241-
2242- The SMC and SMC' are however very important for inference, as the approximations
2243- have made many analytical advances possible.
2244-
2245- Since the SMC approximations are not required for simulation efficiency, these
2246- models are implemented using a naive rejection sampling approach in msprime.
2247- The implementation is intended to facilitate the study of the
2248- SMC approximations, rather than to be used in a general-purpose way.
2239+ mean that such approximations are now unnecessary for many simulations.
2240+
2241+ The ** SMC** and ** SMC′** are, however, very important for inference, as the approximations
2242+ have made many analytical advances possible. Moreover, these approximations let us
2243+ simulate regimes that we could not simulate otherwise: for example,
2244+ ** Drosophila** and ** Drosophila-like** simulations with very high scaled recombination rates.
2245+
2246+
2247+ The {class}` SMC(k) <.SmcKApproxCoalescent> ` model is a general simulation model that can simulate various ** SMC** approximations
2248+ (e.g., ** SMC** and ** SMC′** ). It accepts a ``` hull_offset ``` parameter, which defines the extent of
2249+ ** SMC** approximations in the simulation. The ``` hull_offset ``` represents the maximum allowed
2250+ distance between two genomic segments that can share a common ancestor. Setting the
2251+ ``` hull_offset ``` to ** 0** means only overlapping genomic segments can share a common ancestor,
2252+ corresponding to the backward-in-time definition of the ** SMC** model. Similarly, setting
2253+ the ``` hull_offset ``` to ** 1** allows adjacent genomic segments, as well as overlapping ones, to
2254+ share a common ancestor, which defines the ** SMC′** model. Simulating under the Hudson
2255+ coalescent model is equivalent to setting the ``` hull_offset ``` to the sequence length. The
2256+ ``` hull_offset ``` can take any value between ** 0** and the sequence length.
2257+
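The rule described above can be illustrated with a small standalone sketch. It does not call msprime and is not its implementation: the helper `hulls_overlap` and the example intervals are hypothetical, and the sketch assumes each lineage is summarised by a half-open interval `(left, right)` whose right end is extended by `hull_offset` before testing for intersection.

```python
def hulls_overlap(seg_a, seg_b, hull_offset):
    """Return True if two lineages may share a common ancestor.

    Each segment is a half-open interval (left, right). Assumption: a
    lineage's "hull" is its interval with the right end extended by
    hull_offset, and two lineages may coalesce only if their hulls
    intersect.
    """
    (left_a, right_a), (left_b, right_b) = seg_a, seg_b
    return left_a < right_b + hull_offset and left_b < right_a + hull_offset


sequence_length = 10
overlapping = ((0, 5), (3, 8))   # share positions 3 and 4
adjacent = ((0, 5), (5, 8))      # touch at position 5, no shared positions
distant = ((0, 2), (7, 10))      # separated by a gap

# hull_offset=0 mimics SMC, 1 mimics SMC', sequence_length mimics Hudson.
for k in (0, 1, sequence_length):
    results = [hulls_overlap(a, b, k) for a, b in (overlapping, adjacent, distant)]
    print(k, results)
```

Under these assumptions, only the overlapping pair can coalesce with an offset of 0, the adjacent pair is also allowed with an offset of 1, and every pair is allowed once the offset reaches the sequence length.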
2258+ In this example, we use the {class}` SMC(k) <.SmcKApproxCoalescent> ` model to run ** SMC′**
2259+ simulations:
2260+ ``` {code-cell}
2261+ ts = msprime.sim_ancestry(4, population_size=10,
2262+ model=msprime.SmcKApproxCoalescent(hull_offset=1),
2263+ random_seed=1)
2264+ SVG(ts.draw_svg(y_axis=True, time_scale="log_time"))
2265+ ```
2266+ :::{Note}
2267+ Since the ** SMC** models are approximations of the {ref}` Hudson coalescent<sec_ancestry_models_hudson> ` ,
2268+ and since the {ref}` Hudson coalescent<sec_ancestry_models_hudson> ` model is well optimised for
2269+ regimes with moderate scaled recombination rates (including full human chromosome simulations),
2270+ we recommend using the {ref}` Hudson coalescent<sec_ancestry_models_hudson> ` whenever possible.
2271+ :::
22492272
22502273(sec_ancestry_models_dtwf)=
22512274