Skip to content

Commit 25f0db1

Browse files
committed
multi-chrom revisions and other fixes for the calc...() popgen suite
1 parent 0ab81aa commit 25f0db1

File tree

9 files changed

+436
-168
lines changed

9 files changed

+436
-168
lines changed

QtSLiM/help/SLiMHelpFunctions.html

Lines changed: 12 additions & 10 deletions
Large diffs are not rendered by default.

SLiMgui/SLiMHelpFunctions.rtf

Lines changed: 68 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -1136,6 +1136,7 @@ If
11361136
\f1\fs18 initializeChromosome()
11371137
\f2\fs20 , allowing a different mutation run count to be specified for each chromosome in multi-chromosome models.\expnd0\expndtw0\kerning0
11381138
\
1139+
\pard\pardeftab720\li547\ri720\sb60\sa60\partightenfactor0
11391140
\cf0 \kerning1\expnd0\expndtw0 If
11401141
\f1\fs18 preventIncidentalSelfing
11411142
\f2\fs20 is
@@ -1202,6 +1203,7 @@ If
12021203
\f2\fs20 , the order of individuals returned will be non-random (regardless of the setting of this option); you should use
12031204
\f1\fs18 sample()
12041205
\f2\fs20 to shuffle the order of the individuals vector if necessary to avoid order-dependency issues in your script.\
1206+
\pard\pardeftab720\li547\ri720\sb60\sa60\partightenfactor0
12051207
\cf0 This function will likely be extended with further options in the future, added on to the end of the argument list. Using named arguments with this call is recommended for readability. Note that turning on optional features may increase the runtime and memory footprint of SLiM.\
12061208
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
12071209

@@ -1978,6 +1980,19 @@ The code for
19781980
\f2\fs20 are first averaged across all specified mutations prior to taking the ratio of the two. This ratio of averages is less biased than the average of ratios, and and is generally considered to be best practice (see, e.g., Bhatia et al., 2013). This means that the behavior of
19791981
\f1\fs18 calcFST()
19801982
\f2\fs20 differs between SLiM 3 and SLiM 4.\
1983+
As can be seen from its equation, the
1984+
\f3\i F
1985+
\f2\i0\fs13\fsmilli6667 \sub ST
1986+
\fs20 \nosupersub is undefined if
1987+
\f3\i H
1988+
\fs13\fsmilli6667 \sub T
1989+
\f2\i0\fs20 \nosupersub is zero, which occurs if no mutations are present in the haplosomes provided (given the optionally specified window and set of mutations). In that case,
1990+
\f1\fs18 calcFST()
1991+
\f2\fs20 will return
1992+
\f1\fs18 NAN
1993+
\f2\fs20 . It is up to the caller to detect this with
1994+
\f1\fs18 isNAN()
1995+
\f2\fs20 and handle it as necessary.\
19811996
The implementation of
19821997
\f1\fs18 calcFST()
19831998
\f2\fs20 , viewable with
@@ -1998,7 +2013,7 @@ The implementation of
19982013
\f1\fs18 \cf2 (float$)calcHeterozygosity(object<Haplosome>\'a0haplosomes, [No<Mutation>\'a0muts\'a0=\'a0NULL], [Ni$\'a0start\'a0=\'a0NULL], [Ni$\'a0end\'a0=\'a0NULL])\
19992014
\pard\pardeftab397\li547\ri720\sb60\sa60\partightenfactor0
20002015

2001-
\f2\fs20 \cf2 Calculates the heterozygosity for a vector of haplosomes, based upon the frequencies of mutations in the haplosomes. The result is the
2016+
\f2\fs20 \cf2 Calculates the heterozygosity for a vector of haplosomes (containing at least one element), based upon the frequencies of mutations in the haplosomes. The result is the
20022017
\f3\i expected
20032018
\f2\i0 heterozygosity, for the individuals to which the haplosomes belong, assuming that they are under Hardy-Weinberg equilibrium; this can be compared to the
20042019
\f3\i observed
@@ -2040,16 +2055,20 @@ The implementation of
20402055
\f2\fs20 for further discussion.\
20412056
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
20422057

2043-
\f1\fs18 \cf2 (float$)calcInbreedingLoad(object<Haplosome>\'a0haplosomes, [No<MutationType>$\'a0mutType\'a0=\'a0NULL])\
2058+
\f1\fs18 \cf2 (float$)calcInbreedingLoad(object<Haplosome>\'a0haplosomes, [Nio<MutationType>$\'a0mutType\'a0=\'a0NULL])\
20442059
\pard\pardeftab397\li547\ri720\sb60\sa60\partightenfactor0
20452060

20462061
\f2\fs20 \cf2 Calculates inbreeding load (the haploid number of lethal equivalents, or
20472062
\f3\i B
2048-
\f2\i0 ) for a vector of haplosomes passed in
2063+
\f2\i0 ) for a vector of haplosomes (containing at least one element) passed in
20492064
\f1\fs18 haplosomes
20502065
\f2\fs20 . The calculation can be limited to a focal mutation type passed in
20512066
\f1\fs18 mutType
2052-
\f2\fs20 ; if
2067+
\f2\fs20 (which may be either an
2068+
\f1\fs18 integer
2069+
\f2\fs20 representing the ID of the desired mutation type, or a
2070+
\f1\fs18 MutationType
2071+
\f2\fs20 object specified directly); if
20532072
\f1\fs18 mutType
20542073
\f2\fs20 is
20552074
\f1\fs18 NULL
@@ -2079,7 +2098,9 @@ The inbreeding load is a measure of the quantity of recessive deleterious variat
20792098
\f3\i s
20802099
\f2\i0 is the absolute value of the selection coefficient, and
20812100
\f3\i h
2082-
\f2\i0 is its dominance coefficient. Note that the implementation sets a maximum |
2101+
\f2\i0 is its dominance coefficient. Note that the implementation, viewable with
2102+
\f1\fs18 functionSource()
2103+
\f2\fs20 , sets a maximum |
20832104
\f3\i s
20842105
\f2\i0 | of
20852106
\f1\fs18 1.0
@@ -2150,7 +2171,7 @@ The implementation
21502171

21512172
\f2\fs20 \cf2 Calculates
21522173
\f7\i \uc0\u960
2153-
\f2\i0 (a metric of genetic diversity based on pairwise sequence differences) for a vector of haplosomes, based upon the mutations in the haplosomes. The mathematical formulation (as an estimator of the population parameter
2174+
\f2\i0 (a metric of genetic diversity based on pairwise sequence differences) for a vector of haplosomes (containing at least two elements), based upon the mutations in the haplosomes. The mathematical formulation (as an estimator of the population parameter
21542175
\f7\i \uc0\u952
21552176
\f2\i0 ) is based on work in Nei and Li (1979), Nei and Tajima (1981), and Tajima (1983; equation A3). The exact formula used here is common in textbooks (e.g., equation 3.3 in Hahn 2018, or equation 2.2 in Coop 2020). This value is averaged by the number of sites.\
21562177
Often
@@ -2191,7 +2212,7 @@ The implementation of
21912212

21922213
\f2\fs20 \cf2 Calculates Tajima\'92s
21932214
\f3\i D
2194-
\f2\i0 (a test of neutrality based on the allele frequency spectrum) for a vector of haplosomes, based upon the mutations in the haplosomes. The mathematical formulation is given in Tajima 1989 (equation 38) and remains unchanged (e.g., equations 2.30 in Durrett 2008, 8.4 in Hahn 2018, and 4.44 in Coop 2020). Often
2215+
\f2\i0 (a test of neutrality based on the allele frequency spectrum) for a vector of haplosomes (containing at least four elements), based upon the mutations in the haplosomes. The mathematical formulation is given in Tajima 1989 (equation 38) and remains unchanged (e.g., equations 2.30 in Durrett 2008, 8.4 in Hahn 2018, and 4.44 in Coop 2020). Often
21952216
\f1\fs18 haplosomes
21962217
\f2\fs20 will be all of the haplosomes in a subpopulation, or in the entire population, but any haplosome vector may be used. By default, with
21972218
\f1\fs18 muts=NULL
@@ -2211,6 +2232,13 @@ The calculation can be narrowed to apply to only a window \'96 a subrange of the
22112232
\f2\fs20 , provides the haplosome-wide Tajima\'92s
22122233
\f3\i D
22132234
\f2\i0 .\
2235+
If the genetic diversity contained within the haplosomes is insufficient for the calculation,
2236+
\f1\fs18 calcTajimasD()
2237+
\f2\fs20 may return
2238+
\f1\fs18 NAN
2239+
\f2\fs20 . It is up to the caller to detect this with
2240+
\f1\fs18 isNAN()
2241+
\f2\fs20 and handle it as necessary.\
22142242
The implementation of
22152243
\f1\fs18 calcTajimasD()
22162244
\f2\fs20 , viewable with
@@ -2228,45 +2256,13 @@ The implementation of
22282256
\f2\fs20 for further discussion. This function was written by Nick Bailey (currently affiliated with CNRS and the Laboratory of Biometry and Evolutionary Biology at University Lyon 1), with helpful input from Peter Ralph.\
22292257
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
22302258

2231-
\f1\fs18 \cf2 (float$)calcWattersonsTheta(object<Haplosome>\'a0haplosomes, [No<Mutation>\'a0muts\'a0=\'a0NULL], [Ni$\'a0start\'a0=\'a0NULL], [Ni$\'a0end\'a0=\'a0NULL])\
2232-
\pard\pardeftab397\li547\ri720\sb60\sa60\partightenfactor0
2233-
2234-
\f2\fs20 \cf2 Calculates Watterson\'92s theta (a metric of genetic diversity comparable to heterozygosity) for a vector of haplosomes, based upon the mutations in the haplosomes. Often
2235-
\f1\fs18 haplosomes
2236-
\f2\fs20 will be all of the haplosomes in a subpopulation, or in the entire population, but any haplosome vector may be used. By default, with
2237-
\f1\fs18 muts=NULL
2238-
\f2\fs20 , the calculation is based upon all mutations in the simulation; the calculation can instead be based upon a subset of mutations, such as mutations of a specific mutation type, by passing the desired vector of mutations for
2239-
\f1\fs18 muts
2240-
\f2\fs20 .\
2241-
The calculation can be narrowed to apply to only a window \'96 a subrange of the full chromosome \'96 by passing the interval bounds [
2242-
\f1\fs18 start
2243-
\f2\fs20 ,
2244-
\f1\fs18 end
2245-
\f2\fs20 ] for the desired window. In this case, the vector of mutations used for the calculation will be subset to include only mutations within the specified window. The default behavior, with
2246-
\f1\fs18 start
2247-
\f2\fs20 and
2248-
\f1\fs18 end
2249-
\f2\fs20 of
2250-
\f1\fs18 NULL
2251-
\f2\fs20 , provides the haplosome-wide Watterson\'92s theta.\
2252-
The implementation of
2253-
\f1\fs18 calcWattersonsTheta()
2254-
\f2\fs20 , viewable with
2255-
\f1\fs18 functionSource()
2256-
\f2\fs20 , treats every mutation as independent in the heterozygosity calculations. One could regard this choice as embodying an infinite-sites interpretation of the segregating mutations, as with
2257-
\f1\fs18 calcHeterozygosity()
2258-
\f2\fs20 . In most biologically realistic models, such genetic states will be quite rare, and so the impact of this assumption will be negligible; however, in some models this distinction may be important. See
2259-
\f1\fs18 calcPairHeterozygosity()
2260-
\f2\fs20 for further discussion.\
2261-
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
2262-
22632259
\f1\fs18 \cf2 (float$)calcVA(object<Individual>\'a0individuals, io<MutationType>$\'a0mutType)\
22642260
\pard\pardeftab397\li547\ri720\sb60\sa60\partightenfactor0
22652261

22662262
\f2\fs20 \cf2 Calculates
22672263
\f3\i V
22682264
\f2\i0\fs13\fsmilli6667 \sub A
2269-
\fs20 \nosupersub , the additive genetic variance, among a vector
2265+
\fs20 \nosupersub , the additive genetic variance, among a vector of individuals (containing at least two elements) passed in
22702266
\f1\fs18 individuals
22712267
\f2\fs20 , in a particular mutation type
22722268
\f1\fs18 mutType
@@ -2290,6 +2286,38 @@ This function assumes that mutations of type
22902286
\f2\fs20 ), a new user-defined function following the pattern of
22912287
\f1\fs18 calcVA()
22922288
\f2\fs20 can easily be written.\
2289+
\pard\pardeftab720\li720\fi-446\ri720\sb180\sa60\partightenfactor0
2290+
2291+
\f1\fs18 \cf2 (float$)calcWattersonsTheta(object<Haplosome>\'a0haplosomes, [No<Mutation>\'a0muts\'a0=\'a0NULL], [Ni$\'a0start\'a0=\'a0NULL], [Ni$\'a0end\'a0=\'a0NULL])\
2292+
\pard\pardeftab397\li547\ri720\sb60\sa60\partightenfactor0
2293+
2294+
\f2\fs20 \cf2 Calculates Watterson\'92s theta (a metric of genetic diversity comparable to heterozygosity) for a vector of haplosomes (containing at least one element), based upon the mutations in the haplosomes. Often
2295+
\f1\fs18 haplosomes
2296+
\f2\fs20 will be all of the haplosomes in a subpopulation, or in the entire population, but any haplosome vector may be used. By default, with
2297+
\f1\fs18 muts=NULL
2298+
\f2\fs20 , the calculation is based upon all mutations in the simulation; the calculation can instead be based upon a subset of mutations, such as mutations of a specific mutation type, by passing the desired vector of mutations for
2299+
\f1\fs18 muts
2300+
\f2\fs20 .\
2301+
The calculation can be narrowed to apply to only a window \'96 a subrange of the full chromosome \'96 by passing the interval bounds [
2302+
\f1\fs18 start
2303+
\f2\fs20 ,
2304+
\f1\fs18 end
2305+
\f2\fs20 ] for the desired window. In this case, the vector of mutations used for the calculation will be subset to include only mutations within the specified window. The default behavior, with
2306+
\f1\fs18 start
2307+
\f2\fs20 and
2308+
\f1\fs18 end
2309+
\f2\fs20 of
2310+
\f1\fs18 NULL
2311+
\f2\fs20 , provides the haplosome-wide Watterson\'92s theta.\
2312+
The implementation of
2313+
\f1\fs18 calcWattersonsTheta()
2314+
\f2\fs20 , viewable with
2315+
\f1\fs18 functionSource()
2316+
\f2\fs20 , treats every mutation as independent in the heterozygosity calculations. One could regard this choice as embodying an infinite-sites interpretation of the segregating mutations, as with
2317+
\f1\fs18 calcHeterozygosity()
2318+
\f2\fs20 . In most biologically realistic models, such genetic states will be quite rare, and so the impact of this assumption will be negligible; however, in some models this distinction may be important. See
2319+
\f1\fs18 calcPairHeterozygosity()
2320+
\f2\fs20 for further discussion.\
22932321
\pard\pardeftab397\ri720\sb360\sa60\partightenfactor0
22942322

22952323
\f0\b\fs22 \cf0 3.4. Other utilities

VERSIONS

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -104,11 +104,11 @@ development head (in the master branch):
104104
fix a bug in treeSeqMetadata(); if no metadata was found, it would return object<Dictionary>(0), not an empty Dictionary object (and it leaked)
105105
shift to returning a Chromosome object from initializeChromosome() and allowing the user to retain it with defineConstant()
106106
renamed recipe 15.10 to "Recording a pedigree and following a pedigree in a nonWF model" so it is clearer what it does
107-
update calcHeterozygosity() for multi-chromosome support (but it assesses heterozygosity for only a single chromosome, for now)
108107
new PAR (pseudo-autosomal region) recipe for section 15.23, replacing the old recipe in section 14.5 (which will be moved to SLiM-Extras for posterity)
109108
new recipe 9.11 (Ne estimation) with a new multichrom-aware version of function estimateNe_Heterozygosity() that requires the chromosome to be supplied
110109
recombination() callbacks can now be optionally declared with a chromosome specifier (id or symbol), to apply to just one chromosome
111110
registerRecombinationCallback() now takes an optional chromosome parameter to specify that focal chromosome
111+
fix up the calc...() functions for multiple chromosomes, and to be robust to the case of zero mutations (existing at all, or within the supplied haplosomes), and to improve error-checking and documentation
112112

113113

114114
version 4.3 (Eidos version 3.3):

core/community_eidos.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -792,7 +792,7 @@ EidosValue_SP Community::ExecuteMethod_mutationTypesWithIDs(EidosGlobalStringID
792792
MutationType *object = MutationTypeWithID(id);
793793

794794
if (!object)
795-
EIDOS_TERMINATION << "ERROR (Community::ExecuteMethod_mutationTypesWithIDs): mutationTypesWithIDs() did not find a genomic element type with id " << id << "." << EidosTerminate();
795+
EIDOS_TERMINATION << "ERROR (Community::ExecuteMethod_mutationTypesWithIDs): mutationTypesWithIDs() did not find a mutation type with id " << id << "." << EidosTerminate();
796796

797797
vec->set_object_element_no_check_NORR(object, id_index);
798798
}

0 commit comments

Comments
 (0)