
@petrelharp
Contributor

@petrelharp petrelharp commented Jan 21, 2026

So far, this just takes the code from @jeffspence's master branch, which he and @roshnipatel put together, and opens a PR so we can write out a plan and discuss things.

Proposed rough API:

# set up traits
traits = stdpopsim.traits.Trait(
    [(name, link_function), (name, link_function), ... ],
    default_environmental_sigma = 0,  # if no environments are added, they are all Gaussian with this sigma
    default_fitness_sigma = float("inf"),  # similarly, the default fitness function is Gaussian with this sigma
)
traits.add_environment( ... ) # optionally
traits.add_fitness_function(...) # optionally
# maybe add more of each

# set up the "effect size distribution", which is like a DFE:
esd = stdpopsim.traits.effect_size_distribution(  # maybe? they will be (mostly?) generic
   mutation_types = [
     (multivariate_mutation_type, traits, proportion) # list of tuples of this form
   ]
)
contig.add_effect_size_distribution(esd, ...) # like add_dfe

# use them
engine.simulate(contig, traits)

Note: we could alternatively instantiate Environment and FitnessFunctions to pass into the Trait constructor,
but these have to refer to "which traits do I work with", so it's cleaner to have the traits set up
before we make them.

Questions:

  • It'd be nice for traits to have names, maybe? So perhaps the first argument to make_traits( ) should be a dictionary, whose keys are the names of traits and values are the link functions?

Notes:

  • terminology: "trait" is the big-picture collection of things; "phenotype" is the actual measured value (or maybe "trait value"?)
  • we can do "direct multiplicative effects on fitness" by doing an additive trait with fitness function exp( )
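A small numerical check of the note on multiplicative fitness effects, as a sketch only: the per-mutation multipliers below are made-up values, and none of the names are part of the proposed stdpopsim API. Putting each multiplier on the log scale makes the effects additive on the trait, and applying exp( ) to the summed trait recovers the product.

```python
import math

# Hypothetical per-mutation multiplicative fitness effects carried by one individual.
fitness_multipliers = [1.02, 0.97, 1.10]

# On the log scale the effects add up, so they can live on an additive trait.
additive_effects = [math.log(w) for w in fitness_multipliers]
trait_value = sum(additive_effects)

# Fitness function exp( ) applied to the additive trait value.
fitness_from_trait = math.exp(trait_value)

# Direct multiplicative calculation, for comparison.
fitness_direct = math.prod(fitness_multipliers)

print(math.isclose(fitness_from_trait, fitness_direct))
```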

Traits:

Has

  • a list of, uh, traits, each of which has a link function
  • a list of Environments (each of which can affect a collection of the traits)
  • a list of FitnessFunctions (each of which operates on a collection of the traits)

EffectSizeDistribution

Maps mutations to traits, so has

  • a list of MultivariateMutationTypes
  • and corresponding "which traits does this apply to" and "what proportion of mutations"

MultivariateMutationType

Is a (biological?) class of mutations, and so:

  • any direct fitness effects: effect size distribution and dominance
  • effects on trait(s):
    • (multivariate) distribution type, and arguments to this
    • 'scaling' distribution type, and arguments (the scaling draw multiplies the multivariate draw)

FitnessFunction

Possibly not public?

Has:

  • trait type: quantitative, binary, fitness, or neutral
    • this is just basically "which fitness function"
  • when and where this applies

Environment

This affects how the genetic value maps to phenotype,
including the part that changes with time/space.
Environments can affect more than one trait (so you have correlated noise).
Possibly not public?

Adds "noise" to the genetic value. Has:

  • distribution and parameters
  • when and where this applies
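One way to picture an Environment that adds correlated noise to several traits, as a rough sketch: `apply_environment` and its `shared_fraction` parameter are hypothetical and not part of the proposal. Each trait's deviate mixes one shared standard normal draw with a trait-specific draw, so the noise has standard deviation `sigma` and the correlation between traits is `shared_fraction`.

```python
import math
import random

def apply_environment(genetic_values, sigma, shared_fraction=0.0, rng=random):
    """Add Gaussian environmental noise to each trait's genetic value.

    shared_fraction (hypothetical) is the correlation of the noise between traits.
    """
    if sigma < 0 or not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("need sigma >= 0 and shared_fraction in [0, 1]")
    shared = rng.gauss(0.0, 1.0)  # one draw shared by all traits
    phenotypes = []
    for g in genetic_values:
        own = rng.gauss(0.0, 1.0)  # trait-specific draw
        deviate = (math.sqrt(shared_fraction) * shared
                   + math.sqrt(1.0 - shared_fraction) * own)
        phenotypes.append(g + sigma * deviate)
    return phenotypes

# sigma = 0 leaves the genetic values unchanged, which is how a default
# environmental sigma of 0 would behave under this sketch.
print(apply_environment([1.0, 2.0], sigma=0.0))
```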

Link Function

This produces the observed biological phenotype:
for instance, describes how disease liability translates to disease incidence,
so it should not change with time or location.
Link functions map only a single trait, independently of others.

GeneticValueTransform? PhenotypeTransform? Well, this applies to "genetic value plus environment" so it's not a "genetic value transform"? Something else?

This transforms the underlying value (genetic value plus environment) to the "observed trait". Has:

  • a function
  • maybe some randomness (but NOT what could be put in with environment)

Examples:

  • identity
  • threshold
  • liability: trait equals 1 with probability given by the inverse logit (logistic function) of the value
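The three example link functions could look roughly like this. It is a sketch with hypothetical function names, not the stdpopsim API, and it reads the liability example as using the logistic (inverse-logit) function of the value as the probability, which is an assumption about the intent.

```python
import math
import random

def identity_link(value):
    """Observed trait is the underlying value itself."""
    return value

def threshold_link(value, threshold=0.0):
    """Observed trait is 1 when the underlying value exceeds the threshold."""
    return 1 if value > threshold else 0

def inverse_logit(value):
    """Numerically stable logistic function mapping a real number into (0, 1)."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    e = math.exp(value)
    return e / (1.0 + e)

def liability_link(value, rng=random):
    """Observed trait is 1 with probability inverse_logit(value)."""
    return 1 if rng.random() < inverse_logit(value) else 0
```

The randomness in `liability_link` belongs to the link itself and is separate from any noise an Environment adds.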

List of standard use cases:

We should make it very easy to produce one of each of these,
something like stdpopsim.traits.StabilizingSelectionOnTrait(n, sigma)

Stabilizing selection on quantitative trait(s)

Directional selection on a quantitative trait

Truncation selection on a binary trait

Direct effects on fitness (previous behavior)

Neutral

@gregorgorjanc
Contributor

gregorgorjanc commented Jan 21, 2026

WIP ...

Some thoughts on the lower part of #1792 (comment), as in these are some standard use cases and an attempt to produce each of these with ease.

While thinking about these, it struck me that the TraitsModel specifies properties of traits, while the "use cases" below are about how selection operates on them, which is another "thing". Do we need to think about how to encode Selection? When thinking about Fitness this comes up naturally, but we might also want to do selection on non-fitness traits. Will the Fitness component of the TraitsModel capture all of these different types of selection on all the traits that the user would like to select on?

Stabilizing selection

stdpopsim.traits.StabilizingSelection(trait, sigma)
# trait: trait name or index
# sigma: scale of the Gaussian fitness function
  • this will call the fitness function
  • what else?
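For reference, a Gaussian fitness function with scale sigma might be sketched as below. The function name and the `optimum` argument are assumptions, since the signature above only has trait and sigma.

```python
import math

def gaussian_fitness(phenotype, sigma, optimum=0.0):
    """Gaussian stabilizing selection: fitness falls off with squared distance from the optimum."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # With sigma = float("inf") the exponent is zero, so fitness is 1 for every phenotype.
    return math.exp(-((phenotype - optimum) ** 2) / (2.0 * sigma ** 2))

print(gaussian_fitness(1.0, sigma=1.0))
print(gaussian_fitness(5.0, sigma=float("inf")))
```

An infinite sigma makes the trait neutral under this form, which fits the idea of an infinite default fitness sigma in the proposed API.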

Directional selection

stdpopsim.traits.DirectionalSelection(trait, ???, direction = "high", )
# trait: trait name or index
# ??? TODO: Peter mentioned an exp(x) type of thing, so we increase the probability of selection for individuals with high/low trait values - do we need a parameter beta to do something like exp(\beta x)?
# direction: "high" or "low"

TODO: Discuss how to handle multiple traits here.
We could add a linear combination argument (could also be non-linear function!?),
possibly trait name/index could take in a function too?
This function would combine multiple traits into a single score, then do the exp(x)
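A sketch of the exp(\beta x) idea with a linear combination of traits. All names here (`combined_score`, `directional_fitness`, `weights`, `beta`) are hypothetical, and using a sign flip for direction = "low" is one possible choice rather than a settled design.

```python
import math

def combined_score(phenotypes, weights):
    """Linear combination of several trait values into a single score."""
    return sum(w * z for w, z in zip(weights, phenotypes))

def directional_fitness(phenotypes, weights, beta, direction="high"):
    """Relative fitness exp(+/- beta * score); beta sets the strength of selection."""
    if direction not in ("high", "low"):
        raise ValueError('direction must be "high" or "low"')
    sign = 1.0 if direction == "high" else -1.0
    return math.exp(sign * beta * combined_score(phenotypes, weights))

# Two traits weighted equally; a higher score gives higher fitness.
print(directional_fitness([1.0, 2.0], weights=[0.5, 0.5], beta=2.0))
```

A non-linear combination would only require swapping out `combined_score`.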

Truncation selection

stdpopsim.traits.TruncationSelection(trait, proportion_selected, direction = "high")
# trait: trait name or index
# proportion_selected: proportion of individuals selected each generation, in [0, 1]
# direction: "high" or "low"

TODO: Discuss how to handle multiple traits here - the above assumes a single trait.
We cannot have a single proportion_selected across multiple traits.
We could add a linear combination argument (could also be non-linear function!?),
possibly trait name/index could take in a function too?
This function would combine multiple traits into a single score,
then we take the top/bottom prop of individuals based on the score.
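Taking the top/bottom proportion of individuals by score could be sketched as follows. `truncation_select` is a hypothetical helper, and rounding the number selected with `round` is an arbitrary choice made for this sketch.

```python
def truncation_select(scores, proportion_selected, direction="high"):
    """Return sorted indices of the individuals kept under truncation selection."""
    if not 0.0 <= proportion_selected <= 1.0:
        raise ValueError("proportion_selected must be in [0, 1]")
    if direction not in ("high", "low"):
        raise ValueError('direction must be "high" or "low"')
    n_selected = round(proportion_selected * len(scores))
    # Rank individuals by score, best first for the requested direction.
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=(direction == "high"))
    return sorted(order[:n_selected])

scores = [0.1, 2.0, -1.0, 1.5]
print(truncation_select(scores, 0.5))
print(truncation_select(scores, 0.5, direction="low"))
```

With several traits, `scores` would be the output of whatever function combines them into one score per individual.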

Truncation selection (on a binary trait)

The same as directional selection above, but applied to 0/1 scores?
Say, we have proportion_selected and if Pr(1) is lower than that, we select all 1s,
and then fill the rest with 0s picked at random?
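The "select all 1s, then fill with random 0s" rule could be sketched like this. The function name is hypothetical, and the branch for when there are more 1s than slots (take a random subset of the 1s) is an assumption the text above does not cover.

```python
import random

def binary_truncation_select(binary_phenotypes, proportion_selected, rng=random):
    """Return sorted indices selected from 0/1 phenotypes, preferring the 1s."""
    if not 0.0 <= proportion_selected <= 1.0:
        raise ValueError("proportion_selected must be in [0, 1]")
    n_selected = round(proportion_selected * len(binary_phenotypes))
    ones = [i for i, z in enumerate(binary_phenotypes) if z == 1]
    zeros = [i for i, z in enumerate(binary_phenotypes) if z == 0]
    if len(ones) >= n_selected:
        # Assumption: with more 1s than slots, keep a random subset of the 1s.
        chosen = rng.sample(ones, n_selected)
    else:
        # Keep every 1, then fill the remaining slots with 0s picked at random.
        chosen = ones + rng.sample(zeros, n_selected - len(ones))
    return sorted(chosen)

print(binary_truncation_select([1, 0, 0, 1, 0, 0], 0.5, random.Random(0)))
```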

Direct effects on fitness (previous behavior)

TODO: This is just a special case of directional selection on a trait,
but the trait is fitness. Where is it stored? How to refer to it?

Neutral

Do nothing?

@petrelharp
Contributor Author

Outline:
board1
board2
