Commit 25a26a7

Mark migrations as legacy, and clarify backward-time definition (#3348)
Also fixes #1157
1 parent 28c9e48 commit 25a26a7

1 file changed

+34
-8
lines changed

docs/data-model.md

Lines changed: 34 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -295,15 +295,36 @@ required for a valid set of mutations.
295295

296296
#### Migration Table
297297

298+
:::{note}
299+
Encoding migration in the migrations table is a legacy approach
300+
associated with older versions of `msprime`; recording movement between
301+
populations in the migration table is entirely optional, even when related
302+
nodes are assigned to different populations.
303+
:::
304+
305+
:::{warning}
306+
The migration table may be entirely removed from the `tskit` data model
307+
in the future. Meanwhile, a number of `tskit` functions, such as
308+
{meth}`~TreeSequence.simplify()` will raise an error if data exists in
309+
the migrations table.
310+
:::
311+
312+
:::{seealso}
313+
The {ref}`msprime:sec_ancestry_record_migrations`
314+
section and the associated discussion of
315+
{ref}`msprime:sec_demography_migration` in the `msprime` documentation.
316+
:::
317+
298318
In simulations, trees can be thought of as spread across space, and it is
299319
helpful for inferring demographic history to record this history.
300-
Migrations are performed by individual ancestors, but most likely not by an
320+
Migrations are performed by individual ancestors, but might not be tagged by an
301321
individual whose genome is tracked as a `node` (as in a discrete-deme model they are
302322
unlikely to be both a migrant and a most recent common ancestor). So,
303-
`tskit` records when a segment of ancestry has moved between
323+
`tskit` can record separately when a segment of ancestry has moved between
304324
populations. This table is not required, even if different nodes come from
305325
different populations.
306326

327+
307328
| Column | Type | Description |
308329
| :--------- | -------- | -----------------------------------------------------: |
309330
| left | double | Left coordinate of the migrating segment (inclusive). |
@@ -316,18 +337,23 @@ different populations.
316337

317338

318339
The `left` and `right` columns are floating point values defining the
319-
half-open segment of genome affected. The `source` and `dest` columns
320-
record the IDs of the respective populations. The `node` column records the
321-
ID of the node that was associated with the ancestry segment in question
322-
at the time of the migration event. The `time` column is holds floating
323-
point values recording the time of the event.
340+
half-open segment of genome affected (these need not exactly correspond to
341+
breakpoints between edges). The `source` and `dest` columns record the IDs of
342+
the respective populations (note that by `msprime` convention, "source" and
343+
"destination" are defined in reverse time, see
344+
{ref}`msprime:sec_demography_direction_of_time`). The `time` column
345+
holds floating point values recording the time of the event, with migrations
346+
assumed to occur instantaneously. The `node` column records the ID of the child
347+
node of the migrating segment; in consequence the population ID of the `node` will
348+
match the `source` ID (unless sequential migrations affect the same `node`, in which
349+
case it will match the `source` value of the youngest of those migrations).
324350

325351
The `metadata` column provides a location for client code to store
326352
information about each migration. See the {ref}`sec_metadata_definition` section for
327353
more details on how metadata columns should be used.
328354

329355
See the {ref}`sec_migration_requirements` section for details on the properties
330-
required for a valid set of mutations.
356+
required for a valid set of migrations.
331357

332358

333359
(sec_population_table_definition)=
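
The convention added in this diff (a node's population matches the `source` of the youngest migration recorded for that node) can be illustrated with a small sketch. This is plain Python, not the `tskit` API: the `Migration` dataclass and both helper functions are hypothetical names introduced only for illustration, and the sketch assumes times are measured backwards from the present, so the youngest migration has the smallest `time`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Migration:
    """Hypothetical stand-in for one migration table row (not a tskit class)."""
    left: float
    right: float
    node: int
    source: int
    dest: int
    time: float


def expected_node_population(node, migrations):
    """Return the `source` of the youngest migration for `node`, or None."""
    rows = [m for m in migrations if m.node == node]
    if not rows:
        return None
    # Assumption: time runs backwards from the present, so the youngest
    # migration is the one with the smallest time value.
    return min(rows, key=lambda m: m.time).source


def inconsistent_nodes(node_population, migrations):
    """List node IDs whose population disagrees with the convention."""
    bad = []
    for node, pop in sorted(node_population.items()):
        expected = expected_node_population(node, migrations)
        if expected is not None and expected != pop:
            bad.append(node)
    return bad


# Node 0 sits in population 0. Going backwards in time, its ancestry
# moves 0 -> 1 at time 5.0 and then 1 -> 2 at time 9.0.
node_population = {0: 0, 1: 2}
migrations = [
    Migration(left=0.0, right=10.0, node=0, source=1, dest=2, time=9.0),
    Migration(left=0.0, right=10.0, node=0, source=0, dest=1, time=5.0),
]

print(inconsistent_nodes(node_population, migrations))
```

Nodes with no migration rows (node 1 here) are skipped, which mirrors the statement that the migration table is optional even when nodes belong to different populations.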
