Commit 05fb2f7

minor doc fixes
warn and remove existing metadata in annotation; closes #327
1 parent 0e24031 commit 05fb2f7

4 files changed (+70, -21 lines changed)

CHANGELOG.rst

Lines changed: 20 additions & 14 deletions
@@ -14,7 +14,7 @@ https://tskit.dev/pyslim/docs/latest/previous_versions.html
 - SLiM tree sequence file version number has changed to 0.9.

 - This is a change in SLiM, really, but top-level SLiM metadata now requires
-  a `"this_chromosome"` entry.
+  a `"this_chromosome"` entry.

 - Similarly, node metadata no longer has `genome_type` or `is_null`; instead
   they have `is_vacant`, and the chromosome type is in top-level metadata,
@@ -25,7 +25,13 @@ https://tskit.dev/pyslim/docs/latest/previous_versions.html
   wishing to set up a multi-chromosome simulation should use instead
   `pyslim.slim_node_metadata_schema` (which has the appropriate value in
   `["properties"]["is_vacant"]["length"]`)
-  (:user:`petrelharp`, :pr:`367`)
+  (:user:`petrelharp`, :pr:`367`).
+
+- Previously, `pyslim.annotate` would leave existing node and individual
+  metadata, even if this metadata came from a different schema. This
+  could silently create garbage metadata. Now, `annotate` removes
+  any existing metadata, and warns if this occurs
+  (:user:`petrelharp`, :pr:`390`).

 **Notable changes**:

@@ -38,7 +44,7 @@ https://tskit.dev/pyslim/docs/latest/previous_versions.html

 - `pyslim.set_slim_state` will adjust times and "alive" flags so
   that when the tree sequence is loaded into SLiM it will have
-  a specified set of individuals alive.
+  a specified set of individuals alive at a particular time.
   (:user:`petrelharp`, :pr:`384`)

 - Functions `pyslim.node_is_vacant` and `pyslim.has_vacant_samples`
@@ -61,27 +67,27 @@ https://tskit.dev/pyslim/docs/latest/previous_versions.html

 **Bugfixes**:

-- The individual flags `INDIVIDUAL_ALIVE`, `INDIVIDUAL_REMEMBERED`,
-  and `INDIVIDUAL_RETAINED` were signed integers, but the flags in the
-  individual table they apply to are unsigned, so using the
-  bitwise negation operator `~` could result in an error. Now,
-  they are np.uint32 values. (:user:`petrelharp`, :pr:`378`)
-
 - Recapitation on tree sequences with null genomes would attempt to simulate
   the history of those null genomes; this would in all but exceptional cases
   fail with an error ("not all roots are at the time expected"). Now, null
   genomes are "vacant" (see above) and `recapitate` removes their
   sample flags before recapitating (and optionally puts them back)
-  as described above (:user:`petrelharp`, :pr:`367`)
-
+  as described in `pyslim.remove_vacant` (:user:`petrelharp`, :pr:`367`).
+
 - Previously, recapitation would require the roots of all trees to be
   at the same time (roughly) as the 'tick' stored in the top-level metadata;
   however, this would not be the case if the first population was added
-  later than the first tick. (:user:`petrelharp`, :pr:`382`)
+  later than the first tick. The requirement has therefore been removed.
+  (:user:`petrelharp`, :pr:`382`)

 - The `generated_nucleotides` method now sets the `nucleotide_based` entry
-  in top-level metadata to True.
-  (:user:`petrelharp`, :pr:`385`)
+  in top-level metadata to True. (:user:`petrelharp`, :pr:`385`)
+
+- The individual flags `INDIVIDUAL_ALIVE`, `INDIVIDUAL_REMEMBERED`,
+  and `INDIVIDUAL_RETAINED` were signed integers, but the flags in the
+  individual table they apply to are unsigned, so using the
+  bitwise negation operator `~` could result in an error. Now,
+  they are np.uint32 values. (:user:`petrelharp`, :pr:`378`)


 ***************************
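
The signed-versus-unsigned flag issue described in the last bugfix entry can be illustrated with a small sketch. It assumes NumPy is available; `ALIVE` and `OTHER` are hypothetical stand-ins (names and bit positions chosen for illustration, not taken from pyslim), and `flags` is a toy array standing in for the flags column of an individual table.

```python
import numpy as np

# Hypothetical stand-ins for flags such as INDIVIDUAL_ALIVE; the bit
# positions are for illustration only, not pyslim's definitions.
ALIVE = np.uint32(1 << 16)
OTHER = np.uint32(1 << 17)

# Toy unsigned flags column, like the flags of an individual table.
flags = np.array([ALIVE, ALIVE | OTHER, 0], dtype=np.uint32)

# Negating an unsigned flag gives an unsigned mask, so clearing the
# bit keeps the result in uint32.
cleared = flags & ~ALIVE
```

By contrast, negating a plain Python `int` flag gives a negative number (`~65536 == -65537`), which is the situation that does not combine cleanly with an unsigned column.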

docs/tutorial.md

Lines changed: 2 additions & 1 deletion
@@ -258,7 +258,8 @@ align: right
 name: pedigree_simplify
 ---
 The result of simplifying the tree sequence
-in figure {numref}`figure {number} <pedigree_recapitate>`.
+in figure {numref}`figure {number} <pedigree_recapitate>`
+to only two of the three samples.
 ```

 Probably, your simulations have produced many more fictitious genomes

pyslim/methods.py

Lines changed: 12 additions & 6 deletions
@@ -1009,6 +1009,16 @@ def _annotate_nodes_individuals(tables, age):
     If you have other situations, like non-alive "remembered" individuals, you
     will need to edit the tables by hand, afterwards.
     '''
+    if len(tables.nodes.metadata) > 0:
+        warnings.warn(
+            "The provided tree sequence already has some nodes with "
+            "metadata; this metadata will be overwritten."
+        )
+    if len(tables.individuals.metadata) > 0:
+        warnings.warn(
+            "The provided tree sequence already has some individuals with "
+            "metadata; this metadata will be overwritten."
+        )
     ind_population = np.full(tables.individuals.num_rows, -1, dtype="int")
     ind_slim_id = np.full(tables.individuals.num_rows, 0, dtype='int')
     nid = 0
@@ -1025,12 +1035,8 @@ def _annotate_nodes_individuals(tables, age):
             md["slim_id"] = nid
             nid += 1
         else:
-            md = n.metadata
+            md = None
         node_metadata.append(md)
-        print("....")
-        print(md)
-        print(tables.nodes.metadata_schema.validate_and_encode_row(md))
-
     nms = tables.nodes.metadata_schema
     tables.nodes.packset_metadata([
         nms.validate_and_encode_row(x)
@@ -1053,7 +1059,7 @@ def _annotate_nodes_individuals(tables, age):
             # so no big deal
             ind_flags[j] |= INDIVIDUAL_ALIVE
         else:
-            md = ind.metadata
+            md = None
         ind_metadata.append(md)
     tables.individuals.set_columns(
         flags=ind_flags,
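
The warn-then-overwrite behaviour introduced in this file can be sketched in isolation with only the standard library. This is a minimal illustration, not the pyslim implementation: `reset_metadata` is a hypothetical helper, and a list of dicts stands in for a table (it is not the tskit API). Only the warning text is taken from the diff above.

```python
import warnings


def reset_metadata(rows, kind):
    """Warn if any row already carries metadata, then drop it.

    `rows` is a toy list of dicts with a "metadata" bytes entry,
    standing in for a table; `kind` is e.g. "nodes" or "individuals".
    """
    if any(len(r["metadata"]) > 0 for r in rows):
        warnings.warn(
            f"The provided tree sequence already has some {kind} with "
            "metadata; this metadata will be overwritten."
        )
    # Discard the old metadata instead of carrying it forward,
    # mirroring the switch from `md = n.metadata` to `md = None`.
    return [{**r, "metadata": None} for r in rows]


rows = [{"id": 0, "metadata": b"abc"}, {"id": 1, "metadata": b""}]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = reset_metadata(rows, "nodes")
```

Returning new rows rather than mutating the input is a choice made for this sketch only; the function in the diff edits the tables in place.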

tests/test_annotation.py

Lines changed: 36 additions & 0 deletions
@@ -263,6 +263,42 @@ def test_warns_overwriting_mutations(self, helper_functions):
         with pytest.warns(Warning, match="already has.*metadata"):
             slim_ts = pyslim.annotate(ts, model_type="WF", tick=1)

+    def test_warns_and_overwrites_node_metadata(self, helper_functions):
+        ts = msprime.sim_ancestry(
+            4,
+            population_size=10,
+            sequence_length=10,
+            recombination_rate=0.01,
+            random_seed=100,
+        )
+        tables = ts.dump_tables()
+        tables.nodes.clear()
+        for n in ts.nodes():
+            tables.nodes.append(n.replace(metadata=b'abc'))
+        ts = tables.tree_sequence()
+        with pytest.warns(Warning, match="already has.*metadata"):
+            slim_ts = pyslim.annotate(ts, model_type="WF", tick=1)
+        for n in slim_ts.nodes():
+            assert n.is_sample() or n.metadata is None
+
+    def test_warns_and_overwrites_individual_metadata(self, helper_functions):
+        ts = msprime.sim_ancestry(
+            4,
+            population_size=10,
+            sequence_length=10,
+            recombination_rate=0.01,
+            random_seed=100,
+        )
+        tables = ts.dump_tables()
+        tables.individuals.clear()
+        for n in ts.individuals():
+            tables.individuals.append(n.replace(metadata=b'abc'))
+        ts = tables.tree_sequence()
+        with pytest.warns(Warning, match="already has.*metadata"):
+            slim_ts = pyslim.annotate(ts, model_type="WF", tick=1)
+        for ind in slim_ts.individuals():
+            assert slim_ts.node(ind.nodes[0]).is_sample() or ind.metadata is None
+
     def test_just_simulate(self, helper_functions, tmp_path):
         ts = msprime.sim_ancestry(
             4,
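
Both new tests reuse the pattern `"already has.*metadata"`. As far as I know, `pytest.warns` applies its `match` argument with `re.search`, so the pattern only needs to occur somewhere in the message. The sketch below checks it against the two warning strings from `pyslim/methods.py`, using the standard library alone; the `unrelated` message is a made-up counterexample.

```python
import re

pattern = "already has.*metadata"

# The two warning messages added in pyslim/methods.py by this commit.
messages = [
    "The provided tree sequence already has some nodes with "
    "metadata; this metadata will be overwritten.",
    "The provided tree sequence already has some individuals with "
    "metadata; this metadata will be overwritten.",
]
matches = [re.search(pattern, m) is not None for m in messages]

# A made-up message that should not satisfy the pattern.
unrelated = "The tree sequence has no populations."
unrelated_matches = re.search(pattern, unrelated) is not None
```

Because a single pattern covers both messages, a test built this way passes as long as either warning is raised; it does not tell the node warning apart from the individual one.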
