|
| 1 | +--- |
| 2 | +jupytext: |
| 3 | +  text_representation: |
| 4 | +    extension: .md |
| 5 | +    format_name: myst |
| 6 | +    format_version: 0.12 |
| 7 | +    jupytext_version: 1.9.1 |
| 8 | +kernelspec: |
| 9 | +  display_name: Python 3 |
| 10 | +  language: python |
| 11 | +  name: python3 |
| 12 | +--- |
| 13 | + |
| 14 | +```{code-cell} |
| 15 | +:tags: [remove-cell] |
| 16 | +import pyslim, tskit, msprime |
| 17 | + |
| 18 | +ts = tskit.load("example_sim.trees") |
| 19 | +tables = ts.tables |
| 20 | +``` |
| 21 | + |
| 22 | + |
| 23 | +(sec_previous_versions)= |
| 24 | + |
| 25 | + |
| 26 | +# Migrating from previous versions of pyslim |
| 27 | + |
| 28 | +A number of features that were first introduced in pyslim have been made part of core |
| 29 | +tskit functionality. For instance, reference sequence support was provided (although |
| 30 | +loosely) in pyslim to support SLiM's nucleotide models, but is now part of a standard |
| 31 | +tskit {class}`tskit.TreeSequence`. Similarly, metadata processing in tskit made |
| 32 | +code to do this within pyslim obsolete; this "legacy metadata" code has been removed |
| 33 | +and instructions for how to migrate your code are {ref}`below <sec_legacy_metadata>`. |
| 34 | + |
| 35 | +In fact, we are now at the (very good) place where we don't really need |
| 36 | +the {class}`pyslim.SlimTreeSequence` class any longer, |
| 37 | +and it will soon be deprecated. |
| 38 | +So, pyslim is migrating to be purely functional: instead of providing the SlimTreeSequence |
| 39 | +class with specialized methods, all methods will be functions of TreeSequences, |
| 40 | +that take in a tree sequence and return something |
| 41 | +(a modified tree sequence or some summary of it). |
| 42 | +Backwards compatibility will be maintained for some time, but we ask that you |
| 43 | +switch over soon, as your code will be cleaner and faster. |
| 44 | + |
| 45 | +To migrate, you should: |
| 46 | + |
| 47 | + |
| 48 | +1. Replace `ts.slim_generation` with `ts.metadata['SLiM']['generation']`, |
| 49 | + and `ts.model_type` with `ts.metadata['SLiM']['model_type']`. |
| 50 | +2. Replace `ts.reference_sequence` with `ts.reference_sequence.data`. |
| 51 | +3. Replace calls to `ts.recapitate(...)` with `pyslim.recapitate(ts, ...)`, |
| 52 | + and similarly with other SlimTreeSequence methods. |
| 53 | + |
| 54 | +If you encounter difficulties, please post an |
| 55 | +[issue](https://github.com/tskit-dev/pyslim/issues) |
| 56 | +or [discussion](https://github.com/tskit-dev/pyslim/discussions) on GitHub. |
| 57 | + |
| 58 | + |
| 59 | +(sec_legacy_metadata)= |
| 60 | + |
| 61 | +## Legacy metadata |
| 62 | + |
| 63 | +In previous versions of pyslim, |
| 64 | +SLiM-specific metadata was provided as customized objects: |
| 65 | +for instance, for a node ``n`` provided by a ``SlimTreeSequence``, |
| 66 | +we'd have ``n.metadata`` as a ``NodeMetadata`` object, |
| 67 | +with attributes ``n.metadata.slim_id`` and ``n.metadata.is_null`` and ``n.metadata.genome_type``. |
| 68 | +However, with tskit 0.3, |
| 69 | +the capacity to deal with structured metadata |
| 70 | +was implemented in {ref}`tskit itself <tskit:sec_metadata>`, |
| 71 | +and so pyslim shifted to using the tskit-native metadata tools. |
| 72 | +As a result, parsed metadata is provided as a dictionary instead of an object, |
| 73 | +so that now ``n.metadata`` would be a dict, |
| 74 | +with entries ``n.metadata["slim_id"]`` and ``n.metadata["is_null"]`` and ``n.metadata["genome_type"]``. |
| 75 | +Annotation should be done with tskit methods (e.g., ``packset_metadata``). |
| 76 | + |
| 77 | +:::{note} |
| 78 | + Until pyslim version 0.600, the old-style metadata was still available, |
| 79 | + but this functionality has been removed. |
| 80 | +::: |
| 81 | + |
| 82 | +Here are more detailed notes on how to migrate a script from the legacy |
| 83 | +metadata handling. If you run into issues, please ask by opening a discussion on GitHub. |
| 84 | + |
| 85 | +**1.** Use top-level metadata instead of ``slim_provenance``: |
| 86 | +previously, information about the model type and the time counter (generation) |
| 87 | +in SLiM was provided in the Provenances table, made available through |
| 88 | +the ``ts.slim_provenance`` object. This is still available but deprecated; |
| 89 | +the information should instead be obtained from the *top-level* metadata object, ``ts.metadata["SLiM"]``. |
| 90 | +So, in your scripts ``ts.slim_provenance.model_type`` should be replaced with |
| 91 | +``ts.metadata["SLiM"]["model_type"]``, |
| 92 | +and (although it is not deprecated) ``ts.slim_generation`` should |
| 93 | +probably be replaced with |
| 94 | +``ts.metadata["SLiM"]["generation"]``. |
| 95 | + |
| 96 | +**2.** Switch metadata objects to dicts: |
| 97 | +if ``md`` is the ``metadata`` property of a population, individual, or node, |
| 98 | +this means replacing ``md.X`` with ``md["X"]``. |
| 99 | +The ``migration_records`` property of population metadata is similarly |
| 100 | +a list of dicts rather than a list of objects, so instead of |
| 101 | +``ts.population(1).metadata.migration_records[0].source_subpop`` |
| 102 | +we would write |
| 103 | +``ts.population(1).metadata["migration_records"][0]["source_subpop"]``. |
| 104 | + |
| 105 | +Mutations were previously a bit different - if ``mut`` is a mutation |
| 106 | +(e.g., ``mut = ts.mutation(0)``) |
| 107 | +then ``mut.metadata`` was previously a list of MutationMetadata objects. |
| 108 | +Now, ``mut.metadata`` is a dict, with a single entry: |
| 109 | +``mut.metadata["mutation_list"]`` is a list of dicts, each containing the information |
| 110 | +that was previously in the MutationMetadata objects. |
| 111 | +So, for instance, instead of ``mut.metadata[0].selection_coeff`` |
| 112 | +we would write ``mut.metadata["mutation_list"][0]["selection_coeff"]``. |
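
The change from attribute access to key access can be sketched with plain dictionaries. The two dicts below are hand-written stand-ins for decoded population and mutation metadata; the values are made up, and real SLiM metadata contains more fields than are shown.

```python
# Hand-written stand-ins for decoded metadata (illustrative values only).
pop_md = {
    "migration_records": [
        {"source_subpop": 2, "migration_rate": 0.01},
    ],
}
mut_md = {
    "mutation_list": [
        {"selection_coeff": -0.05, "mutation_type": 1},
        {"selection_coeff": 0.0, "mutation_type": 2},
    ],
}

# Formerly: pop_md.migration_records[0].source_subpop
source = pop_md["migration_records"][0]["source_subpop"]

# Formerly: [m.selection_coeff for m in mut.metadata]
coeffs = [m["selection_coeff"] for m in mut_md["mutation_list"]]
print(source, coeffs)
```

Note that a mutation can carry several stacked SLiM mutations, which is why ``mutation_list`` is a list rather than a single dict.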
| 113 | + |
| 114 | +**3.** The ``decode_X`` and ``encode_X`` methods are now deprecated, |
| 115 | +as this is handled by tskit itself. |
| 116 | +For instance, ``encode_node`` would take a NodeMetadata object |
| 117 | +and produce the raw bytes necessary to encode it in a Node table, |
| 118 | +and ``decode_node`` would do the inverse operation. |
| 119 | +This is now handled by the relevant MetadataSchema object: |
| 120 | +for nodes one can obtain this as ``nms = ts.tables.nodes.metadata_schema``, |
| 121 | +which has the methods ``nms.validate_and_encode_row`` and ``nms.decode_row``. |
| 122 | +Decoding is for the most part not necessary, |
| 123 | +since the metadata is automatically decoded, |
| 124 | +but ``pyslim.decode_node(raw_md)`` could be replaced by ``nms.decode_row(raw_md)``. |
| 125 | +Encoding is necessary to modify tables, |
| 126 | +and ``pyslim.encode_node(md)`` can be replaced by ``nms.validate_and_encode_row(md)`` |
| 127 | +(where furthermore ``md`` should now be a dict rather than a NodeMetadata object). |
| 128 | + |
| 129 | +**4.** The ``annotate_X_metadata`` methods are deprecated, |
| 130 | +as again tskit has tools to do this. |
| 131 | +These methods would set the metadata column of a table - |
| 132 | +for instance, if ``metadata`` is a list of NodeMetadata objects, then |
| 133 | +``annotate_node_metadata(tables, metadata)`` would modify ``tables.nodes`` in place |
| 134 | +to contain the (encoded) metadata in the list ``metadata``. |
| 135 | +Now, this could be done as follows (where now ``metadata`` is a list of metadata dicts): |
| 136 | + |
| 137 | +```{code-cell} |
| 138 | +metadata = [ {'slim_id': k, 'is_null': False, 'genome_type': 0} |
| 139 | + for k in range(tables.nodes.num_rows) ] |
| 140 | +nms = tables.nodes.metadata_schema |
| 141 | +tables.nodes.packset_metadata( |
| 142 | + [nms.validate_and_encode_row(r) for r in metadata] |
| 143 | +) |
| 144 | +``` |
| 145 | + |
| 146 | +If speed is an issue, then ``encode_row`` can be substituted for ``validate_and_encode_row``, |
| 147 | +but at the risk of missing errors in metadata. |
| 148 | + |
| 149 | +**5.** The ``extract_X_metadata`` methods are not necessary, |
| 150 | +since the metadata in the tables of a TableCollection are automatically decoded. |
| 151 | +For instance, ``[ind.metadata["sex"] for ind in tables.individuals]`` will obtain |
| 152 | +a list of sexes of the individuals in the IndividualTable. |
| 153 | + |
| 154 | +:::{warning} |
| 155 | + It is our intention to remain backwards-compatible for a time. |
| 156 | + However, the legacy code will disappear at some point in the future, |
| 157 | + so please migrate over scripts you intend to rely on. |
| 158 | +::: |