Python 1.0.0

@github-actions github-actions released this 27 Nov 14:27
· 6 commits to main since this release
f1b139e

🎉 Version 1 is here!!! 🥳

tskit development doesn't end here, but this release marks the point from which breaking changes will not be made, except where they are unavoidable to correct incorrect behaviour or are forced by external factors such as dependencies.

Full credit for this release and for tskit generally goes to the wonderful community of contributors, who you can see here: https://tskit.dev/software/tskit.html

Full changelog:

Breaking changes

  • The reference_sequence argument to TreeSequence.alignments is now
    required to be the same length as the tree sequence. Previously it was
    required to be the length of the requested interval.
    (@benjeffery, #3317)

  • TreeSequence.tables now returns a zero-copy immutable view of the tables.
    To get a mutable copy, use TreeSequence.dump_tables().
    (@benjeffery, #3288, #760)

  • For a tree sequence to be valid, the mutation parents in the table collection
    must be correct and consistent with the topology of the tree at each mutation site.
    TableCollection.tree_sequence() will raise a _tskit.LibraryError if this
    is not the case.
    (@benjeffery, #2729, #2732, #3212)

  • Drop Python 3.9 support and require Python >= 3.10.
    (@benjeffery, #3267)

  • ltrim, rtrim, trim and shift now raise an error when used on a
    tree sequence containing a reference sequence.
    (@hyanwong, #3210, #2091)

Features

  • Add tskit.jit.numba.jitwrap and NumbaTreeSequence to allow simplified
    use and development of Numba-jitted functions with tree sequences. See the
    documentation (https://tskit.dev/tskit/docs/stable/numba.html) for details.
    (@andrewkern, #3295, #3294)

  • TreeSequence.map_to_vcf_model now also returns the transformed positions and
    contig length. (@benjeffery, #3174, #3173)

  • draw_svg() methods now associate tree branches with edge IDs.
    (@hyanwong, #3193, #557)

  • draw_svg() methods now allow the y-axis to be placed on the right-hand side
    using y_axis="right". (@hyanwong, #3201)

  • Add contig_id and isolated_as_missing to VcfModelMapping
    (@benjeffery, #3219, #3177)

  • Add TreeSequence.mutations_edge, which returns, for each mutation, the ID of
    the edge it sits on. (@benjeffery, #3226, #3189)

  • Add TreeSequence.sites_ancestral_state, TreeSequence.mutations_derived_state and
    TreeSequence.mutations_inherited_state properties to return the ancestral state of sites,
    the derived state of mutations and the inherited state of mutations as NumPy arrays of
    the new NumPy 2.0 StringDType.
    (@benjeffery, #3228, #2632, #3276, #2631)

  • Tskit now requires NumPy version 2 or later. However, you can still use
    tskit with NumPy 1.x by building tskit from source with NumPy 1.x using
    pip install tskit --no-binary tskit. With NumPy 1.x, any use of the new
    StringDType properties will result in a RuntimeError. If you try to
    use another Python module that was compiled against NumPy 1.x with NumPy 2.x
    you may see the error "A module that was compiled using NumPy 1.x cannot be
    run in NumPy 2.0.0 as it may crash.". If no newer version of the module is
    available you will have to use the NumPy 1.x build as above.

  • Add Mutation.inherited_state property which returns the inherited state
    for a single mutation. (@benjeffery, #3277, #2631)

  • Add all_mutations and all_edges options to TreeSequence.union,
    allowing greater flexibility in "disjoint union" situations.
    (@hyanwong, @petrelharp, #3181)

  • Add TreeSequence.divergence_matrix, which was previously undocumented.

  • TreeSequence.variants, .genotype_matrix, .haplotypes, and .alignments methods
    now fully support isolated_as_missing behaviour with internal nodes. .alignments is
    also around 10% faster.
    (@benjeffery, #3313, #3317, #1896)
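
Code that must run under both the default NumPy 2 build and a NumPy 1.x source build could guard its use of the StringDType properties along these lines. This is only a sketch: the helper name numpy_major is hypothetical, and the check reads the installed distribution's version string rather than using any tskit API.

```python
from importlib import metadata


def numpy_major(version=None):
    """Return the major version number from a NumPy version string.

    If no string is given, look up the installed numpy distribution.
    """
    if version is None:
        version = metadata.version("numpy")
    return int(version.split(".")[0])


# Illustrative version strings; a caller would only touch the
# StringDType-based properties when the check is True.
for v in ("1.26.4", "2.0.0"):
    print(v, numpy_major(v) >= 2)
```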

Bugfixes

  • For some tables with out-of-order mutations, TableCollection.sort did not re-order
    the mutations so that they formed a valid TreeSequence. TableCollection.sort and
    TableCollection.canonicalise now sort mutations by site, then time (if known),
    then the mutation's node's time, then number of descendant mutations
    (ensuring that parent mutations occur before children), then node, then
    their original order in the tables. (@benjeffery, #3257, #3253)

  • Fix bug in TreeSequence.genetic_relatedness_vector, which ignored
    span_normalise and always behaved as if it were False. The default is now
    True, in agreement with other statistics, so the returned values will
    change. (@petrelharp, #3300, #3241)

  • Fix bug in TreeSequence.pair_coalescence_counts when span_normalise=True
    and a window breakpoint falls within an internal missing interval.
    (@nspope, #3176, #3175)

  • Metadata schemas that are equal but have different byte representations are
    now considered equal by TableCollection.assert_equals and
    Table.assert_equals.
    (@benjeffery, #3246, #3244)

  • k-way statistics no longer require k sample sets, allowing in particular
    "self" comparisons for TreeSequence.genetic_relatedness. This changes the
    error code returned in some situations.
    (@andrewkern, @petrelharp, #3235, #3055)

  • Fix UnboundLocalError in draw_svg() when using numeric max_time
    values with mutations over roots.
    (@benjeffery, #3274, #3273)

  • Prevent iterating over a TopologyCounter.
    (@benjeffery, #3202, #1462)

  • Fix TreeSequence.concatenate() to work with internal samples by using the
    all_mutations and all_edges parameters in union().
    (@hyanwong, #3283, #3181)
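
The mutation ordering described for TableCollection.sort above can be pictured as a chain of tie-breakers. The pure-Python sketch below is not tskit's implementation. The Mut record and its field names are hypothetical, and the sort directions (older times first, more descendant mutations first, lower node ID first) are assumptions chosen so that parent mutations come before their children. It also assumes mutation times are compared only when both are known.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from typing import Optional


@dataclass
class Mut:
    row: int                # original position in the mutation table
    site: int
    time: Optional[float]   # None stands in for an unknown time
    node_time: float        # time of the mutation's node
    num_descendants: int    # number of descendant mutations
    node: int


def _cmp(x, y):
    """Three-way comparison: -1, 0 or 1."""
    return (x > y) - (x < y)


def compare(a, b):
    c = _cmp(a.site, b.site)                        # site, ascending
    if c == 0 and a.time is not None and b.time is not None:
        c = _cmp(b.time, a.time)                    # known times, older first
    if c == 0:
        c = _cmp(b.node_time, a.node_time)          # node time, older first
    if c == 0:
        c = _cmp(b.num_descendants, a.num_descendants)  # parents first
    if c == 0:
        c = _cmp(a.node, b.node)                    # node ID, ascending
    if c == 0:
        c = _cmp(a.row, b.row)                      # original order
    return c


def sort_mutations(muts):
    return sorted(muts, key=cmp_to_key(compare))


muts = [
    Mut(row=0, site=1, time=None, node_time=0.0, num_descendants=0, node=3),
    Mut(row=1, site=0, time=None, node_time=1.0, num_descendants=0, node=2),
    Mut(row=2, site=0, time=None, node_time=1.0, num_descendants=1, node=2),
]
print([m.row for m in sort_mutations(muts)])
```

In this made-up input, rows 1 and 2 share a site, node and node time, so the descendant count decides their order and the parent (row 2) is placed first.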