|
| 1 | +--- |
| 2 | +layout: efflux |
| 3 | +title: "hstrat: a Python Package for phylogenetic inference on distributed digital evolution populations" |
| 4 | +date: 2022-11-07 |
| 5 | +permalink: "/pubs/:title" |
| 6 | +category: journal |
| 7 | +download: https://github.com/mmore500/hstrat/releases/download/as-submitted-joss/paper.pdf |
| 8 | +# doi: "10.1145/3520304.3533937" |
| 9 | +authors: |
| 10 | + - Matthew Andres Moreno |
| 11 | + - Emily Dolson |
| 12 | + - Charles Ofria |
| 13 | +venue: Journal of Open Source Software (Under Revision) |
| 14 | +projects: |
| 15 | + - hstrat |
| 16 | +abstract: | |
| 17 | + Digital evolution systems instantiate evolutionary processes over populations of virtual agents *in silico*. |
| 18 | + These programs can serve as rich experimental model systems. |
| 19 | + Insights from digital evolution experiments expand evolutionary theory and can often directly improve heuristic optimization techniques. |
| 20 | + Perfect observability, in particular, enables *in silico* experiments that would be otherwise impossible *in vitro* or *in vivo*. |
| 21 | + Notably, availability of the full evolutionary history (phylogeny) of a given population enables very powerful analyses. |
| 22 | +
|
| 23 | + As a slow but highly parallelizable process, digital evolution will benefit greatly from continuing to capitalize on profound advances in parallel and distributed computing [@moreno2020practical;@ackley2014indefinitely], particularly emerging unconventional computing architectures [@ackley2011homeostatic;@lauterbach2021path;@furber2014spinnaker]. |
| 24 | + However, scaling up digital evolution presents many challenges. |
| 25 | + Among these is the existing centralized perfect-tracking phylogenetic data collection model, which is inefficient and difficult to realize in parallel and distributed contexts. |
| 26 | + Here, we implement an alternative approach to tracking phylogenies across vast and potentially unreliable hardware networks. |
| 27 | +
|
| 28 | + The `hstrat` Python library exists to facilitate application of hereditary stratigraphy, a cutting-edge technique to enable phylogenetic inference over distributed digital evolution populations. |
| 29 | + This technique departs from the traditional perfect-tracking approach to phylogenetic record-keeping. |
| 30 | + Instead, hereditary stratigraphy enables phylogenetic history to be inferred from heritable annotations attached to evolving digital agents. |
| 31 | + This approach aligns with phylogenetic reconstruction methodologies in evolutionary biology. |
| 32 | + Hereditary stratigraphy attaches a set of immutable historical "checkpoints" --- referred to as _strata_ --- as an annotation on evolving genomes. |
| 33 | + Checkpoints can be strategically discarded to reduce annotation size at the cost of increasing inference uncertainty. |
| 34 | + A particular strategy for deciding which checkpoints to discard, and when, is referred to as a _stratum retention policy_. |
| 35 | + We refer to the set of retained strata as a _hereditary stratigraphic column_. |
| 36 | +
|
| 37 | + Appropriate stratum retention policy choice varies by application. |
| 38 | + For example, if annotation size is not a concern, it may be best to preserve all strata. |
| 39 | + In other situations, it may be necessary to constrain annotation size to remain within a fixed memory budget. |
| 40 | +
|
| 41 | + Key features of the library include: |
| 42 | +
|
| 43 | + - object-oriented hereditary stratigraphic column implementation to annotate arbitrary genomes, |
| 44 | + - modular interchangeability and user extensibility of stratum retention policies, |
| 45 | + - programmatic interface to query guarantees and behavior of a stratum retention policy, |
| 46 | + - modular interchangeability and user extensibility of the back-end data structure used to store annotation data, |
| 47 | + - a suite of visualization tools to elucidate stratum retention policies, |
| 48 | + - support for automatic parameterization of stratum retention policies to meet user size complexity or inference precision specifications, |
| 49 | + - tools to compare two columns and extract information about the phylogenetic relationship between them, |
| 50 | + - [extensive documentation](https://hstrat.readthedocs.io) hosted on [ReadTheDocs](https://readthedocs.io), |
| 51 | + - a comprehensive test suite to ensure stability and reliability, |
| 52 | + - convenient availability as a Python package via the [PyPI repository](https://pypi.org/), and |
| 53 | + - pure Python implementation to ensure universal portability. |
| 54 | +bibtex: |- |
| 55 | + @article{moreno2022hstrat, |
| 56 | + author = {Moreno, Matthew Andres and Dolson, Emily and Ofria, Charles}, |
| 57 | + title = "{hstrat: a Python Package for phylogenetic inference on distributed digital evolution populations}", |
| 58 | + journal = {Journal of Open Source Software}, |
| 59 | + year = {Under Revision}, |
| 60 | + } |
| 61 | +citation: "Matthew Andres Moreno, Emily Dolson, and Charles Ofria. hstrat: a Python Package for phylogenetic inference on distributed digital evolution populations. Journal of Open Source Software. Under Revision." |
| 62 | +--- |
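
The abstract's core idea (strata deposited each generation, a retention policy that discards some of them, and comparison of two columns to bound their shared ancestry) can be sketched in a few lines of plain Python. This is a toy illustration only: `ToyColumn`, `keep_every`, `mrca_generation_bounds`, and `demo` are hypothetical names invented here, not part of the `hstrat` API, and the "keep every k-th stratum" rule is an assumed stand-in for the library's retention policies.

```python
import random


class ToyColumn:
    """Toy hereditary stratigraphic column (hypothetical; not the hstrat API)."""

    def __init__(self, keep_every=1, rng=None, strata=None):
        self.keep_every = keep_every
        self.rng = rng if rng is not None else random.Random()
        # Map generation -> random 64-bit fingerprint ("stratum").
        # A founder column starts with a single stratum at generation 0.
        if strata is not None:
            self.strata = dict(strata)
        else:
            self.strata = {0: self.rng.getrandbits(64)}

    def clone_descendant(self):
        """Return an offspring column with one newly deposited stratum."""
        child = ToyColumn(self.keep_every, self.rng, self.strata)
        newest = max(child.strata) + 1
        child.strata[newest] = self.rng.getrandbits(64)
        # Assumed toy retention policy: keep every k-th generation's
        # stratum plus the newest one; discard everything else.
        child.strata = {
            gen: fingerprint
            for gen, fingerprint in child.strata.items()
            if gen % self.keep_every == 0 or gen == newest
        }
        return child


def mrca_generation_bounds(a, b):
    """Return (lower, upper) such that the most recent common ancestor
    lived at a generation g with lower <= g < upper.

    lower is None if the columns share no matching stratum.
    """
    lower = None
    upper = min(max(a.strata), max(b.strata)) + 1
    for gen in sorted(a.strata.keys() & b.strata.keys()):
        if a.strata[gen] != b.strata[gen]:
            upper = gen  # lineages had already diverged by this generation
            break
        lower = gen  # identical stratum implies shared ancestry up to here
    return lower, upper


def demo(keep_every, seed=1):
    """Evolve one lineage for 5 generations, split it, evolve 5 more."""
    rng = random.Random(seed)
    ancestor = ToyColumn(keep_every, rng)
    for _ in range(5):
        ancestor = ancestor.clone_descendant()
    left = right = ancestor
    for _ in range(5):
        left, right = left.clone_descendant(), right.clone_descendant()
    return mrca_generation_bounds(left, right), len(left.strata)
```

In this sketch the two lineages split after generation 5. Calling `demo(1)` retains all 11 strata and pins the common ancestor to the interval `(5, 6)`, whereas `demo(4)` stores only 4 strata and widens the interval to `(4, 8)`. This is the trade-off between annotation size and inference uncertainty that the abstract describes.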