Releasing ISF vs using ISF in the lab #285

@bgmeulem


@abast recently hit the nail on the head by saying that "releasing ISF and using ISF in the lab are two diverging objectives" (loosely paraphrased).

The more I work with pixi, the more I feel like it could solve the issue of these two diverging objectives. First, a small recap of what exactly the issue is.

The problem

For use within our lab, we need to pin down some package versions. msgpack is the most prevalent, but it also requires us to pin down e.g. neuron, flask, pyyaml and numpy. Essentially, a large part of the Python ecosystem needs to be frozen for us to be able to work with older code and older data. Dropping support for our older code/data is not really an option: we have terabytes of useful data, and reproducing results is quite high on the priority list.

On the other hand, releasing ISF requires me to make it user-friendly, and preferably also future-proof. I am mostly worried about the "future-proof" part. Sure, we can force users to use ISF only in the same specific way that we do, but this has proven to be a bit of a pain at best, and nigh impossible at worst. Some examples:

  1. Installing ISF cross-platform (macOS) is a request I've had twice. Many of the older package versions (like neuron's) are only available through the usual repos at later versions. Simply installing ISF on more recent Linux distributions, or with recent versions of gcc, is also not possible with the usual installation method (see issue Installation breaks on newer distros when using anaconda #258). This is already solved with pixi, though.
  2. Building documentation with sphinx-autoapi requires a version of flask that is less ancient.
  3. I am not a fan of upgrading just for the sake of things being newer. However, not being able to upgrade is a different issue entirely. Whatever new or useful functionality comes out, there's a good chance we cannot use it, because we're running old versions of integral packages that we cannot upgrade: numpy, pandas and NEURON. Having access to package upgrades, where possible, can massively improve functionality, performance, and usability. Simply updating our neuron version could already provide a simulation speedup without too much hassle.

The solution

pixi allows you to define separate environments through the use of separate features. One feature can e.g. be "building documentation"; another can be "actually running ISF code" (which would then be the default feature). I'm already using this mechanism to specify an older version of flask for actually running ISF code (i.e. the default environment), and a newer version for building the documentation. And here comes my idea: what if one feature were "being able to read in old data and run older code"?
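For concreteness, the flask split I describe above could look roughly like this in a pixi.toml (a sketch only; the version pins and the `no-default-feature` layout are illustrative, not our actual manifest):

```toml
# pixi.toml (sketch; pins are illustrative)

[project]
name = "isf"
channels = ["conda-forge"]
platforms = ["linux-64"]

[dependencies]          # default feature: actually running ISF code
flask = "==1.1.2"       # the old flask, pinned for ISF itself

[feature.docs.dependencies]
flask = ">=2.0"         # less ancient flask for building the docs
sphinx-autoapi = "*"

[environments]
# the docs environment drops the default feature, so the two
# conflicting flask pins never have to be solved together
docs = { features = ["docs"], no-default-feature = true }
```

With this, `pixi run -e docs ...` resolves a completely separate environment from the default one, each with its own flask.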

The way this would work is as follows:

  1. We define a "core" feature for ISF, which includes all the core packages. As many packages as possible go here.
  2. We define a "default" or "isf" or "release" feature, with (if we want) updated versions of neuron, pandas, numpy etc., and without pandas-msgpack.
  3. We define an "ibs" feature for reading msgpack data and running older code. We can even define an "ibs-legacy" feature (name is a WIP) for Py2 code if we want to.
  4. Only the packages that need to differ between features go into them. E.g. the "release" feature will include a new numpy, and the "ibs" feature will contain pandas 1.1.3 and pandas-msgpack.
  5. We define an environment for "release" or "default" by including ["core", "release"], we define an environment for IBS by including ["core", "ibs"], and we define a documentation env with the ["core", "docs"] features.
  6. We adapt the tests so that msgpack tests only need to pass for the "ibs" environment, and reproducibility tests can be relaxed to allow numerical truncation for the "release" version, but must be exact (as they are now) for the "ibs" version.
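Sketched as a pixi.toml, the layout above could look roughly like this. Note that pixi's implicit default feature can play the role of "core", since it is included in every environment automatically; all names, version pins, and the pytest markers in the tasks are illustrative assumptions, not decisions:

```toml
# pixi.toml (sketch of the proposed layout; pins and markers are illustrative)

[project]
name = "isf"
channels = ["conda-forge"]
platforms = ["linux-64"]

[dependencies]                  # plays the role of "core": shared by all envs
python = "*"
pyyaml = "*"

[feature.release.dependencies]  # modern stack, no pandas-msgpack
numpy = ">=1.24"
pandas = ">=2.0"

[feature.ibs.dependencies]      # frozen stack for old code and old data
numpy = "==1.19.5"
pandas = "==1.1.3"
pandas-msgpack = "*"

[feature.docs.dependencies]
flask = ">=2.0"
sphinx-autoapi = "*"

# per-environment test tasks; the "msgpack" pytest marker is hypothetical
[feature.release.tasks]
test = "pytest -m 'not msgpack'"   # msgpack tests only run in the ibs env

[feature.ibs.tasks]
test = "pytest"                    # includes msgpack and exact-reproducibility tests

[environments]
default = ["release"]   # core + release
ibs = ["ibs"]           # core + ibs
docs = ["docs"]         # core + docs
```

Running `pixi run -e ibs test` versus `pixi run test` would then exercise exactly the test subset that each environment is supposed to satisfy.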

If this works, the IBS version would not need to be separated from an ISF release: IBS and the ISF "release" can happily co-exist. I argue that this approach:

  1. allows us to exactly control which aspects of ISF are specific to our use-case, and which are ready for release
  2. means the end-user never needs to worry about this, until they implement a code change that breaks our use-case, at which point the test suite will complain
  3. spares us from maintaining two separate codebases

I can make a PR of this if we agree that this is the way forward.

Related issues:
