Improve software environment reproducibility #88

@transientlunatic

At present, asimov runs in a local Python environment, typically either a Python venv or a conda environment.
However, it does not record the software versions or environment details with each analysis, as would be required for full reproducibility.

In order to make asimov able to precisely reproduce an analysis, we need two things:

  • The ability to store the precise software environment (the output of conda list or pip freeze) in the working directory and the results store so that it can be packaged (analysis packaging is for a future issue, but this information will be required)
  • The ability for asimov to create and control environments, allowing it to precisely reproduce an analysis
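
As a rough illustration of the first point, the snippet below sketches how a package listing could be written into an analysis working directory. It is only a sketch: `capture_environment` and the `environment.txt` filename are hypothetical and not part of asimov, and it assumes `pip` is importable from the interpreter running the analysis. The `command` argument is there so that a conda listing (for example `["conda", "list", "--export"]`) could be substituted.

```python
import subprocess
import sys
from pathlib import Path


def capture_environment(workdir, command=None, filename="environment.txt"):
    """Run a package-listing command and save its output in `workdir`.

    Hypothetical helper, not an asimov API. By default it runs
    `python -m pip freeze` with the current interpreter.
    """
    if command is None:
        command = [sys.executable, "-m", "pip", "freeze"]

    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    # check=True raises CalledProcessError if the listing command fails,
    # so a broken environment is not silently recorded as empty.
    result = subprocess.run(command, capture_output=True, text=True, check=True)

    target = workdir / filename
    target.write_text(result.stdout)
    return target
```

Calling `capture_environment("path/to/rundir")` before an analysis is submitted would leave a snapshot next to the other run files; copying the same file into the results store would cover the second storage location mentioned above.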

While allowing asimov to control software environments will be valuable, we will need to think of a sensible way of managing them without using enormous amounts of storage: most, or potentially all, analyses will use exactly the same environment, so having one environment per analysis is impractical.
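
One possible way to avoid duplicated environments, offered only as a sketch, is to key each environment by a hash of its package listing so that analyses with identical requirements resolve to the same directory. The functions `environment_key` and `environment_path`, and the idea of a shared `store` directory, are assumptions made for illustration and are not an existing asimov design.

```python
import hashlib
from pathlib import Path


def environment_key(frozen_requirements):
    """Return a stable SHA-256 digest for a `pip freeze`-style listing.

    Blank lines and comment lines are ignored and the remaining lines are
    sorted, so ordering and whitespace differences do not change the key.
    """
    lines = sorted(
        line.strip()
        for line in frozen_requirements.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def environment_path(store, frozen_requirements):
    """Map a package listing to a shared environment directory in `store`."""
    return Path(store) / environment_key(frozen_requirements)


def needs_building(store, frozen_requirements):
    """True if no environment with this exact listing exists in `store` yet."""
    return not environment_path(store, frozen_requirements).exists()
```

With this scheme, an environment would only be built when `needs_building` returns `True`, and every later analysis with the same listing would reuse the existing directory, so storage grows with the number of distinct environments rather than the number of analyses.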

An obvious option would be to run all analyses in versioned containers, and we should add support for this; however, the need to build containers for simple workflows is also unpalatable to many users.
