The aim of this tool is to process raw MD data and obtain standard structure and trajectory files. These files, together with some additional author inputs, are then analyzed. Both the standard files and the analysis results are uploaded to the database using the loader. After that, results can be visualized in the MDposit client.
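At a glance, a typical end-to-end flow looks like the sketch below. Only the workflow step corresponds to a real command from this repository; the loader and MDposit steps are shown as comments because their commands are documented elsewhere, and all workflow arguments are omitted:
# 1. Process raw MD data into standard files and analysis results (arguments omitted)
mwf run
# 2. Upload the standard files and analysis results to the database using the loader
# 3. Browse and visualize the uploaded results in the MDposit client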
Installing the package will create the entry point mwf, which is used to execute the workflow.
Note that internet access is required to install the workflow using this protocol. If you are working on a machine without internet access, head to the 'Installation in HPC' section.
First clone the workflow repo and create a new environment using the environment.yml file in this repo:
git clone https://github.com/mmb-irb/MDDB-workflow mwf
cd mwf
conda env create --file envs/environment.yml
Activate the new environment and then install the workflow module in development mode:
conda activate mwf_env
python -m pip install -e .
At this point your environment is ready to use and it includes the mwf command. From now on, you can access the mwf command from anywhere on your computer as long as the mwf_env environment is activated.
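As a quick sanity check, you can call the new entry point directly. This assumes the mwf command exposes a standard --help option, which is not shown in the steps above:
# the mwf entry point should now be available in the activated environment
mwf --help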
In HPC clusters, running conda to install the dependencies may not be possible when you have no internet access. In these cases you can follow the steps below to install the workflow on your remote machine.
When running the workflow, do not forget to exclude the tasks that need internet access by using the flag -i network.
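For example, a run on a machine without internet access could look like the following sketch; any additional arguments your case requires are omitted here:
# skip the tasks that require internet access
mwf run -i network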
Using conda-pack, you can create a portable conda environment that can be transferred and deployed on remote machines.
First, install conda-pack in any environment on your local machine:
conda install conda-pack
Now create the conda environment on your local machine as explained in the previous section.
Note that there is no need to install the workflow module in this case.
Now pack the mwf environment with:
conda pack -n mwf_env --ignore-editable-packages
This will generate a file called 'mwf_env.tar.gz'. Copy this file to the remote machine where the workflow is to be installed and extract it in the target directory:
rsync -avP mwf_env.tar.gz <remote>:<path>
mkdir mwf_env
tar -xzf mwf_env.tar.gz -C mwf_env
Activate the mwf environment:
source mwf_env/bin/activate
Then run conda-unpack to fix the environment paths for the new location:
conda-unpack
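As an optional sanity check, not part of the original steps, you can confirm that the unpacked environment is the one being used:
# python should resolve inside the unpacked mwf_env directory
which python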
Now copy the whole workflow repository from your local machine to the remote machine.
git clone https://github.com/mmb-irb/MDDB-workflow.git mwf
rsync -avP mwf <remote>:<path>
Then install it in development mode with:
cd <path>
python -m pip install --no-build-isolation --no-index --no-deps -e .
At this point your environment is ready to use and it includes the mwf command. From now on, you can access the mwf command from anywhere on your computer as long as the mwf_env environment is activated.
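If you want to double-check the offline editable install, a simple verification is to confirm that the mwf entry point resolves inside the unpacked environment:
# the mwf entry point should resolve inside mwf_env/bin
command -v mwf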
If you have pixi installed on your local machine, you can use it together with pixi-pack to create a portable conda environment that can be transferred and deployed on remote machines.
# Local machine
conda install pixi
pixi init -i envs/environment.yml
pixi lock
pixi-pack --create-executable
rsync -avP environment.sh <remote>:<path>
# Remote machine
./environment.sh
source activate.sh
Now copy the whole workflow repository from your local machine to the remote machine.
# Local machine
git clone https://github.com/mmb-irb/MDDB-workflow.git mwf
rsync -avP mwf <remote>:<path>
# Remote machine
cd <path>
python -m pip install --no-build-isolation --no-index --no-deps -e .
Alternatively, you can use containerized versions of the workflow, which bundle all dependencies and the workflow itself into a portable image.
First, pull or build the container image on your local machine, then transfer it to the HPC cluster where you can run the workflow without requiring conda installations.
# Using docker
docker pull ghcr.io/mmb-irb/mddb_wf
docker save -o mddb_wf.tar ghcr.io/mmb-irb/mddb_wf
scp mddb_wf.tar <remote>:<path>
# On the remote machine, load the transferred image before running it
docker load -i mddb_wf.tar
docker run -v $PWD:/data -w /data ghcr.io/mmb-irb/mddb_wf mwf run
# Using singularity
singularity pull --name mddb_wf.sif docker://ghcr.io/mmb-irb/mddb_wf:latest
scp mddb_wf.sif <remote>:<path>
singularity run -H $PWD -C mddb_wf.sif mwf run
For detailed instructions on how to use the workflow, please refer to the usage guide.
For instructions on how to develop and test, check the development guide.
The MDDB Workflow source code is provided under the terms of the Apache License version 2.0 (Apache-2.0).
The MDDB Workflow package (including the workflow with its required dependencies) is distributed under the terms of the GNU General Public License version 3 or any later version (GPL-3.0+).
