Commit b65fc66

ewels, christopher-hakkaart, and pditommaso authored
Docs: Improve Conda docs - PyPI + lock files (#5531) [ci skip]
Signed-off-by: Phil Ewels <[email protected]>
Co-authored-by: Christopher Hakkaart <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
1 parent b5c63a9 commit b65fc66

2 files changed

+62
-17
lines changed

docs/conda.md

Lines changed: 60 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -6,7 +6,7 @@
66

77
Nextflow has built-in support for Conda that allows the configuration of workflow dependencies using Conda recipes and environment files.
88

9-
This allows Nextflow applications to use popular tool collections such as [Bioconda](https://bioconda.github.io) whilst taking advantage of the configuration flexibility provided by Nextflow.
9+
This allows Nextflow applications to use popular tool collections such as [Bioconda](https://bioconda.github.io) and the [Python Package Index](https://pypi.org/), whilst taking advantage of the configuration flexibility provided by Nextflow.
1010

1111
## Prerequisites
1212

@@ -22,7 +22,7 @@ Dependencies are specified by using the {ref}`process-conda` directive, providin
2222
Conda environments are stored on the file system. By default, Nextflow instructs Conda to save the required environments in the pipeline work directory. The same environment may be created/saved multiple times across multiple executions when using different work directories.
2323
:::
2424

25-
You can specify the directory where the Conda environments are stored using the `conda.cacheDir` configuration property. When using a computing cluster, make sure to use a shared file system path accessible from all compute nodes. See the {ref}`configuration page <config-conda>` for details about Conda configuration.
25+
You can specify the directory where the Conda environments are stored using the `conda.cacheDir` configuration property. When using a computing cluster, make sure to use a shared file system path accessible from all compute nodes. See the {ref}`configuration page <config-conda>` for details about Conda configuration.
2626

2727
:::{warning}
2828
The Conda environment feature is not supported by executors that use remote object storage as a work directory. For example, AWS Batch.
@@ -62,6 +62,7 @@ The usual Conda package syntax and naming conventions can be used. The version o
6262

6363
The name of the channel where a package is located can be specified by prefixing the package with the channel name, as shown here: `bioconda::bwa=0.7.15`.
6464

65+
(conda-env-files)=
6566
### Use Conda environment files
6667

6768
Conda environments can also be defined using one or more Conda environment files. This is a file that lists the required packages and channels structured using the YAML format. For example:
@@ -77,20 +78,6 @@ dependencies:
7778
- bwa=0.7.15
7879
```
7980
80-
This other example shows how to leverage a Conda environment file to install Python packages from the [PyPI repository](https://pypi.org/)), through the `pip` package manager (which must also be explicitly listed as a required package):
81-
82-
```yaml
83-
name: my-env-2
84-
channels:
85-
- defaults
86-
dependencies:
87-
- pip
88-
- pip:
89-
- numpy
90-
- pandas
91-
- matplotlib
92-
```
93-
9481
Read the Conda documentation for more details about how to create [environment files](https://conda.io/docs/user-guide/tasks/manage-environments.html#creating-an-environment-file-manually).
9582
9683
The path of an environment file can be specified using the `conda` directive:
@@ -110,7 +97,26 @@ process foo {
11097
The environment file name **must** have a `.yml` or `.yaml` extension or else it won't be properly recognised.
11198
:::
11299

113-
Alternatively, it is possible to provide the dependencies using a plain text file, just listing each package name as a separate line. For example:
100+
(conda-pypi)=
101+
### Python Packages from PyPI
102+
103+
Conda environment files can also be used to install Python packages from the [PyPI repository](https://pypi.org/), through the `pip` package manager (which must also be explicitly listed as a required package):
104+
105+
```yaml
106+
name: my-env-2
107+
channels:
108+
- defaults
109+
dependencies:
110+
- pip
111+
- pip:
112+
- numpy
113+
- pandas
114+
- matplotlib
115+
```
116+
117+
### Conda text files
118+
119+
It is possible to provide dependencies by listing each package name as a separate line in a plain text file. For example:
114120

115121
```
116122
bioconda::star=2.5.4a
@@ -122,6 +128,43 @@ bioconda::multiqc=1.4
122128
Like before, the extension matters. Make sure the dependencies file has a `.txt` extension.
123129
:::
124130

131+
### Conda lock files
132+
133+
The final way to provide packages to Conda is with [Conda lock files](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#identical-conda-envs).
134+
135+
These are generated from existing Conda environments using the following command:
136+
137+
```bash
138+
conda list --explicit > spec-file.txt
139+
```
140+
141+
or if using Mamba / Micromamba:
142+
143+
```bash
144+
micromamba env export --explicit > spec-file.txt
145+
```
146+
147+
Conda lock files can also be downloaded from [Wave](https://seqera.io/wave/) build pages.
148+
149+
These files list every package together with all of its dependencies, so no Conda environment resolution step is needed. This makes environment creation faster and more reproducible.
150+
151+
The files contain package URLs and an optional MD5 hash for each download to confirm identity:
152+
153+
```
154+
# micromamba env export --explicit
155+
# This file may be used to create an environment using:
156+
# $ conda create --name <env> --file <this file>
157+
# platform: linux-64
158+
@EXPLICIT
159+
https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81
160+
https://conda.anaconda.org/conda-forge/linux-64/libgomp-13.2.0-h77fa898_7.conda#abf3fec87c2563697defa759dec3d639
161+
https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d
162+
https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-13.2.0-h77fa898_7.conda#72ec1b1b04c4d15d4204ece1ecea5978
163+
# .. and so on
164+
```
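
The `URL#hash` layout shown above can be inspected with standard shell tools. The following is a minimal sketch, not part of Conda or Nextflow: `list_locked_packages` is a hypothetical helper name, and it assumes each package entry is a single line with the optional hash after a `#`, as in the example file.

```bash
# Hypothetical helper (not a Conda or Nextflow command):
# print "<url> <md5>" for each package entry in an explicit spec file.
list_locked_packages() {
    # Drop comment lines, the @EXPLICIT marker and blank lines,
    # then split each remaining line on '#' into URL and optional hash.
    # Entries without a hash are printed with "-" in the second column.
    grep -v -e '^#' -e '^@EXPLICIT' -e '^[[:space:]]*$' "$1" \
        | awk -F'#' '{ print $1, ($2 == "" ? "-" : $2) }'
}

# Example call, assuming a lock file named spec-file.txt exists:
#   list_locked_packages spec-file.txt
```

Listing the entries this way may help when reviewing a lock file before pointing the `conda` directive at it, for example to spot entries that lack a hash.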
165+
166+
To use with Nextflow, simply set the `conda` directive to the lock file path.
167+
125168
### Use existing Conda environments
126169

127170
If you already have a local Conda environment, you can use it in your workflow specifying the installation directory of such environment by using the `conda` directive:

docs/wave.md

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -90,6 +90,8 @@ conda.channels = 'conda-forge,bioconda'
9090
```
9191
:::
9292

93+
Packages from the [Python Package Index](https://pypi.org/) can also be added to a Conda `environment.yml` file. See {ref}`Conda and PyPI <conda-pypi>` for more information.
94+
9395
(wave-singularity)=
9496

9597
### Build Singularity native images
