Commit 093cc0e

Merge branch 'master' of github.com:nextflow-io/website into master
2 parents 56d5e4e + 01e57ef commit 093cc0e

2 files changed

+132
-0
lines changed

assets/img/abhinav.jpg

25 KB
Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
1+
title=The Nextflow CLI - tricks and treats!
2+
date=2020-10-22
3+
type=post
4+
tags=nextflow,docs
5+
status=published
6+
author=Abhinav Sharma
7+
icon=abhinav.jpg
8+
~~~~~~
9+
10+
For most developers, the command line is synonymous with agility. While tools such as [Nextflow Tower](https://tower.nf) are opening up the ecosystem to a whole new set of users, the Nextflow CLI remains a bedrock for pipeline development. The CLI has been the core interface of Nextflow since the beginning; however, its full functionality was never extensively documented. Today we are excited to release the first iteration of the CLI documentation, available on the [Nextflow website](https://www.nextflow.io/docs/edge/cli.html).
11+
12+
And given Halloween is just around the corner, in this blog post we'll take a look at 5 CLI tricks and examples which will make your life easier in designing, executing and debugging data pipelines. We are also giving away 5 limited-edition Nextflow hoodies and sticker packs so you can code in style this Halloween season!
13+
14+
15+
#### 1: Invoke a remote pipeline execution with the latest revision
16+
17+
Nextflow facilitates easy collaboration and re-use of existing pipelines in multiple ways. One of the simplest ways to do this is to use the URL of the Git repository.
18+
19+
```
$ nextflow run https://github.com/nextflow-io/hello
```
22+
23+
When executing a pipeline with the `run` command, Nextflow first checks whether the pipeline has already been downloaded to the `$HOME/.nextflow/assets` directory and, if so, uses that copy to execute it. If the pipeline is not already cached, Nextflow downloads it, stores it in the `$HOME/.nextflow/assets` directory and then launches the execution.
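To check which copy Nextflow would reuse, it can help to look at the local cache directly. The snippet below is only a minimal sketch: `asset_dir` is a hypothetical helper (not part of Nextflow), and it assumes the default layout in which cached pipelines live under `$NXF_HOME/assets`, with `NXF_HOME` falling back to `$HOME/.nextflow`.

```shell
# Hypothetical helper (not part of Nextflow): print where a pipeline such as
# "nextflow-io/hello" would be cached, assuming the default asset layout.
asset_dir() {
    printf '%s/assets/%s\n' "${NXF_HOME:-$HOME/.nextflow}" "$1"
}

dir=$(asset_dir nextflow-io/hello)
if [ -d "$dir" ]; then
    echo "cached copy found in $dir"
else
    echo "no cached copy yet; the first run would download it to $dir"
fi
```

If the directory exists, a plain `nextflow run nextflow-io/hello` reuses it as is, which is exactly the situation the next trick addresses.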
24+
25+
How can we make sure that we always run the latest code from the remote pipeline? We simply need to add the `-latest` option to the run command, and Nextflow takes care of the rest.
26+
27+
```
$ nextflow run nextflow-io/hello -latest
```
30+
31+
#### 2: Query work directories for a specific execution
32+
33+
For every invocation, Nextflow stores metadata about the execution, including task work directories, completion status and timing. We can use the `nextflow log` command to generate a summary of this information for a specific run.
34+
35+
To see a list of work directories associated with a particular execution (for example, `tiny_leavitt`), use:
36+
37+
```
$ nextflow log tiny_leavitt
```
40+
41+
To extract specific process-level information from the logs of any execution, we simply need to use the fields option (`-f`) and list the fields of interest.
42+
43+
```
$ nextflow log tiny_leavitt -f 'process, hash, status, duration'
```
46+
47+
The hash is the abbreviated name of the work directory where the process was executed; therefore, the location of a process work directory would be something like `work/74/68ff183`, followed by the remaining characters of the full hash.
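Because the hash printed by `nextflow log` is a prefix of the directory name, a shell glob is enough to jump from a log line to the task directory. The following is a rough sketch under stated assumptions: `task_workdir` is a hypothetical helper, and the work directory is assumed to be the default `work` folder in the launch directory unless another base is given.

```shell
# Hypothetical helper: expand a short task hash such as "74/68ff183" into the
# full work directory. The optional second argument is the base work directory
# (assumed to be "work" when omitted).
task_workdir() {
    prefix=$1
    base=${2:-work}
    for dir in "$base/$prefix"*/; do
        # The glob stays literal when nothing matches, so check that it exists.
        [ -d "$dir" ] && printf '%s\n' "${dir%/}"
    done
    return 0
}

# List the files Nextflow leaves in the matching task directory, which
# typically include .command.sh, .command.log and .exitcode.
task_workdir 74/68ff183 | while read -r d; do
    ls -a "$d"
done
```

Nothing is printed when no directory matches, for example after the work directory has been cleaned.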
48+
49+
The log command also has other child options including `-before` and `-after` to help with the chronological inspection of logs.
50+
51+
52+
#### 3: Top-level configuration
53+
54+
Nextflow emphasizes customization of pipelines and exposes multiple options to facilitate this. The configuration is applied to multiple Nextflow commands and is therefore a top-level option. In practice, this means specifying configuration options *before* the command.
55+
56+
The Nextflow CLI provides two kinds of config overrides: the soft override and the hard override.
57+
58+
The top-level soft override option `-c` allows us to change the existing configuration in an additive manner, overriding only the fields included in the given configuration file.
59+
60+
```
$ nextflow -c my.config run nextflow-io/hello
```
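As a rough illustration of what such a file might hold, the snippet below writes a small `my.config`. The file name matches the command above, but its contents (the `cpus` and `memory` values) are purely hypothetical and not taken from any real pipeline.

```shell
# Write a small, hypothetical my.config; the values are illustrative only.
cat > my.config <<'EOF'
// Only the fields listed here are overridden when the file is passed with -c;
// every other setting keeps the value from the remaining configuration files.
process {
    cpus   = 2
    memory = '4 GB'
}
EOF

# To see the effect, compare the output of:
#   nextflow config
#   nextflow -c my.config config
```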
63+
64+
On the other hand, the hard override `-C` uses only the given configuration file and ignores all other configuration files.
65+
66+
    $ nextflow -C my.config run nextflow-io/hello
67+
68+
We can also use the `config` command to inspect the final resolved configuration and to list any profiles.
69+
70+
```
$ nextflow config -show-profiles
```
73+
74+
#### 4: Passing in an input parameter file
75+
76+
Nextflow is designed to work across both research and production settings. In production especially, specifying multiple parameters for the pipeline on the command line becomes cumbersome. In these cases, environment variables or config files containing all input files, options and metadata are commonly used. Love them or hate them, YAML and JSON are the standard formats for humans and machines, respectively.
77+
78+
The Nextflow run option `-params-file` can be used to pass in a file containing parameters in either format.
79+
80+
```
$ nextflow run nextflow-io/rnaseq -params-file run_42.yaml
```
83+
84+
The YAML file could contain the following.
85+
86+
```
reads      : "s3://gatk-data/run_42/reads/*_R{1,2}_*.fastq.gz"
bwa_index  : "$baseDir/index/*.bwa-index.tar.gz"
paired_end : true
penalty    : 12
```
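Since `-params-file` accepts either format, the same parameters could also be supplied as JSON. The sketch below writes an equivalent file; the name `run_42.json` is an assumption made for this example, and as far as we know Nextflow picks the parser from the file extension.

```shell
# Write the same illustrative parameters as JSON instead of YAML.
cat > run_42.json <<'EOF'
{
    "reads": "s3://gatk-data/run_42/reads/*_R{1,2}_*.fastq.gz",
    "bwa_index": "$baseDir/index/*.bwa-index.tar.gz",
    "paired_end": true,
    "penalty": 12
}
EOF

# The file would then be passed in the same way:
#   nextflow run nextflow-io/rnaseq -params-file run_42.json
```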
92+
93+
#### 5: Specific workflow entry points
94+
95+
The recently released [DSL2](https://www.nextflow.io/blog/2020/dsl2-is-here.html) adds powerful modularity to Nextflow and enables scripts to contain multiple workflows. By default, the unnamed workflow is assumed to be the main entry point for the script. When a script contains several named workflows, however, the entry point can be customized with the `-entry` child-option of the run command.
96+
97+
    $ nextflow run main.nf -entry workflow1
98+
99+
This allows users to run a specific sub-workflow or a section of their entire workflow script. For more information, refer to the [implicit workflow](https://www.nextflow.io/docs/latest/dsl2.html#implicit-workflow) section of the documentation.
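To make this concrete, here is a minimal sketch of a `main.nf` with two named workflows and an unnamed one. The process names (`hello`, `goodbye`) and the second workflow (`workflow2`) are hypothetical and exist only for illustration.

```shell
# Write a hypothetical DSL2 script with two named workflows.
cat > main.nf <<'EOF'
nextflow.enable.dsl=2

process hello {
    output:
    stdout

    script:
    """
    echo 'Hello from workflow1'
    """
}

process goodbye {
    output:
    stdout

    script:
    """
    echo 'Goodbye from workflow2'
    """
}

workflow workflow1 {
    hello()
    hello.out.view()
}

workflow workflow2 {
    goodbye()
    goodbye.out.view()
}

// Unnamed workflow: the default entry point, which runs both.
workflow {
    workflow1()
    workflow2()
}
EOF

# Run everything:        nextflow run main.nf
# Run only one workflow: nextflow run main.nf -entry workflow1
```

With `-entry workflow1`, only the `hello` process should run, while the plain invocation executes both workflows through the unnamed one.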
100+
101+
102+
#### Bonus trick: Web dashboard launched from the CLI
103+
104+
The tricks above highlight the functionality of the Nextflow CLI. However, for long-running workflows, monitoring becomes all the more crucial. With Nextflow Tower, we can invoke any Nextflow pipeline execution from the CLI and use the integrated dashboard to follow the workflow execution wherever we are. Sign in to [Tower](https://tower.nf) using your GitHub credentials, obtain your token from the Getting Started page and export it in your terminal, add the export to your `~/.bashrc`, or include the token in your `nextflow.config`.
105+
106+
```
$ export TOWER_ACCESS_TOKEN=my-secret-tower-key
$ export NXF_VER=20.07.1
```
110+
111+
Next, simply add the `-with-tower` child-option to any Nextflow run command. A URL for the monitoring dashboard will appear.
112+
113+
```
$ nextflow run nextflow-io/hello -with-tower
```
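For the `nextflow.config` route mentioned earlier, a sketch along the following lines should work, assuming the `tower` configuration scope with its `accessToken` and `enabled` settings. The token value is the same placeholder used above, not a real key.

```shell
# Append a Tower section to nextflow.config in the current directory.
# The token is a placeholder; avoid committing a real one to version control.
cat >> nextflow.config <<'EOF'
tower {
    accessToken = 'my-secret-tower-key'
    enabled     = true
}
EOF
```

With `enabled = true` in the configuration, runs launched from that directory should report to Tower without adding `-with-tower` each time.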
116+
117+
#### Nextflow Giveaway
118+
119+
If you want to look stylish while you put the above tips into practice, or simply like free stuff, we are giving away five of our latest Nextflow hoodie and sticker packs. Retweet or like the Nextflow tweet about this article and we will draw and notify the winners on October 31st!
120+
121+
122+
#### About the Author
123+
124+
[Abhinav Sharma](https://www.linkedin.com/in/abhi18av/) is a Bioinformatics Engineer at [Seqera Labs](https://www.seqera.io) interested in Data Science and Cloud Engineering. He enjoys working on all things Genomics, Bioinformatics and Nextflow.
125+
126+
127+
#### Acknowledgements
128+
129+
Shout out to [Kevin Sayers](https://github.com/KevinSayers) and [Alexander Peltzer](https://github.com/apeltzer) for their earlier efforts in documenting the CLI, which inspired this work.
130+
131+
132+
*The latest CLI docs can be found in the edge release docs at https://www.nextflow.io/docs/edge/cli.html.*