This workshop provides an introduction to bioinformatics workflow managers, with a focus on Nextflow. Participants will gain an understanding of workflow management concepts and learn how to use Nextflow to develop scalable, reproducible, and portable bioinformatics pipelines. The session will include an overview of key features, practical examples, and guidance on best practices for implementing workflows in a research context. The workshop is designed for those interested in improving the efficiency and reproducibility of their computational biology analyses.
We assume some minimal exposure to GitHub and Nextflow. To get the most out of the workshop, we recommend that participants review the following training material beforehand:
- Software Carpentry's Git training
- Seqera's "Hello Nextflow" training
Everything can be run locally on your laptop or on an HPC cluster, whichever you prefer. You will need Nextflow, nf-core tools, and a container engine such as Docker or Singularity installed.
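Before the workshop, it can help to confirm that the required tools are on your PATH. The snippet below is a minimal sketch rather than an official setup script: `check_tool` is a hypothetical helper name, and it only tests that the usual executables (`git`, `nextflow`, `nf-core`, and `docker` or `singularity`) can be found, not that they are correctly configured.

```shell
#!/usr/bin/env bash
# check_tool is a hypothetical helper: it reports whether a command
# can be found on PATH, without running the command itself.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

check_tool git
check_tool nextflow
check_tool nf-core

# Only one container engine is needed, so accept either of the two
# engines named in the requirements.
if command -v docker >/dev/null 2>&1 || command -v singularity >/dev/null 2>&1; then
    echo "found:   container engine"
else
    echo "missing: container engine (docker or singularity)"
fi
```

Anything reported as missing needs to be installed before you start the exercises.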
In this workshop you will learn:
- How to compose a bioinformatics workflow in Nextflow using a combination of writing your own modules and utilising the incredible nf-core modules resource.
- How to manipulate channels to move data through your workflow like a pro.
- How to implement logic switches in your pipeline.
- How to implement unit tests with nf-test.
- When writing a workflow is a good idea, and how layers of additional complexity can make your workflow more flexible and robust.
Everything you need is in the docs/Lecture.html slides on the main branch. The lecture walks you through a series of tasks to build a toy pipeline. The solutions to the tasks are in sequentially numbered branches.
The best way to make the most of the workshop is to fork this repo and clone it locally. This way, you can follow along and code on your own copy.
- Fork the Repository:
  - Navigate to the repo on GitHub and click the Fork button in the top-right corner to create a copy in your GitHub account. Make sure to fork all branches, not just main, because the exercise solutions live on the numbered branches.
- Clone the Repository Locally:
  - Go to your forked repo and click the Code button.
  - Copy the URL under "Clone with HTTPS".
  - In your terminal, run:
    git clone https://github.com/YOUR_USERNAME/nextflow_in_action.git
  - Replace YOUR_USERNAME with your GitHub username.
- Navigate to the Repo:
  - Once cloned, go into the repo directory:
    cd nextflow_in_action
- Switch branches as needed:
  - To see what branch you're on:
    git status
  - To switch to a specific branch:
    git checkout 1
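After cloning, it is worth checking that the numbered solution branches came across with your fork. The sketch below assumes the remote is called `origin` (git's default name after `git clone`). `current_branch` and `remote_branches` are hypothetical helper names wrapping standard `git branch` options, which need a reasonably recent git (2.22 or later for `--show-current`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of the branch currently checked out.
current_branch() {
    git branch --show-current
}

# Hypothetical helper: print the remote-tracking branches, one per line
# (for example origin/1, origin/2, origin/main).
remote_branches() {
    git branch --remotes --format='%(refname:short)'
}

# Only query git when run from inside a repository, such as your clone.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    echo "On branch: $(current_branch)"
    remote_branches
else
    echo "Not inside a git repository; cd into nextflow_in_action first."
fi
```

If the list shows only the main branch, the fork was probably created without the other branches and the numbered solutions will not be available to check out.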
The materials in this repo are great for independent study. The lecture slides are on the main branch and outline exercises 1-7. Each corresponding branch (1-7) contains the solution code for that exercise.
It's best to attempt the exercises first, then refer to the solutions if needed. For extra help, you can also consult the Nextflow and nf-core documentation and the Nextflow Training material.
Start with Branch 1, which contains the skeleton for a mini-pipeline, and progress through the tasks as they gradually increase in complexity.
We'd love to hear if you implement our workshop and how it went! Drop us an email to tell us about it.
If you notice any mistakes or would like to contribute, please feel free to open an issue.