Commit e2a177b

✨ feat: add hello-containers training module
1 parent 78bc4f3 commit e2a177b

13 files changed

+621
-24
lines changed

docs/hello_nextflow/03_hello_containers.md

Lines changed: 315 additions & 24 deletions
@@ -1,82 +1,373 @@
11
# Part 2: Hello Containers
22

3-
NOTE: THIS IS A PLACEHOLDER FOR MATERIAL THAT IS COMING SOON
4-
53
In Part 1, you learned how to use the basic building blocks of Nextflow to assemble a simple pipeline capable of processing some text and parallelizing execution if there were multiple inputs.
64

7-
However, you were limited to utilizing only basic UNIX tools that were already installed in your computing environment. Real-world work typically requires you to use all sorts of tools and packages that don't come standard. Usually you'd have to figure out how to install all of the necessary tools and their software dependencies, as well as manage any conflicts that may arise between dependencies that aren't compatible with each other.
5+
However, you were limited to utilizing only basic UNIX tools that were already installed in your computing environment.
6+
Real-world work typically requires you to use all sorts of tools and packages that don't come standard.
7+
Usually you'd have to figure out how to install all of the necessary tools and their software dependencies, as well as manage any conflicts that may arise between dependencies that aren't compatible with each other.
88

99
That is all very tedious and annoying, so we're going to show you how to use **containers** to solve this problem much more conveniently.
1010

11-
TODO [shortest possible summary of what are containers]
12-
1311
---
1412

1513
## 1. Use a container directly [basics]
1614

17-
TODO
15+
A **container** is a lightweight, standalone, executable unit of software created from a container **image** that includes everything needed to run an application, including code, system libraries, and settings.
16+
To use a container you usually download or "pull" a container image from a container registry, and then run the container image to create a container instance.
17+
18+
### 1.1. Pull the container image
19+
20+
```bash
21+
docker pull rancher/cowsay
22+
```
1823

19-
### 1.1. Pull the container
24+
### 1.2. Use the container to execute a single command
2025

21-
TODO
26+
The `docker run` command is used to spin up a container instance from a container image and execute a command in it.
27+
The `--rm` flag tells Docker to remove the container instance after the command has completed.
2228

2329
```bash
24-
docker pull <container>
30+
docker run --rm rancher/cowsay "Hello World"
31+
```
32+
33+
```console title="Output"
34+
_____________
35+
< Hello World >
36+
-------------
37+
\ ^__^
38+
\ (oo)\_______
39+
(__)\ )\/\
40+
||----w |
41+
|| ||
2542
```
2643

2744
### 1.2. Spin up the container interactively
2845

29-
TODO
46+
You can also run a container interactively, which will give you a shell prompt inside the container.
47+
The `--entrypoint` flag allows you to specify the command that should be run when the container starts up.
3048

3149
```bash
32-
docker run -it -v ./data:/data <container>
50+
docker run --rm -it --entrypoint /bin/sh rancher/cowsay
3351
```
3452

53+
Notice that the prompt has changed to `/ #`, which indicates that you are now inside the container.
54+
If you list the contents of the current directory with `ls`:
55+
56+
```console title="Output"
57+
/ # ls
58+
bin dev etc home lib media mnt opt proc root run sbin srv sys tmp usr var
59+
```
60+
61+
You can see that the filesystem inside the container is different from the filesystem on your host system.
62+
3563
### 1.3. Run the command
3664

37-
TODO
65+
Now that you are inside the container, you can run the `cowsay` command directly.
3866

3967
```bash
40-
<example command>>
68+
cowsay "Hello World"
69+
```
70+
71+
Output:
72+
73+
```console title="Output"
74+
_____________
75+
< Hello World >
76+
-------------
77+
\ ^__^
78+
\ (oo)\_______
79+
(__)\ )\/\
80+
||----w |
81+
|| ||
4182
```
4283

4384
This should complete immediately and print the output shown above to your terminal.
4485

45-
#### 1.4. Exit the container
86+
### 1.4. Exit the container
87+
88+
To exit the container, you can type `exit` at the prompt.
4689

4790
```bash
4891
exit
4992
```
5093

94+
Your prompt should now be back to what it was before you started the container.
95+
96+
### Takeaway
97+
98+
You know how to pull a container and run it interactively, and you know how to use that to try out commands without having to install any software on your system.
99+
100+
### What's next?
101+
102+
Learn how to make data from your host system available to a container.
103+
104+
## 2. Mounting data into containers
105+
106+
When you run a container, it is isolated from the host system by default.
107+
This means that the container can't access any files on the host system unless you explicitly tell it to.
108+
One way to do this is to **mount** a **volume** from the host system into the container.
109+
110+
Prior to working on the next section, confirm that you are in the `hello-nextflow` directory.
111+
112+
```bash
113+
cd /workspace/gitpod/nf-training/hello-nextflow
114+
```
115+
116+
### 2.1. Launch the container interactively with a mounted volume
117+
118+
```bash
119+
docker run --rm -it --entrypoint /bin/sh -v "$(pwd)/data:/data" rancher/cowsay
120+
```
121+
122+
This command mounts the `data` directory in the current working directory on the host system into the `/data` directory inside the container.
123+
We can explore the contents of the `/data` directory inside the container:
124+
125+
```console title="Output"
126+
/ # ls
127+
bin data dev etc home lib media mnt opt proc root run sbin srv sys tmp usr var
128+
/ # ls data
129+
bam greetings.csv ref sample_bams.txt samplesheet.csv
130+
```
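The `-v` argument used above follows a `host_path:container_path` pattern. As a minimal sketch, the snippet below only assembles that mapping and prints the resulting command instead of running it, so you can check the paths first. The variable names `host_dir`, `container_dir` and `mount_spec` are illustrative and do not come from the training material.

```bash
# Build the host:container mapping that `docker run -v` expects.
host_dir="$(pwd)/data"      # directory on the host, as an absolute path
container_dir="/data"       # where that directory appears inside the container
mount_spec="${host_dir}:${container_dir}"

# Print the full command rather than running it, so the mapping can be inspected.
echo "docker run --rm -it --entrypoint /bin/sh -v \"${mount_spec}\" rancher/cowsay"
```

Quoting the mapping keeps the command working when the current directory contains spaces.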
131+
132+
### 2.2. Use the mounted data
133+
134+
Now that we have mounted the `data` directory into the container, we can use the `cowsay` command to display the contents of the `greetings.csv` file.
135+
136+
```bash
137+
cat data/greetings.csv | cowsay
138+
```
139+
140+
Output:
141+
142+
```console title="Output"
143+
_____________________
144+
< Hello,Bonjour,Holà >
145+
---------------------
146+
\ ^__^
147+
\ (oo)\_______
148+
(__)\ )\/\
149+
||----w |
150+
|| ||
151+
/ # exit
152+
```
153+
51154
### Takeaway
52155

53156
You know how to mount a directory from your host system into a container and use its contents from inside the container.
54157

55158
### What's next?
56159

57-
Learn how to use containers from within a workflow.
160+
Learn how to get a container image for any pip/conda-installable tool.
58161

59162
---
60163

61-
## 2. Use the container in a workflow
164+
## 2. Get a container image for a pip/conda-installable tool
165+
166+
Some software developers provide container images for their software that are available on container registries like Docker Hub, but many do not.
167+
New Docker images can be created by writing a `Dockerfile` that specifies how to build the image, but this can be a lot of work.
168+
Instead, we can use the Seqera Containers web service to create a container image for us.
62169

63-
TODO
170+
### 2.1. Navigate to Seqera Containers
171+
172+
Navigate to the [Seqera Containers web service](https://www.seqera.io/containers/) and search for the `quote` pip package.
173+
174+
![Seqera Containers](img/seqera-containers-1.png)
175+
176+
### 2.2. Request a container image
177+
178+
Click on "+Add" and then "Get Container" to request a container image for the `quote` pip package.
179+
180+
![Seqera Containers](img/seqera-containers-2.png)
181+
182+
If this is the first time a community container has been built for this package, it may take a few minutes to complete.
183+
Click to copy the URI (e.g. `community.wave.seqera.io/library/pip_quote:25b3982790125217`) of the container image that was created for you.
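The copied URI has two parts separated by the last colon: the image name (registry and repository path) and a tag. The sketch below splits the example URI from the text using shell parameter expansion; the variable names are illustrative, and your own URI will carry a different tag.

```bash
# Example URI taken from the text above; yours will have a different tag.
image="community.wave.seqera.io/library/pip_quote:25b3982790125217"

image_name="${image%:*}"   # everything before the last colon
image_tag="${image##*:}"   # everything after the last colon

echo "name: ${image_name}"
echo "tag:  ${image_tag}"
```

Keeping the full URI, tag included, is what lets you pull exactly the same image again later.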
184+
185+
### 2.3. Use the container image
186+
187+
You can now use the container image to run the `quote` command and get a random saying from Grace Hopper.
188+
189+
```bash
190+
docker run --rm community.wave.seqera.io/library/pip_quote:25b3982790125217 quote "Grace Hopper"
191+
```
192+
193+
Output:
194+
195+
```console title="Output"
196+
Humans are allergic to change. They love to say, 'We've always done it
197+
this way.' I try to fight that. That's why I have a clock on my wall
198+
that runs counter-clockwise.
199+
```
64200

65-
### 2.1. [step 1]
201+
### 2.4. STRETCH GOAL: Build the container image yourself
66202

67-
TODO
203+
Go back to the Seqera Containers website and click on the "Build Details" button.
204+
There you can see the details of how our container image was built.
205+
You can view the conda environment file and the `Dockerfile`, which together make up the recipe for the container image.
68206

69-
### 2.2. [step 2]
207+
```conda.yml
208+
channels:
209+
- conda-forge
210+
- bioconda
211+
dependencies:
212+
- pip
213+
- pip:
214+
    - quote==2.0.4
215+
```
216+
217+
```Dockerfile
218+
FROM mambaorg/micromamba:1.5.10-noble
219+
COPY --chown=$MAMBA_USER:$MAMBA_USER conda.yml /tmp/conda.yml
220+
RUN micromamba install -y -n base -f /tmp/conda.yml \
221+
&& micromamba install -y -n base conda-forge::procps-ng \
222+
&& micromamba env export --name base --explicit > environment.lock \
223+
&& echo ">> CONDA_LOCK_START" \
224+
&& cat environment.lock \
225+
&& echo "<< CONDA_LOCK_END" \
226+
&& micromamba clean -a -y
227+
USER root
228+
ENV PATH="$MAMBA_ROOT_PREFIX/bin:$PATH"
229+
```
230+
231+
Copy the contents of these files into the stubs located in the `containers/build` directory, then run the following command to build the container image yourself.
232+
233+
```bash
234+
docker build -t quote:latest containers/build
235+
```
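The build reads both files from `containers/build`, so a stub that is still empty is a likely cause of a failed build. As a rough sketch, the helper below checks that each file contains at least one non-whitespace character. `check_build_context` is a hypothetical helper written for this illustration; it is not part of Docker or of the training material.

```bash
# check_build_context is a hypothetical helper, not a Docker command.
check_build_context() {
    build_dir="$1"
    for f in Dockerfile conda.yml; do
        # grep -q '[^[:space:]]' succeeds only if the file has a non-whitespace character.
        if ! grep -q '[^[:space:]]' "${build_dir}/${f}" 2>/dev/null; then
            echo "missing or empty: ${build_dir}/${f}"
            return 1
        fi
    done
    echo "build context looks ready: ${build_dir}"
}

# Report on the directory used in this section; "|| true" keeps the shell going either way.
check_build_context containers/build || true
```

If the check reports a missing or empty file, paste the contents from the "Build Details" page into that stub before building.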
236+
237+
After it has finished building, you can run the container image you just built.
238+
239+
```bash
240+
docker run --rm quote:latest quote "Margaret Oakley Dayhoff"
241+
```
242+
243+
!Hint: Even if Seqera Containers doesn't manage to successfully build a container image for you, the `Dockerfile` and `conda.yml` are a great starting point for a manual build.
244+
It often only takes a few additional `RUN` commands added to the Dockerfile to add the missing dependencies or system libraries.
245+
246+
### Takeaway
247+
248+
You know how to find or build a container image for any pip/conda-installable tool using Seqera Containers.
249+
250+
### What's next?
251+
252+
Learn how to use containers in Nextflow.
253+
254+
---
255+
256+
## 3. Use containers in Nextflow
257+
258+
Nextflow has built-in support for running processes inside containers.
259+
This means that you can use any container image you like to run your processes, and Nextflow will take care of pulling the image, mounting the data, and running the process inside it.
260+
261+
### 3.1. Add a container directive to your process
262+
263+
Edit the `hello_containers.nf` script to add a `container` directive to the `cowsay` process.
264+
265+
_Before:_
266+
267+
```groovy title="hello_containers.nf"
268+
process cowSay {
269+
270+
publishDir 'containers/results', mode: 'copy'
271+
```
272+
273+
_After:_
274+
275+
```groovy title="hello_containers.nf"
276+
process cowSay {
277+
278+
publishDir 'containers/results', mode: 'copy'
279+
container 'community.wave.seqera.io/library/pip_cowsay:131d6a1b707a8e65'
280+
```
281+
282+
### 3.2. Run Nextflow pipelines using containers
283+
284+
Run the script to see the container in action.
285+
286+
```bash
287+
nextflow run hello_containers.nf
288+
```
289+
290+
!NOTE
291+
The `nextflow.config` in our current working directory contains `docker.enabled = true`, which tells Nextflow to use Docker to run processes.
292+
Without that configuration we would have to specify the `-with-docker` flag when running the script.
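To make the note concrete, the sketch below writes a one-line `nextflow.config` containing the setting quoted above. It uses a scratch directory (the `demo_dir` variable is illustrative) so that the real `nextflow.config` in your working directory is left untouched.

```bash
# Work in a scratch directory so an existing nextflow.config is not overwritten.
demo_dir="$(mktemp -d)"

# The single setting described in the note above.
cat > "${demo_dir}/nextflow.config" <<'EOF'
docker.enabled = true
EOF

# Show the file that was written.
cat "${demo_dir}/nextflow.config"
```

Without such a file, the equivalent one-off form described in the note would be `nextflow run hello_containers.nf -with-docker`.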
293+
294+
### 3.3. Check the results
295+
296+
You should see a new directory called `containers/results` that contains the output of the `cowsay` process.
297+
298+
```console title="containers/results/cowsay-output-Bonjour.txt"
299+
_______
300+
| Bonjour |
301+
=======
302+
\
303+
\
304+
^__^
305+
(oo)\_______
306+
(__)\ )\/\
307+
||----w |
308+
|| ||
309+
```
310+
311+
### Takeaway
312+
313+
You know how to use containers in Nextflow to run processes.
314+
315+
### What's next?
316+
317+
An optional exercise: fetch quotes from computer and biology pioneers using the `quote` container and display them using the `cowsay` container.
318+
319+
## 4. OPTIONAL EXERCISE: Connect the `quote` container with the `cowsay` container
320+
321+
As an optional exercise, you can add a locally-built or Seqera Containers-built `quote` container to a getQuote process in the `hello_containers.nf` script and connect the output to the `cowsay` container.
322+
323+
### 4.1. Modify the `hello_containers.nf` script to use a getQuote process
324+
325+
We have a list of computer and biology pioneers in the `containers/data/pioneers.csv` file.
326+
At a high level, to complete this exercise you will need to:
327+
328+
- Modify `params.input_file` to point to the `pioneers.csv` file.
329+
- Create a `getQuote` process that uses the `quote` container to fetch a quote for each input.
330+
- Connect the output of the `getQuote` process to the `cowsay` process to display the quote.
331+
332+
!!! Hint
333+
334+
A good choice for the `script` block of your getQuote process might be:
335+
```groovy
336+
script:
337+
def safe_author = author.tokenize(' ').join('-')
338+
"""
339+
quote "$author" > quote-${safe_author}.txt
340+
"""
341+
```
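The `safe_author` line in the hint replaces the spaces in the author's name with hyphens so that the output filename contains no spaces. A rough shell approximation of the same idea is sketched below; it uses `tr` rather than Groovy's `tokenize`/`join`, so behavior can differ for names with leading or trailing spaces.

```bash
author="Margaret Oakley Dayhoff"

# Turn each run of spaces into a single hyphen.
safe_author="$(printf '%s' "$author" | tr -s ' ' '-')"

# The filename the process would write for this author.
echo "quote-${safe_author}.txt"
```

For this input, the resulting filename is `quote-Margaret-Oakley-Dayhoff.txt`.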
342+
343+
You can find a solution to this exercise in `containers/scripts/hello-containers-4.1.nf`.
344+
345+
### 4.2. Modify your Nextflow pipeline to allow it to execute in `quote` and `sayHello` modes
346+
347+
Add some branching logic to your pipeline to allow it to accept inputs intended for both `quote` and `sayHello`.
348+
Here's an example of how to use an `if` statement in a Nextflow workflow:
349+
350+
```groovy title="hello_containers.nf"
351+
workflow {
352+
if (params.quote) {
353+
...
354+
}
355+
else {
356+
...
357+
}
358+
cowSay(text_ch)
359+
}
360+
```
70361

71-
TODO
362+
!!! Hint
72363

73-
### 2.3. [step 3]
364+
You can use `new_ch = processName.out` to assign a name to the output channel of a process.
74365

75-
TODO
366+
You can find a solution to this exercise in `containers/scripts/hello-containers-4.2.nf`.
76367

77368
### Takeaway
78369

79-
You know how to use containers from within a workflow.
370+
You know how to use containers in Nextflow to run processes, and how to build some branching logic into your pipelines!
80371

81372
### What's next?
82373


hello-nextflow/containers/build/Dockerfile

Whitespace-only changes.

hello-nextflow/containers/build/conda.yml

Whitespace-only changes.
