|
1 | 1 | # Part 2: Hello Containers |
2 | 2 |
|
3 | | -NOTE: THIS IS A PLACEHOLDER FOR MATERIAL THAT IS COMING SOON |
4 | | - |
5 | 3 | In Part 1, you learned how to use the basic building blocks of Nextflow to assemble a simple pipeline capable of processing some text and parallelizing execution if there were multiple inputs. |
6 | 4 |
|
7 | | -However, you were limited to utilizing only basic UNIX tools that were already installed in your computing environment. Real-world work typically requires you to use all sorts of tools and packages that don't come standard. Usually you'd have to figure out how to install all of the necessary tools and their software dependencies, as well as manage any conflicts that may arise between dependencies that aren't compatible with each other. |
| 5 | +However, you were limited to utilizing only basic UNIX tools that were already installed in your computing environment. |
| 6 | +Real-world work typically requires you to use all sorts of tools and packages that don't come standard. |
| 7 | +Usually you'd have to figure out how to install all of the necessary tools and their software dependencies, as well as manage any conflicts that may arise between dependencies that aren't compatible with each other. |
8 | 8 |
|
9 | 9 | That is all very tedious and annoying, so we're going to show you how to use **containers** to solve this problem much more conveniently. |
10 | 10 |
|
11 | | -TODO [shortest possible summary of what are containers] |
12 | | - |
13 | 11 | --- |
14 | 12 |
|
15 | 13 | ## 1. Use a container directly [basics] |
16 | 14 |
|
17 | | -TODO |
| 15 | +A **container** is a lightweight, standalone, executable unit of software created from a container **image** that includes everything needed to run an application: code, system libraries, and settings. |
| 16 | +To use a container you usually download or "pull" a container image from a container registry, and then run the container image to create a container instance. |
| 17 | + |
| 18 | +### 1.1. Pull the container image |
| 19 | + |
| 20 | +```bash |
| 21 | +docker pull rancher/cowsay |
| 22 | +``` |
18 | 23 |
|
19 | | -### 1.1. Pull the container |
| 24 | +### 1.2. Use the container to execute a single command |
20 | 25 |
|
21 | | -TODO |
| 26 | +The `docker run` command is used to spin up a container instance from a container image and execute a command in it. |
| 27 | +The `--rm` flag tells Docker to remove the container instance after the command has completed. |
22 | 28 |
|
23 | 29 | ```bash |
24 | | -docker pull <container> |
| 30 | +docker run --rm rancher/cowsay "Hello World" |
| 31 | +``` |
| 32 | + |
| 33 | +```console title="Output" |
| 34 | + _____________ |
| 35 | +< Hello World > |
| 36 | + ------------- |
| 37 | + \ ^__^ |
| 38 | + \ (oo)\_______ |
| 39 | + (__)\ )\/\ |
| 40 | + ||----w | |
| 41 | + || || |
25 | 42 | ``` |
26 | 43 |
|
27 | 44 | ### 1.2. Spin up the container interactively |
28 | 45 |
|
29 | | -TODO |
| 46 | +You can also run a container interactively, which will give you a shell prompt inside the container. |
| 47 | +The `--entrypoint` flag allows you to specify the command that should be run when the container starts up. |
30 | 48 |
|
31 | 49 | ```bash |
32 | | -docker run -it -v ./data:/data <container> |
| 50 | +docker run --rm -it --entrypoint /bin/sh rancher/cowsay |
33 | 51 | ``` |
34 | 52 |
|
| 53 | +Notice that the prompt has changed to `/ #`, which indicates that you are now inside the container. |
| 54 | +Run `ls` to list the contents of the container's root directory: |
| 55 | + |
| 56 | +```console title="Output" |
| 57 | +/ # ls |
| 58 | +bin dev etc home lib media mnt opt proc root run sbin srv sys tmp usr var |
| 59 | +``` |
| 60 | + |
| 61 | +You can see that the filesystem inside the container is different from the filesystem on your host system. |
| 62 | + |
35 | 63 | ### 1.3. Run the command |
36 | 64 |
|
37 | | -TODO |
| 65 | +Now that you are inside the container, you can run the `cowsay` command directly. |
38 | 66 |
|
39 | 67 | ```bash |
40 | | -<example command>> |
| 68 | +cowsay "Hello World" |
| 69 | +``` |
| 70 | + |
| 71 | +Output: |
| 72 | + |
| 73 | +```console title="Output" |
| 74 | + _____________ |
| 75 | +< Hello World > |
| 76 | + ------------- |
| 77 | + \ ^__^ |
| 78 | + \ (oo)\_______ |
| 79 | + (__)\ )\/\ |
| 80 | + ||----w | |
| 81 | + || || |
41 | 82 | ``` |
42 | 83 |
|
43 | 84 | This should complete immediately, and you should now see a file called `<file>` in your working directory. |
44 | 85 |
|
45 | | -#### 1.4. Exit the container |
| 86 | +### 1.4. Exit the container |
| 87 | + |
| 88 | +To exit the container, you can type `exit` at the prompt. |
46 | 89 |
|
47 | 90 | ```bash |
48 | 91 | exit |
49 | 92 | ``` |
50 | 93 |
|
| 94 | +Your prompt should now be back to what it was before you started the container. |
| 95 | + |
| 96 | +### Takeaway |
| 97 | + |
| 98 | +You know how to pull a container and run it interactively, and you know how to use that to try out commands without having to install any software on your system. |
| 99 | + |
| 100 | +### What's next? |
| 101 | + |
| 102 | +Learn how to make data from your host system available to a container. |
| 103 | + |
| 104 | +## 2. Mounting data into containers |
| 105 | + |
| 106 | +When you run a container, it is isolated from the host system by default. |
| 107 | +This means that the container can't access any files on the host system unless you explicitly tell it to. |
| 108 | +One way to do this is to **mount** a **volume** from the host system into the container. |
| 109 | + |
| 110 | +Prior to working on the next section, confirm that you are in the `hello-nextflow` directory. |
| 111 | + |
| 112 | +```bash |
| 113 | +cd /workspace/gitpod/nf-training/hello-nextflow |
| 114 | +``` |
| 115 | + |
| 116 | +### 2.1. Launch the container interactively with a mounted volume |
| 117 | + |
| 118 | +```bash |
| 119 | +docker run --rm -it --entrypoint /bin/sh -v $(pwd)/data:/data rancher/cowsay |
| 120 | +``` |
| 121 | + |
| 122 | +This command mounts the `data` directory in the current working directory on the host system into the `/data` directory inside the container. |
| 123 | +We can explore the contents of the `/data` directory inside the container: |
| 124 | + |
| 125 | +```console title="Output" |
| 126 | +/ # ls |
| 127 | +bin data dev etc home lib media mnt opt proc root run sbin srv sys tmp usr var |
| 128 | +/ # ls data |
| 129 | +bam greetings.csv ref sample_bams.txt samplesheet.csv |
| 130 | +``` |
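The host side of `-v` is written as an absolute path because Docker interprets a bare name such as `data` as a named volume rather than a directory on the host. As a minimal sketch (the helper name `mount_spec` is hypothetical and not part of Docker), a shell function can build the `host:container` string from a relative directory:

```bash
# Hypothetical helper: build the HOST:CONTAINER string expected by `docker run -v`.
mount_spec() {
    host_dir="$1"
    container_dir="$2"
    # Make relative host paths absolute so Docker treats them as bind mounts
    case "$host_dir" in
        /*) ;;
        *) host_dir="$(pwd)/$host_dir" ;;
    esac
    printf '%s:%s\n' "$host_dir" "$container_dir"
}

# Print the mount string for the `data` directory in the current directory
mount_spec data /data
```

The result could then be passed as `-v "$(mount_spec data /data)"`; the quotes keep host paths that contain spaces intact.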
| 131 | + |
| 132 | +### 2.2. Use the mounted data |
| 133 | + |
| 134 | +Now that we have mounted the `data` directory into the container, we can use the `cowsay` command to display the contents of the `greetings.csv` file. |
| 135 | + |
| 136 | +```bash |
| 137 | +cat data/greetings.csv | cowsay |
| 138 | +``` |
| 139 | + |
| 140 | +Output: |
| 141 | + |
| 142 | +```console title="Output" |
| 143 | + _____________________ |
| 144 | +< Hello,Bonjour,Holà > |
| 145 | + --------------------- |
| 146 | + \ ^__^ |
| 147 | + \ (oo)\_______ |
| 148 | + (__)\ )\/\ |
| 149 | + ||----w | |
| 150 | + || || |
| 151 | +/ # exit |
| 152 | +``` |
| 153 | + |
51 | 154 | ### Takeaway |
52 | 155 |
|
53 | 156 | You know how to pull a container and run it interactively, and you know how to use that to try out commands without having to install any software on your system. |
54 | 157 |
|
55 | 158 | ### What's next? |
56 | 159 |
|
57 | | -Learn how to use containers from within a workflow. |
| 160 | +Learn how to get a container image for any pip/conda-installable tool. |
58 | 161 |
|
59 | 162 | --- |
60 | 163 |
|
61 | | -## 2. Use the container in a workflow |
| 164 | +## 2. Get a container image for a pip/conda-installable tool |
| 165 | + |
| 166 | +Some software developers provide container images for their software that are available on container registries like Docker Hub, but many do not. |
| 167 | +New Docker images can be created by writing a `Dockerfile` that specifies how to build the image, but this can be a lot of work. |
| 168 | +Instead, we can use the Seqera Containers web service to create a container image for us. |
62 | 169 |
|
63 | | -TODO |
| 170 | +### 2.1. Navigate to Seqera Containers |
| 171 | + |
| 172 | +Navigate to the [Seqera Containers web service](https://www.seqera.io/containers/) and search for the `quote` pip package. |
| 173 | + |
| 174 | + |
| 175 | + |
| 176 | +### 2.2. Request a container image |
| 177 | + |
| 178 | +Click on "+Add" and then "Get Container" to request a container image for the `quote` pip package. |
| 179 | + |
| 180 | + |
| 181 | + |
| 182 | +If this is the first time a community container has been built for this package, it may take a few minutes to complete. |
| 183 | +Click to copy the URI (e.g. `community.wave.seqera.io/library/pip_quote:25b3982790125217`) of the container image that was created for you. |
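The URI has the usual `name:tag` shape of a container image reference. The sketch below pulls the two parts apart with shell parameter expansion. The function names are made up for illustration, and it assumes the reference carries a tag and no `@sha256:` digest:

```bash
# Hypothetical helpers for splitting an image reference of the form NAME:TAG.
image_name() { printf '%s\n' "${1%:*}"; }   # everything before the last colon
image_tag()  { printf '%s\n' "${1##*:}"; }  # everything after the last colon

ref="community.wave.seqera.io/library/pip_quote:25b3982790125217"
image_name "$ref"
image_tag "$ref"
```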
| 184 | + |
| 185 | +### 2.3. Use the container image |
| 186 | + |
| 187 | +You can now use the container image to run the `quote` command and get a random saying from Grace Hopper. |
| 188 | + |
| 189 | +```bash |
| 190 | +docker run --rm community.wave.seqera.io/library/pip_quote:25b3982790125217 quote "Grace Hopper" |
| 191 | +``` |
| 192 | + |
| 193 | +Output: |
| 194 | + |
| 195 | +```console title="Output" |
| 196 | +Humans are allergic to change. They love to say, 'We've always done it |
| 197 | +this way.' I try to fight that. That's why I have a clock on my wall |
| 198 | +that runs counter-clockwise. |
| 199 | +``` |
64 | 200 |
|
65 | | -### 2.1. [step 1] |
| 201 | +### 2.4. STRETCH GOAL: Build the container image yourself |
66 | 202 |
|
67 | | -TODO |
| 203 | +Go back to the Seqera Containers website and click on the "Build Details" button. |
| 204 | +There you can see the details of how our container image was built. |
| 205 | +You can view the conda environment file and the `Dockerfile`, which together make up the recipe for the container image. |
68 | 206 |
|
69 | | -### 2.2. [step 2] |
| 207 | +```yaml title="conda.yml" |
| 208 | +channels: |
| 209 | +- conda-forge |
| 210 | +- bioconda |
| 211 | +dependencies: |
| 212 | +- pip |
| 213 | +- pip: |
| 214 | + - quote==2.0.4 |
| 215 | +``` |
| 216 | + |
| 217 | +```Dockerfile |
| 218 | +FROM mambaorg/micromamba:1.5.10-noble |
| 219 | +COPY --chown=$MAMBA_USER:$MAMBA_USER conda.yml /tmp/conda.yml |
| 220 | +RUN micromamba install -y -n base -f /tmp/conda.yml \ |
| 221 | + && micromamba install -y -n base conda-forge::procps-ng \ |
| 222 | + && micromamba env export --name base --explicit > environment.lock \ |
| 223 | + && echo ">> CONDA_LOCK_START" \ |
| 224 | + && cat environment.lock \ |
| 225 | + && echo "<< CONDA_LOCK_END" \ |
| 226 | + && micromamba clean -a -y |
| 227 | +USER root |
| 228 | +ENV PATH="$MAMBA_ROOT_PREFIX/bin:$PATH" |
| 229 | +``` |
| 230 | + |
| 231 | +Copy the contents of these files into the stubs located in the `containers/build` directory, then run the following command to build the container image yourself. |
| 232 | + |
| 233 | +```bash |
| 234 | +docker build -t quote:latest containers/build |
| 235 | +``` |
| 236 | + |
| 237 | +After it has finished building, you can run the container image you just built. |
| 238 | + |
| 239 | +```bash |
| 240 | +docker run --rm quote:latest quote "Margaret Oakley Dayhoff" |
| 241 | +``` |
| 242 | + |
| 243 | +!Hint: Even if Seqera Containers doesn't manage to successfully build a container image for you, the `Dockerfile` and `conda.yml` are a great starting point for a manual build. |
| 244 | +It often only takes a few additional `RUN` commands added to the Dockerfile to add the missing dependencies or system libraries. |
| 245 | + |
| 246 | +### Takeaway |
| 247 | + |
| 248 | +You know how to find or build a container image for any pip/conda-installable tool using Seqera Containers. |
| 249 | + |
| 250 | +### What's next? |
| 251 | + |
| 252 | +Learn how to use containers in Nextflow. |
| 253 | + |
| 254 | +--- |
| 255 | + |
| 256 | +## 3. Use containers in Nextflow |
| 257 | + |
| 258 | +Nextflow has built-in support for running processes inside containers. |
| 259 | +This means that you can use any container image you like to run your processes, and Nextflow will take care of pulling the image, mounting the data, and running the process inside it. |
| 260 | + |
| 261 | +### 3.1. Add a container directive to your process |
| 262 | + |
| 263 | +Edit the `hello_containers.nf` script to add a `container` directive to the `cowsay` process. |
| 264 | + |
| 265 | +_Before:_ |
| 266 | + |
| 267 | +```groovy title="hello-containers.nf" |
| 268 | +process cowSay { |
| 269 | +
|
| 270 | + publishDir 'containers/results', mode: 'copy' |
| 271 | +``` |
| 272 | + |
| 273 | +_After:_ |
| 274 | + |
| 275 | +```groovy title="hello-containers.nf" |
| 276 | +process cowSay { |
| 277 | +
|
| 278 | + publishDir 'containers/results', mode: 'copy' |
| 279 | + container 'community.wave.seqera.io/library/pip_cowsay:131d6a1b707a8e65' |
| 280 | +``` |
| 281 | + |
| 282 | +### 3.2. Run Nextflow pipelines using containers |
| 283 | + |
| 284 | +Run the script to see the container in action. |
| 285 | + |
| 286 | +```bash |
| 287 | +nextflow run hello_containers.nf |
| 288 | +``` |
| 289 | + |
| 290 | +!NOTE |
| 291 | +The `nextflow.config` in our current working directory contains `docker.enabled = true`, which tells Nextflow to use Docker to run processes. |
| 292 | +Without that configuration we would have to specify the `-with-docker` flag when running the script. |
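
To confirm which of the two cases applies, you could inspect the config file before launching. The function below is only a textual check under simplifying assumptions: the name `docker_enabled` is hypothetical, and it recognizes only the flat `docker.enabled = true` form, not a `docker { enabled = true }` block.

```bash
# Hypothetical check: succeed if the config file contains an uncommented
# `docker.enabled = true` line (flat dotted form only).
docker_enabled() {
    grep -Eq '^[[:space:]]*docker\.enabled[[:space:]]*=[[:space:]]*true' "$1"
}

if [ -f nextflow.config ] && docker_enabled nextflow.config; then
    echo "Docker is enabled in nextflow.config"
else
    echo "Docker not enabled here; add -with-docker to the nextflow run command"
fi
```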
| 293 | + |
| 294 | +### 3.3. Check the results |
| 295 | + |
| 296 | +You should see a new directory called `containers/results` that contains the output of the `cowsay` process. |
| 297 | + |
| 298 | +```console title="containers/results/cowsay-output-Bonjour.txt" |
| 299 | + _______ |
| 300 | +| Bonjour | |
| 301 | + ======= |
| 302 | + \ |
| 303 | + \ |
| 304 | + ^__^ |
| 305 | + (oo)\_______ |
| 306 | + (__)\ )\/\ |
| 307 | + ||----w | |
| 308 | + || || |
| 309 | +``` |
| 310 | + |
| 311 | +### Takeaway |
| 312 | + |
| 313 | +You know how to use containers in Nextflow to run processes. |
| 314 | + |
| 315 | +### What's next? |
| 316 | + |
| 317 | +An optional exercise to fetch quotes on computer/biology pioneers using the `quote` container and output them using the `cowsay` container. |
| 318 | + |
| 319 | +## 4. OPTIONAL EXERCISE: Connect the `quote` container with the `cowsay` container |
| 320 | + |
| 321 | +As an optional exercise, you can add a locally-built or Seqera Containers-built `quote` container to a getQuote process in the `hello_containers.nf` script and connect the output to the `cowsay` container. |
| 322 | + |
| 323 | +### 4.1. Modify the `hello_containers.nf` script to use a getQuote process |
| 324 | + |
| 325 | +We have a list of computer and biology pioneers in the `containers/data/pioneers.csv` file. |
| 326 | +At a high level, to complete this exercise you will need to: |
| 327 | + |
| 328 | +- Modify the `params.input_file` to point to the `pioneers.csv` file. |
| 329 | +- Create a `getQuote` process that uses the `quote` container to fetch a quote for each input. |
| 330 | +- Connect the output of the `getQuote` process to the `cowsay` process to display the quote. |
| 331 | + |
| 332 | +!!! Hint |
| 333 | + |
| 334 | + A good choice for the `script` block of your getQuote process might be: |
| 335 | + ```groovy |
| 336 | + script: |
| 337 | + def safe_author = author.tokenize(' ').join('-') |
| 338 | + """ |
| 339 | + quote "$author" > quote-${safe_author}.txt |
| 340 | + """ |
| 341 | + ``` |
| 342 | + |
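For intuition, the `tokenize(' ').join('-')` step in the hint turns an author name into something safe to use in a file name. A rough shell analogue, not what the pipeline itself runs, is sketched below. `safe_name` is a hypothetical helper, and unlike the Groovy version it would also collapse runs of hyphens already present in the input and keep a leading hyphen for names that start with a space.

```bash
# Hypothetical helper: replace each run of spaces with a single hyphen.
safe_name() {
    printf '%s\n' "$1" | tr -s ' ' '-'
}

safe_name "Margaret Oakley Dayhoff"   # prints Margaret-Oakley-Dayhoff
```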
| 343 | +You can find a solution to this exercise in `containers/scripts/hello-containers-4.1.nf`. |
| 344 | + |
| 345 | +### 4.2. Modify your Nextflow pipeline to allow it to execute in `quote` and `sayHello` modes |
| 346 | + |
| 347 | +Add some branching logic to your pipeline to allow it to accept inputs intended for both `quote` and `sayHello`. |
| 348 | +Here's an example of how to use an `if` statement in a Nextflow workflow: |
| 349 | + |
| 350 | +```groovy title="hello-containers.nf" |
| 351 | +workflow { |
| 352 | + if (params.quote) { |
| 353 | + ... |
| 354 | + } |
| 355 | + else { |
| 356 | + ... |
| 357 | + } |
| 358 | + cowSay(text_ch) |
| 359 | +} |
| 360 | +``` |
70 | 361 |
|
71 | | -TODO |
| 362 | +!!! Hint |
72 | 363 |
|
73 | | -### 2.3. [step 3] |
| 364 | + You can use `new_ch = processName.out` to assign a name to the output channel of a process. |
74 | 365 |
|
75 | | -TODO |
| 366 | +You can find a solution to this exercise in `containers/scripts/hello-containers-4.2.nf`. |
76 | 367 |
|
77 | 368 | ### Takeaway |
78 | 369 |
|
79 | | -You know how to use containers from within a workflow. |
| 370 | +You know how to use containers in Nextflow to run processes, and how to build some branching logic into your pipelines! |
80 | 371 |
|
81 | 372 | ### What's next? |
82 | 373 |
|
|