* Major update to Nextflow for Genomics
This is highly experimental and may still change a lot before release
* Streamlined and updated
* Further edits for consistency and cleanup
* Added TODOs to improve the progression
* Main updates to per-sample variant calling lesson complete
* Updated Part 3 and solutions
* Updated the method overview page
* First pass pruning of the Sarek lesson
But it doesn't actually run on our own data
* Remove Sarek for now to avoid parser issue
Will bring it back later
* Update to satisfy #518
* Run pre-commit
---------
Co-authored-by: Phil Ewels <phil.ewels@seqera.io>
The training environment contains all the software, code and data necessary to work through this training course, so you don't need to install anything yourself.
-However, you do need a (free) account to log in, and you should take a few minutes to familiarize yourself with the interface.
+## Start a training environment

-If you have not yet done so, please follow [this link](../../../envsetup/) before going any further.
+To use the pre-built environment we provide on GitHub Codespaces, click the "Open in GitHub Codespaces" button below. For other options, see [Environment options](../../envsetup/index.md).

-## Materials provided
+We recommend opening the training environment in a new browser tab or window (use right-click, ctrl-click or cmd-click depending on your equipment) so that you can read on while the environment loads.
+You will need to keep these instructions open in parallel to work through the course.

-Throughout this training course, we'll be working in the `nf4-science/genomics/` directory, which you need to move into when you open the training workspace.
-This directory contains all the code files, test data and accessory files you will need.
+[](https://codespaces.new/nextflow-io/training?quickstart=1&ref=master)

-Feel free to explore the contents of this directory; the easiest way to do so is to use the file explorer on the left-hand side of the training workspace in the VSCode interface.
+### Environment basics
+
+This training environment contains all the software, code and data necessary to work through the training course, so you don't need to install anything yourself.
+
+The codespace is set up with a VSCode interface, which includes a filesystem explorer, a code editor and a terminal shell.
+All instructions given during the course (e.g. 'open the file', 'edit the code' or 'run this command') refer to those three parts of the VSCode interface unless otherwise specified.
+
+If you are working through this course by yourself, please acquaint yourself with the [environment basics](../../envsetup/01_setup.md) for further details.
+
+### Version requirements
+
+This training is designed for Nextflow 25.10.2 or later **with the v2 syntax parser ENABLED**.
+If you are using a local or custom environment, please make sure you are using the correct settings as documented [here](../../info/nxf_versions.md).
+
+## Get ready to work
+
+Once your codespace is running, there are two things you need to do before diving into the training: set your working directory for this specific course, and take a look at the materials provided.
+
+### Set the working directory
+
+By default, the codespace opens with the working directory set at the root of all training courses, but for this course, we'll be working in the `nf4-science/genomics/` directory.
+
+Change directory now by running this command in the terminal:
+
+```bash
+cd nf4-science/genomics/
+```
+
+You can set VSCode to focus on this directory, so that only the relevant files show in the file explorer sidebar:
+
+```bash
+code .
+```
+
+!!! tip
+
+    If for whatever reason you move out of this directory (e.g. your codespace goes to sleep), you can always use the full path to return to it, assuming you're running this within the GitHub Codespaces training environment:
+
+    ```bash
+    cd /workspaces/training/nf4-science/genomics
+    ```
+
+Now let's have a look at the contents.
+
+### Explore the materials provided
+
+You can explore the contents of this directory by using the file explorer on the left-hand side of the training workspace.
 Alternatively, you can use the `tree` command.
+
 Throughout the course, we use the output of `tree` to represent directory structure and contents in a readable form, sometimes with minor modifications for clarity.

 Here we generate a table of contents to the second level down:
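
Editorial aside on the version requirement added in this hunk (Nextflow 25.10.2 or later with the v2 syntax parser enabled): for a local or custom environment, a check along the following lines may help. This is a minimal sketch, not part of the commit. The helper `version_ge` is a hypothetical name, and the `NXF_SYNTAX_PARSER=v2` setting is an assumption about how the parser is enabled that should be confirmed against the linked `nxf_versions.md` page.

```bash
#!/usr/bin/env bash
# Sketch only: compares an installed Nextflow version against the course minimum.
MIN_VERSION="25.10.2"   # minimum stated in the course text

# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nextflow >/dev/null 2>&1; then
    # Assumption: the first X.Y.Z token printed by `nextflow -version` is the version.
    installed="$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
    if version_ge "$installed" "$MIN_VERSION"; then
        echo "Nextflow $installed meets the minimum ($MIN_VERSION)"
    else
        echo "Nextflow $installed is older than $MIN_VERSION"
    fi
else
    echo "nextflow not found on PATH"
fi

# Assumption: this environment variable enables the v2 syntax parser;
# confirm the exact setting in nxf_versions.md before relying on it.
export NXF_SYNTAX_PARSER=v2
```

In the pre-built Codespaces environment this check should not be needed, since the course text says everything is already set up there.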
@@ -20,53 +66,52 @@ Here we generate a table of contents to the second level down:
 tree . -L 2
 ```

-If you run this inside `nf4-science/genomics`, you should see the following output:
-
-```console title="Directory contents"
-
-.
-├── data
-│   ├── bam
-│   ├── ref
-│   ├── sample_bams.txt
-│   └── samplesheet.csv
-├── genomics-1.nf
-├── genomics-2.nf
-├── genomics-3.nf
-├── genomics-4.nf
-├── nextflow.config
-└── solutions
-    ├── modules
-    ├── nf-test.config
-    └── tests
-
-6 directories, 8 files
-
-```
+??? abstract "Directory contents"

-!!!note
+    ```console
+    .
+    ├── data
+    │   ├── bam
+    │   ├── ref
+    │   ├── sample_bams.txt
+    │   └── samplesheet.csv
+    ├── genomics.nf
+    ├── modules
+    │   ├── gatk_haplotypecaller.nf
+    │   └── samtools_index.nf
+    ├── nextflow.config
+    └── solutions
+        ├── modules
+        ├── nf-test.config
+        ├── part2
+        └── tests
+
+    8 directories, 8 files
+    ```

-    Don't worry if this seems like a lot; we'll go through the relevant pieces at each step of the course.
-    This is just meant to give you an overview.
+Click on the colored box to expand the section and view its contents.
+We use collapsible sections like this to display expected command output as well as directory and file contents in a concise way.

-**Here's a summary of what you should know to get started:**
+- **The `genomics.nf` file** is a workflow script that you'll build up over the course.

-- **The `.nf` files** are workflow scripts that are named based on what part of the course they're used in.
+- **The `modules` directory** contains skeleton module files that you'll fill in during the course.

 - **The file `nextflow.config`** is a configuration file that sets minimal environment properties.
   You can ignore it for now.

 - **The `data` directory** contains input data and related resources, described later in the course.

-- **The `solutions` directory** contains module files and test configurations that result from Parts 3 and 4 of the course.
+- **The `solutions` directory** contains completed module files and a Part 2 solution that can serve as a starting point for Part 3.
   They are intended to be used as a reference to check your work and troubleshoot any issues.

-!!!tip
+## Readiness checklist

-    If for whatever reason you move out of this directory, you can always run this command to return to it:
+Think you're ready to dive in?

-    ```bash
-    cd /workspaces/training/nf4-science/genomics
-    ```
+- [ ] I understand the goal of this course and its prerequisites
+- [ ] My environment is up and running
+- [ ] I've set my working directory appropriately
+
+If you can check all the boxes, you're good to go.

-Now, to begin the course, click on the arrow in the bottom right corner of this page.
+**To continue to [Part 1: Method overview and manual testing](./01_method.md), click on the arrow in the bottom right corner of this page.**
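
Editorial aside on the "I've set my working directory appropriately" item in the checklist above: one way to sanity-check it is to confirm that the top-level entries from the updated directory listing are present. This is a minimal sketch, not part of the commit. The function name `check_course_dir` is hypothetical, and the list of expected entries is taken from the new `tree` output in this diff, so it may drift if the course materials change.

```bash
#!/usr/bin/env bash
# Sketch only: reports which expected top-level course entries are missing.

# Succeeds if every expected entry exists in the given directory
# (defaults to the current directory); otherwise lists what is missing.
check_course_dir() {
    dir="${1:-.}"
    missing=0
    # Entries taken from the updated directory listing in this diff.
    for entry in genomics.nf nextflow.config data modules solutions; do
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}

if check_course_dir .; then
    echo "Working directory looks like nf4-science/genomics"
else
    echo "Not in the course directory; try: cd /workspaces/training/nf4-science/genomics"
fi
```

The suggested `cd` path in the fallback message is the one given in the tip earlier in the diff and only applies inside the GitHub Codespaces training environment.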