|
| 1 | +(structure-page)= |
| 2 | + |
| 3 | +# Structure |
| 4 | + |
| 5 | +## The `templates` directory |
| 6 | + |
| 7 | +The `templates` directory in the Nextflow project root can be used to store scripts. |
| 8 | + |
| 9 | +``` |
| 10 | +├── templates |
| 11 | +│ └── sayhello.py |
| 12 | +└── main.nf |
| 13 | +``` |
| 14 | + |
| 15 | +Scripts in this directory can be invoked from any process in your pipeline using the `template` function: |
| 16 | + |
| 17 | +``` |
| 18 | +process sayHello { |
| 19 | + |
| 20 | + input: |
| 21 | + val x |
| 22 | + |
| 23 | + output: |
| 24 | + stdout |
| 25 | + |
| 26 | + script: |
| 27 | + template 'sayhello.py' |
| 28 | +} |
| 29 | + |
| 30 | +workflow { |
| 31 | + Channel.of("Foo") | sayHello | view |
| 32 | +} |
| 33 | +``` |
| 34 | + |
| 35 | +Variables prefixed with the dollar character (`$`) are interpreted as Nextflow variables when the template script is executed by Nextflow: |
| 36 | + |
| 37 | +``` |
| 38 | +#!/usr/bin/env python |
| 39 | + |
| 40 | +print("Hello ${x}!") |
| 41 | +``` |
| 42 | + |
| 43 | +The pipeline will fail if a template variable is missing, regardless of where it occurs in the template. |
| 44 | + |
| 45 | +Templates can be tested independently of pipeline execution by providing each input as an environment variable. For example: |
| 46 | + |
| 47 | +```bash |
| 48 | +STR='foo' bash templates/my_script.sh |
| 49 | +``` |
| 50 | + |
| 51 | +Templates are recommended only for Bash scripts. Templates written in languages that do not prefix variables with `$` (e.g., Python and R) cannot be run directly from the command line, because `$`-prefixed placeholders are resolved as variables only by Bash. Similarly, variables escaped with `\$` are interpreted as Bash variables when the template is executed by Nextflow, but not when it is run from the command line. |
| 52 | + |
| 53 | +:::{warning} |
| 54 | +Template variables are evaluated even if they are commented out in the template script. |
| 55 | +::: |
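
The substitution rules above (a missing variable fails, and placeholders inside comments are still evaluated) can be illustrated with a rough analogy. The sketch below uses Python's standard `string.Template`, which also recognizes `${name}` placeholders. It is not Nextflow's template engine, and the helper name `render` is hypothetical.

```python
from string import Template

# Stand-in for a template file. The second line is a comment in the script
# language, but the placeholder inside it is still a placeholder.
TEMPLATE_TEXT = 'print("Hello ${x}!")\n# print("Bye ${y}!")\n'


def render(text, **variables):
    """Substitute every ${name} placeholder, failing on any missing name."""
    return Template(text).substitute(**variables)


# Every placeholder is supplied, so rendering succeeds.
rendered = render(TEMPLATE_TEXT, x="Foo", y="Bar")

# 'y' appears only in the commented-out line, yet omitting it still fails.
try:
    render(TEMPLATE_TEXT, x="Foo")
    missing = None
except KeyError as err:
    missing = err.args[0]

print(rendered)
print("missing variable:", missing)
```

In this analogy the renderer has no notion of the target language's comment syntax, so a placeholder is resolved wherever it appears, which is consistent with the behavior described above.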
| 56 | + |
| 57 | +:::{tip} |
| 58 | +The best practice for using a custom script is to first embed it in the process definition and transfer it to a separate file with its own command line interface once the code matures. |
| 59 | +::: |
| 60 | + |
| 61 | +(bundling-executables)= |
| 62 | + |
| 63 | +## The `bin` directory |
| 64 | + |
| 65 | +The `bin` directory in the Nextflow project root can be used to store executable scripts. |
| 66 | + |
| 67 | +``` |
| 68 | +├── bin |
| 69 | +│ └── sayhello.py |
| 70 | +└── main.nf |
| 71 | +``` |
| 72 | + |
| 73 | +It allows custom scripts to be invoked like regular commands from any process in your pipeline without modifying the `PATH` environment variable or using an absolute path. Each script should include a shebang to specify the interpreter. Inputs should be supplied as arguments. |
| 74 | + |
| 75 | +```python |
| 76 | +#!/usr/bin/env python |
| 77 | + |
| 78 | +import argparse |
| 79 | + |
| 80 | +def main(): |
| 81 | + parser = argparse.ArgumentParser(description="A simple argparse example.") |
| 82 | + parser.add_argument("name", type=str, help="Person to greet.") |
| 83 | + |
| 84 | + args = parser.parse_args() |
| 85 | + print(f"Hello {args.name}!") |
| 86 | + |
| 87 | +if __name__ == "__main__": |
| 88 | + main() |
| 89 | +``` |
| 90 | + |
| 91 | +:::{tip} |
| 92 | +Use `env` to resolve the interpreter's location instead of hard-coding the interpreter path. |
| 93 | +::: |
| 94 | + |
| 95 | +Scripts placed in the `bin` directory must have executable permissions. Use `chmod` to grant the required permissions. For example: |
| 96 | + |
| 97 | +``` |
| 98 | +chmod a+x bin/sayhello.py |
| 99 | +``` |
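
The two requirements above (a shebang and executable permissions) can be checked before a run. The following is a minimal sketch, assuming a POSIX file system. The helper name `check_bin_scripts` is hypothetical and not part of Nextflow, and the demonstration uses a temporary directory rather than a real project.

```python
import stat
import tempfile
from pathlib import Path


def check_bin_scripts(bin_dir):
    """Return (file name, problem) pairs for the files in bin_dir."""
    problems = []
    for path in sorted(Path(bin_dir).iterdir()):
        if not path.is_file():
            continue
        # The owner-executable bit is one of the bits set by `chmod a+x`.
        if not path.stat().st_mode & stat.S_IXUSR:
            problems.append((path.name, "not executable"))
        # A shebang on the first line names the interpreter.
        with path.open("rb") as handle:
            if handle.read(2) != b"#!":
                problems.append((path.name, "missing shebang"))
    return problems


# Demonstration in a throwaway directory instead of a real project.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "sayhello.py"
    script.write_text("#!/usr/bin/env python\nprint('Hello!')\n")
    script.chmod(0o644)
    before = check_bin_scripts(tmp)
    script.chmod(0o755)  # same result as `chmod a+x` on a 0o644 file
    after = check_bin_scripts(tmp)

print(before)
print(after)
```
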
| 100 | + |
| 101 | +As with modifying a process script, changing an executable script causes the task to be re-executed on a resumed run. |
| 102 | + |
| 103 | +:::{warning} |
| 104 | +When using containers and the Wave service, Nextflow will send the project-level `bin` directory to the Wave service for inclusion as a layer in the container. Any changes to scripts in the `bin` directory will change the layer md5sum and the hash for the final container. Because the container identity is a component of the task hash, such a change forces all tasks in the workflow to be re-executed. |
| 105 | + |
| 106 | +When using the Wave service, use module-specific bin directories instead. See {ref}`module-binaries` for more information. |
| 107 | +::: |
| 108 | + |
| 109 | +## The `lib` directory |
| 110 | + |
| 111 | +The `lib` directory can be used to add utility code or external libraries without cluttering the pipeline scripts. The `lib` directory in the Nextflow project root is added to the classpath by default. |
| 112 | + |
| 113 | +``` |
| 114 | +├── lib |
| 115 | +│ └── DNASequence.groovy |
| 116 | +└── main.nf |
| 117 | +``` |
| 118 | + |
| 119 | +Classes or packages defined in the `lib` directory will be available in the execution context. Scripts or functions defined outside of classes will not be available in the execution context. |
| 120 | + |
| 121 | +For example, `lib/DNASequence.groovy` defines the `DNASequence` class: |
| 122 | + |
| 123 | +```groovy |
| 124 | +// lib/DNASequence.groovy |
| 125 | +class DNASequence { |
| 126 | + String sequence |
| 127 | + |
| 128 | + // Constructor |
| 129 | + DNASequence(String sequence) { |
| 130 | + this.sequence = sequence.toUpperCase() // Ensure sequence is in uppercase for consistency |
| 131 | + } |
| 132 | + |
| 133 | + // Method to calculate melting temperature using the Wallace rule |
| 134 | + double getMeltingTemperature() { |
| 135 | + int g_count = sequence.count('G') |
| 136 | + int c_count = sequence.count('C') |
| 137 | + int a_count = sequence.count('A') |
| 138 | + int t_count = sequence.count('T') |
| 139 | + |
| 140 | + // Wallace rule calculation |
| 141 | + double tm = 4 * (g_count + c_count) + 2 * (a_count + t_count) |
| 142 | + return tm |
| 143 | + } |
| 144 | + |
| 145 | + String toString() { |
| 146 | + return "DNA[$sequence]" |
| 147 | + } |
| 148 | +} |
| 149 | +``` |
| 150 | + |
| 151 | +The `DNASequence` class is available in the execution context: |
| 152 | + |
| 153 | +```nextflow |
| 154 | +// main.nf |
| 155 | +workflow { |
| 156 | + Channel.of('ACGTTGCAATGCCGTA', 'GCGTACGGTACGTTAC') |
| 157 | + .map { seq -> new DNASequence(seq) } |
| 158 | + .view { dna -> |
| 159 | + def meltTemp = dna.getMeltingTemperature() |
| 160 | + "Found sequence '$dna' with melting temperature ${meltTemp}°C" |
| 161 | + } |
| 162 | +} |
| 163 | +``` |
| 164 | + |
| 165 | +The workflow prints: |
| 166 | + |
| 167 | +``` |
| 168 | +Found sequence 'DNA[ACGTTGCAATGCCGTA]' with melting temperature 48.0°C |
| 169 | +Found sequence 'DNA[GCGTACGGTACGTTAC]' with melting temperature 50.0°C |
| 170 | +``` |
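
As a cross-check of the values above, the Wallace rule arithmetic in `getMeltingTemperature` can be reproduced outside Nextflow. This is a minimal Python sketch of the same formula. The function name `wallace_melting_temperature` is hypothetical and not part of the pipeline.

```python
def wallace_melting_temperature(sequence):
    """Wallace rule: 4 degrees per G or C plus 2 degrees per A or T."""
    sequence = sequence.upper()  # mirror the uppercase normalization in the class
    gc_count = sequence.count("G") + sequence.count("C")
    at_count = sequence.count("A") + sequence.count("T")
    return float(4 * gc_count + 2 * at_count)


temperatures = [
    wallace_melting_temperature(seq)
    for seq in ("ACGTTGCAATGCCGTA", "GCGTACGGTACGTTAC")
]
print(temperatures)
```

The first sequence has eight G/C and eight A/T bases (4 × 8 + 2 × 8 = 48), and the second has nine G/C and seven A/T bases (4 × 9 + 2 × 7 = 50).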