This repository was archived by the owner on Aug 9, 2023. It is now read-only.

Commit 3f796ec

additional docs on custom job definitions
clarify how to map to external scratch and maintain compatibility with nextflow
1 parent 6d6effa commit 3f796ec

1 file changed

+119
-1
lines changed

docs/orchestration/nextflow/nextflow-overview.md

Lines changed: 119 additions & 1 deletion
@@ -287,11 +287,129 @@ process hello {
 }
 ```
 
-For each process in your workflow, Nextflow will create a corresponding Batch Job Definition, which it will re-use for subsequent workflow runs. You can customize these job definitions to incorporate additional environment variables or volumes/mount points as needed.
+For each process in your workflow, Nextflow will create a corresponding Batch Job Definition, which it will re-use for subsequent workflow runs. The process defined above will create a Batch Job Definition called `nf-ubuntu-latest` that looks like:
+
+```json
+{
+  "jobDefinitionName": "nf-ubuntu-latest",
+  "jobDefinitionArn": "arn:aws:batch:<region>:<account-number>:job-definition/nf-ubuntu-latest:1",
+  "revision": 1,
+  "status": "ACTIVE",
+  "type": "container",
+  "parameters": {
+    "nf-token": "43869867b5fbae16fa7cfeb5ea2c3522"
+  },
+  "containerProperties": {
+    "image": "ubuntu:latest",
+    "vcpus": 1,
+    "memory": 1024,
+    "command": [
+      "true"
+    ],
+    "volumes": [
+      {
+        "host": {
+          "sourcePath": "/home/ec2-user/miniconda"
+        },
+        "name": "aws-cli"
+      }
+    ],
+    "environment": [],
+    "mountPoints": [
+      {
+        "containerPath": "/home/ec2-user/miniconda",
+        "readOnly": true,
+        "sourceVolume": "aws-cli"
+      }
+    ],
+    "ulimits": []
+  }
+}
+```
+
+You can customize these job definitions to incorporate additional environment variables or volumes/mount points as needed.
 
 !!! important
     In order to take advantage of automatically [expandable scratch space](/core-env/create-custom-compute-resources/) in the host instance, you will need to modify Nextflow created job definitions to map a container volume from `/scratch` on the host to `/tmp` in the container.
 
+For example, a customized job definition for the process above that maps `/scratch` on the host to `/scratch` in the container and still works with Nextflow would be:
+
+```json
+{
+  "jobDefinitionName": "nf-ubuntu-latest",
+  "jobDefinitionArn": "arn:aws:batch:<region>:<account-number>:job-definition/nf-ubuntu-latest:2",
+  "revision": 2,
+  "status": "ACTIVE",
+  "type": "container",
+  "parameters": {
+    "nf-token": "43869867b5fbae16fa7cfeb5ea2c3522"
+  },
+  "containerProperties": {
+    "image": "ubuntu:latest",
+    "vcpus": 1,
+    "memory": 1024,
+    "command": [
+      "true"
+    ],
+    "volumes": [
+      {
+        "host": {
+          "sourcePath": "/home/ec2-user/miniconda"
+        },
+        "name": "aws-cli"
+      },
+      {
+        "host": {
+          "sourcePath": "/scratch"
+        },
+        "name": "scratch"
+      }
+    ],
+    "environment": [],
+    "mountPoints": [
+      {
+        "containerPath": "/home/ec2-user/miniconda",
+        "readOnly": true,
+        "sourceVolume": "aws-cli"
+      },
+      {
+        "containerPath": "/scratch",
+        "sourceVolume": "scratch"
+      }
+    ],
+    "ulimits": []
+  }
+}
+```
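The step from revision 1 to revision 2 can also be scripted rather than edited by hand. The following is a minimal Python sketch, not part of the original docs: the helper names `add_host_volume` and `to_register_payload` are hypothetical, and the list of service-assigned fields is an assumption based on the fields shown in the JSON above. It works on plain dictionaries and makes no AWS calls.

```python
import copy

# Assumption: these fields are reported by AWS Batch when a job definition is
# described, but are assigned by the service rather than supplied by the caller.
SERVICE_ASSIGNED_FIELDS = ("jobDefinitionArn", "revision", "status")


def add_host_volume(job_def, name, host_path, container_path, read_only=False):
    """Return a copy of job_def with one extra host volume and mount point."""
    updated = copy.deepcopy(job_def)
    props = updated.setdefault("containerProperties", {})
    volumes = props.setdefault("volumes", [])
    mounts = props.setdefault("mountPoints", [])

    # Leave the definition unchanged if a volume with this name already exists.
    if any(v.get("name") == name for v in volumes):
        return updated

    volumes.append({"host": {"sourcePath": host_path}, "name": name})
    mount = {"containerPath": container_path, "sourceVolume": name}
    if read_only:
        mount["readOnly"] = True
    mounts.append(mount)
    return updated


def to_register_payload(job_def):
    """Drop service-assigned fields so the dict describes a new revision."""
    return {k: v for k, v in job_def.items() if k not in SERVICE_ASSIGNED_FIELDS}


# Revision 1 as shown above; the ARN placeholders are kept as-is.
revision_1 = {
    "jobDefinitionName": "nf-ubuntu-latest",
    "jobDefinitionArn": "arn:aws:batch:<region>:<account-number>:job-definition/nf-ubuntu-latest:1",
    "revision": 1,
    "status": "ACTIVE",
    "type": "container",
    "parameters": {"nf-token": "43869867b5fbae16fa7cfeb5ea2c3522"},
    "containerProperties": {
        "image": "ubuntu:latest",
        "vcpus": 1,
        "memory": 1024,
        "command": ["true"],
        "volumes": [
            {"host": {"sourcePath": "/home/ec2-user/miniconda"}, "name": "aws-cli"}
        ],
        "environment": [],
        "mountPoints": [
            {
                "containerPath": "/home/ec2-user/miniconda",
                "readOnly": True,
                "sourceVolume": "aws-cli",
            }
        ],
        "ulimits": [],
    },
}

# Map /scratch on the host to /scratch in the container; everything else,
# including the parameters block, is carried over unchanged.
revision_2 = add_host_volume(revision_1, "scratch", "/scratch", "/scratch")
payload = to_register_payload(revision_2)
```

The resulting `payload` has the same name, parameters, and container properties as the customized definition above. It could then be registered with the AWS CLI or an SDK (for example boto3's `register_job_definition`), which is left out here because it needs credentials and a live Batch environment.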
+
+Nextflow will use the most recent revision of a Job Definition.
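To illustrate what "most recent revision" means when several revisions share a name, here is a small Python sketch. It is not Nextflow's implementation: the function name `latest_active_revision` is hypothetical, and restricting the choice to `ACTIVE` revisions is an assumption. The input mimics a list of job definition records with the `jobDefinitionName`, `revision`, and `status` fields shown above.

```python
def latest_active_revision(job_definitions, name):
    """Return the highest-numbered ACTIVE revision for name, or None."""
    candidates = [
        jd
        for jd in job_definitions
        if jd.get("jobDefinitionName") == name and jd.get("status") == "ACTIVE"
    ]
    if not candidates:
        return None
    # Revisions are integers that increase with each registration.
    return max(candidates, key=lambda jd: jd["revision"])


# Hypothetical listing: two revisions of the definition discussed above plus
# an unrelated definition.
definitions = [
    {"jobDefinitionName": "nf-ubuntu-latest", "revision": 1, "status": "ACTIVE"},
    {"jobDefinitionName": "nf-ubuntu-latest", "revision": 2, "status": "ACTIVE"},
    {"jobDefinitionName": "say-hello", "revision": 1, "status": "ACTIVE"},
]

chosen = latest_active_revision(definitions, "nf-ubuntu-latest")
```

With the listing above, `chosen` is the revision 2 record, which is the one carrying the `/scratch` mapping.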
+
+You can also predefine Job Definitions that leverage extra volume mappings and refer to them in the process definition. Assuming you had an existing Job Definition named `say-hello`, a process definition that utilized it would look like:
+
+```groovy
+texts = Channel.from("AWS", "Nextflow")
+
+process hello {
+  // directives
+  // substitute the container image reference with a job-definition reference
+  container "job-definition://say-hello"
+
+  // compute resources for the Batch Job
+  cpus 1
+  memory '512 MB'
+
+  input:
+  val text from texts
+
+  output:
+  file 'hello.txt'
+
+  """
+  echo "Hello $text" > hello.txt
+  """
+}
+```
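The `container` directive therefore carries one of two kinds of value: a container image reference or a `job-definition://` reference. The Python sketch below only illustrates that distinction. The function `classify_container` is hypothetical and is not how Nextflow itself parses the directive.

```python
JOB_DEFINITION_PREFIX = "job-definition://"


def classify_container(value):
    """Classify a container directive value as an image or a job definition name."""
    if value.startswith(JOB_DEFINITION_PREFIX):
        name = value[len(JOB_DEFINITION_PREFIX):]
        if not name:
            raise ValueError("job-definition:// reference is missing a name")
        return ("job-definition", name)
    # Anything else is treated as a plain container image reference.
    return ("image", value)


predefined = classify_container("job-definition://say-hello")
generated = classify_container("ubuntu:latest")
```

Here `predefined` names the existing `say-hello` Job Definition, while `generated` is an image for which Nextflow would create a definition such as `nf-ubuntu-latest`, as described earlier.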
+
 ### Running the workflow
 
 To run a workflow you submit a `nextflow` Batch job to the appropriate Batch Job Queue via:
