`docs/core-env/setup-aws-batch.md`
Jobs are submitted to [job queues](http://docs.aws.amazon.com/batch/latest/userguide/job_queues.html) where they reside until they can be scheduled to run on Amazon EC2 instances within a compute environment. An AWS account can have multiple job queues, each with varying priority. This gives you the ability to closely align the consumption of compute resources with your organizational requirements.
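For illustration, submitting a job to a queue is a single AWS CLI call; the queue and job definition names below are placeholders for resources already registered in your account.

```shell
# Submit a container job to an existing queue. "default-queue" and
# "my-job-def" are hypothetical names; replace them with your own
# job queue and registered job definition.
aws batch submit-job \
  --job-name sample-align-job \
  --job-queue default-queue \
  --job-definition my-job-def
```

The call returns the new job's ID, which can then be polled with `aws batch describe-jobs`.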
[Compute environments](http://docs.aws.amazon.com/batch/latest/userguide/compute_environments.html) provision and manage your EC2 instances and other compute resources that are used to run your AWS Batch jobs. Job queues are mapped to one or more compute environments and a given environment can also be mapped to one or more job queues. This many-to-many relationship is defined by the compute environment order and job queue priority properties.
The following diagram shows a general overview of how the AWS Batch resources interact.
For more information, watch the [How AWS Batch Works](https://www.youtube.com/watch?v=T4aAWrGHmxQ) video.
## AWS Batch Jobs Requirements
AWS Batch does not make assumptions on the structure and requirements that Jobs take with respect to inputs and outputs. Batch Jobs may take data streams, files, or only parameters as input, and produce the same variety for output, inclusive of files, metadata changes, updates to databases, etc. Batch assumes that each application handles their own input/output requirements.
A common pattern for bioinformatics tooling is that files such as genomic sequence data are both inputs and outputs to/from a process. Many bioinformatics tools have also been developed to run in traditional Linux-based compute clusters with shared filesystems and are not necessarily optimized for cloud computing.
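Because Batch leaves I/O to the application, a common workaround for such shared-filesystem tools is a thin wrapper that stages files through S3. The sketch below assumes hypothetical bucket, key, and tool names (`my-bucket`, `my_aligner`):

```shell
#!/usr/bin/env bash
# Hypothetical staging wrapper: copy inputs from S3 to the local
# filesystem the tool expects, run it, then copy the results back.
# Bucket, key, and tool names are all placeholders.
set -euo pipefail

aws s3 cp s3://my-bucket/inputs/sample.fastq.gz .    # stage in
my_aligner --reads sample.fastq.gz --out sample.bam  # run the tool
aws s3 cp sample.bam s3://my-bucket/outputs/         # stage out
```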
The common requirements for genomics on AWS Batch are:
* Multitenancy:
    Multiple container jobs may run concurrently on the same instance. In these situations, it is essential that your job writes to a unique subdirectory.
* Data cleanup:
    As your jobs complete and write the output back to S3, it is a good idea to delete the scratch data generated by that job on your instance. This allows you to optimize for cost by reusing EC2 instances if there are jobs remaining in the queue, rather than terminating the EC2 instances.
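Both points above can be sketched in one small job wrapper. AWS Batch exports `AWS_BATCH_JOB_ID` inside the container; the fallback to the shell PID is only so the script also runs locally:

```shell
#!/usr/bin/env bash
set -euo pipefail

run_job() {
  # Per-job scratch directory: AWS_BATCH_JOB_ID is unique per job,
  # so concurrent jobs on the same instance cannot collide.
  local scratch="/tmp/scratch/${AWS_BATCH_JOB_ID:-$$}"
  mkdir -p "$scratch"

  # ... run the tool here, writing only under "$scratch",
  # then copy results back to S3 ...

  # Delete the scratch data so the instance can be reused by the
  # next job without filling its disk.
  rm -rf "$scratch"
  echo "cleaned ${scratch}"
}

run_job
```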
## AWS Batch Environment
A complete AWS Batch environment consists of the following:
1. A Compute Environment that utilizes [EC2 Spot instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html) for cost-effective computing
2. A Compute Environment that utilizes EC2 [On-Demand](https://aws.amazon.com/ec2/pricing/on-demand/) instances for high-priority work that can't risk job interruptions or delays due to insufficient Spot capacity.
3. A default Job Queue that utilizes the Spot compute environment first, but falls back to the on-demand compute environment if there is spare capacity available.
4. A high-priority Job Queue that leverages the on-demand and Spot compute environments (in that order) and has higher priority than the default queue.
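As a sketch of how the queue-to-compute-environment mapping looks, the default queue's Spot-first fallback order could be expressed with the AWS CLI; the compute environment names here are placeholders for environments created separately:

```shell
# Placeholder compute environment names; order=1 is tried first.
aws batch create-job-queue \
  --job-queue-name default-queue \
  --state ENABLED \
  --priority 1 \
  --compute-environment-order \
      order=1,computeEnvironment=spot-ce \
      order=2,computeEnvironment=ondemand-ce
```

The high-priority queue would use a larger `--priority` value and reverse the compute environment order.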
The CloudFormation template below will create all of the above.