This repository was archived by the owner on Aug 9, 2023. It is now read-only.

Commit 660d044

CDK instructions
1 parent 655f3b4 commit 660d044

1 file changed: +140 -10 lines changed

src/aws-genomics-cdk/README.md

Lines changed: 140 additions & 10 deletions
@@ -1,14 +1,144 @@
1-
# Welcome to your CDK TypeScript project!
1+
# Genomics Workflows on AWS - CDK code
22

3-
This is a blank project for TypeScript development with CDK.
3+
This is a CDK application that creates AWS resources for working with large-scale biomedical data, such as genomics.
44

5-
The `cdk.json` file tells the CDK Toolkit how to execute your app.
5+
In order to deploy this CDK application, you'll need an environment with AWS CLI access and AWS CDK installed. A quick
6+
way to get an environment for running this application is to launch [AWS Cloud9](https://aws.amazon.com/cloud9/).
67

7-
## Useful commands
8+
AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code
9+
with just a browser. It includes a code editor, debugger, and terminal. Cloud9 comes prepackaged with essential
10+
tools for popular programming languages, including JavaScript, Python, PHP, and more, so you don’t need to install
11+
files or configure your development machine to start new projects.
12+
13+
14+
## Download
15+
16+
Clone the repo to your local or Cloud9 environment.
17+
```
18+
git clone https://github.com/aws-samples/aws-genomics-workflows.git
19+
```
20+
21+
## Configure
22+
23+
This CDK application requires an S3 bucket and a VPC. The application can create them as part of the deployment or
24+
you can configure the application to use your own S3 bucket and/or existing VPC.
25+
26+
After cloning the repo, open, update, and save the application configuration file - `app.config.json`.
27+
28+
**accountID** - Your [AWS account ID](https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html).
29+
**region** - The [AWS region](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html)
30+
you want to use for the deployment (e.g., us-east-1, us-west-2, etc.).
31+
**S3.existingBucket** - If you want to use an existing bucket, set this value to true, otherwise set it to false to
32+
create a new bucket.
33+
**S3.bucketName** - The bucket name to use or create.
34+
**VPC.createVPC** - If you want to create a new VPC, set this to true; otherwise, set it to false.
35+
**VPC.existingVPCName** - If you set the createVPC option to false, you must provide a valid VPC name to use in the
36+
same region of the deployment.
37+
**VPC.maxAZs** - The number of Availability Zones to use when creating a new VPC.
38+
**VPC.cidr** - The [CIDR block](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing) for the new VPC.
39+
**VPC.cidrMask** - The [CIDR block subnet mask](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#Subnet_masks)
40+
for the new VPC.
41+
**batch.defaultVolumeSize** - The default EBS volume size in GiB to be attached to the EC2 instance under AWS Batch.
42+
**batch.spotMaxVCPUs** - The maximum number of vCPUs when using [Spot Instances](https://aws.amazon.com/ec2/spot/).
43+
**batch.onDemendMaxVCPUs** - The maximum number of vCPUs when using On-Demand Instances.
44+
**batch.instanceTypes** - The [EC2 instance types](https://aws.amazon.com/ec2/instance-types/) to use in AWS Batch.
45+
**stepFunctions.launchDemoPipeline** - If set to true, the application will deploy a demo pipeline using Step Functions.
46+
**stepFunctions.jobDefinitions** - List of parameters for the demo application bioinformatics tools.
47+
```
48+
{
49+
"accountID": "111111111111",
50+
"region": "us-west-2",
51+
"S3": {
52+
"existingBucket": true,
53+
"bucketName": ""
54+
},
55+
"VPC": {
56+
"createVPC": true,
57+
"existingVPCName": "",
58+
"maxAZs": 2,
59+
"cidr": "10.0.0.0/16",
60+
"cidrMask": 24
61+
},
62+
"batch": {
63+
"defaultVolumeSize": 100,
64+
"spotMaxVCPUs": 128,
65+
"onDemendMaxVCPUs": 128,
66+
"instanceTypes": [
67+
"c4.large",
68+
"c4.xlarge",
69+
"c4.2xlarge",
70+
"c4.4xlarge",
71+
"c4.8xlarge",
72+
"c5.large",
73+
"c5.xlarge",
74+
"c5.2xlarge",
75+
"c5.4xlarge",
76+
"c5.9xlarge",
77+
"c5.12xlarge",
78+
"c5.18xlarge",
79+
"c5.24xlarge"
80+
]
81+
},
82+
"stepFunctions": {
83+
"launchDemoPipeline": true,
84+
"jobDefinitions": {
85+
"fastqc": {
86+
"repository": "genomics/fastqc",
87+
"memoryLimit": 8000,
88+
"vcpus": 4,
89+
"spot": true,
90+
"retryAttempts":1,
91+
"timeout": 600
92+
},
93+
"minimap2": {
94+
"repository": "genomics/minimap2",
95+
"memoryLimit": 16000,
96+
"vcpus": 8,
97+
"spot": true,
98+
"retryAttempts":1,
99+
"timeout": 3600
100+
}
101+
}
102+
}
103+
}
104+
```
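
Because a wrong value in `app.config.json` only surfaces at deploy time, it can help to check the file first. The following is a minimal sketch of such a check, not part of the application itself. The `AppConfig` interface and the `validateConfig` function are hypothetical names, and the rules are assumptions drawn from the field descriptions above: a 12-digit account ID, a non-empty bucket name, a VPC name when `createVPC` is false, and a subnet mask no shorter than the VPC prefix.

```typescript
// Hypothetical shape of app.config.json, inferred from the field list above.
interface AppConfig {
  accountID: string;
  region: string;
  S3: { existingBucket: boolean; bucketName: string };
  VPC: {
    createVPC: boolean;
    existingVPCName: string;
    maxAZs: number;
    cidr: string;
    cidrMask: number;
  };
  batch: {
    defaultVolumeSize: number;
    spotMaxVCPUs: number;
    onDemendMaxVCPUs: number; // spelled exactly as in the configuration file
    instanceTypes: string[];
  };
}

// Returns a list of human-readable problems; an empty list means none were found.
// These rules are assumptions for illustration, not checks the application performs.
function validateConfig(config: AppConfig): string[] {
  const problems: string[] = [];

  if (!/^\d{12}$/.test(config.accountID)) {
    problems.push("accountID must be a 12-digit string");
  }
  if (config.S3.bucketName.trim() === "") {
    problems.push("S3.bucketName must not be empty");
  }
  if (!config.VPC.createVPC && config.VPC.existingVPCName.trim() === "") {
    problems.push("VPC.existingVPCName is required when VPC.createVPC is false");
  }
  if (config.VPC.createVPC) {
    // The subnet mask cannot be shorter than the prefix of the VPC CIDR block.
    const prefix = Number(config.VPC.cidr.split("/")[1]);
    if (!Number.isInteger(prefix) || config.VPC.cidrMask < prefix) {
      problems.push("VPC.cidrMask must be at least the VPC CIDR prefix length");
    }
  }
  if (config.batch.instanceTypes.length === 0) {
    problems.push("batch.instanceTypes must list at least one instance type");
  }
  return problems;
}
```

Under these assumed rules, the sample configuration above would be flagged for its empty `bucketName`, which is a reminder to fill it in before deploying.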
105+
106+
## Deploy
107+
108+
To deploy the CDK application, use the command line and make sure you are in the root folder of the CDK application
109+
(`src/aws-genomics-cdk`).
110+
First, install the necessary Node.js modules:
111+
```
112+
npm install
113+
```
114+
115+
Then deploy the application.
116+
```
117+
# The "--require-approval never" parameter will skip the question to approve specific resource creation,
118+
# such as IAM roles. You can remove this parameter if you want to be prompted to approve creating these
119+
# resources.
120+
cdk deploy --all --require-approval never
121+
```
122+
123+
124+
## Stacks
125+
126+
| File | Description |
127+
| :--- | :---------- |
128+
| `lib/aws-genomics-cdk-stack.ts` | The main stack that initializes the rest of the stacks |
129+
| `lib/vpc/vpc-stack.ts` | An optional stack that will launch a VPC |
130+
| `lib/batch/batch-stack.ts` | An AWS Batch stack with 2 compute environments (spot and on-demand) and 2 queues (default and high priority) |
131+
| `lib/batch/batch-iam-stack.ts` | An IAM stack with roles and policies required for running AWS Batch |
132+
| `lib/step-functions/genomics-state-machine-stack.ts` | A Step Functions demo that runs a pipeline |
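
The table and the configuration flags together suggest which stacks a given configuration produces. The sketch below illustrates that selection logic in plain TypeScript. It is not the application's code: `StackFlags`, `StackLabel`, and `plannedStacks` are hypothetical names, the labels are illustrative rather than real stack IDs, and it assumes the Batch and IAM stacks are always created while the VPC and state machine stacks depend on `VPC.createVPC` and `stepFunctions.launchDemoPipeline`.

```typescript
// Minimal subset of the configuration needed to decide which stacks to create.
interface StackFlags {
  VPC: { createVPC: boolean };
  stepFunctions: { launchDemoPipeline: boolean };
}

// Illustrative labels only; the real stack IDs are not given in this README.
type StackLabel = "main" | "vpc" | "batch-iam" | "batch" | "genomics-state-machine";

// Assumption: the main, IAM, and Batch stacks are always created, while the
// VPC and demo pipeline stacks are controlled by their configuration flags.
function plannedStacks(flags: StackFlags): StackLabel[] {
  const stacks: StackLabel[] = ["main"];
  if (flags.VPC.createVPC) {
    stacks.push("vpc");
  }
  stacks.push("batch-iam", "batch");
  if (flags.stepFunctions.launchDemoPipeline) {
    stacks.push("genomics-state-machine");
  }
  return stacks;
}

// Example: reuse an existing VPC and skip the demo pipeline.
console.log(
  plannedStacks({ VPC: { createVPC: false }, stepFunctions: { launchDemoPipeline: false } })
);
```

To see what a configuration would actually change before deploying, `cdk diff` compares the deployed stacks with the current state.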
133+
134+
135+
## Constructs
136+
137+
| File | Description |
138+
| :--- | :---------- |
139+
| `lib/batch/batch-compute-environmnet-construct.ts` | A construct for creating an [AWS Batch compute environment](https://docs.aws.amazon.com/batch/latest/userguide/compute_environments.html) |
140+
| `lib/batch/job-queue-construct.ts` | A construct for creating an [AWS Batch job queue](https://docs.aws.amazon.com/batch/latest/userguide/job_queues.html) |
141+
| `lib/batch/launch-template-construct.ts` | A construct for creating an [EC2 launch template](https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchTemplates.html) |
142+
| `lib/step-functions/genomics-task-construct.ts` | A construct for creating a Step Functions task that submits an AWS Batch job |
143+
| `lib/step-functions/job-definition-construct.ts` | A construct for creating an [AWS Batch job definition](https://docs.aws.amazon.com/batch/latest/userguide/job_definitions.html) to be used as a task in Step Functions |
8144

9-
* `npm run build` compile typescript to js
10-
* `npm run watch` watch for changes and compile
11-
* `npm run test` perform the jest unit tests
12-
* `cdk deploy` deploy this stack to your default AWS account/region
13-
* `cdk diff` compare deployed stack with current state
14-
* `cdk synth` emits the synthesized CloudFormation template
