This repository was archived by the owner on Aug 9, 2023. It is now read-only.

Commit aafd09c

Merge pull request #154 from itzhapaz/develop/cdk-constructs
Develop/cdk constructs
2 parents 1826068 + 182e361 commit aafd09c

33 files changed, with 1942 additions and 1256 deletions

src/aws-genomics-cdk/README.md

Lines changed: 75 additions & 59 deletions
Original file line number | Diff line number | Diff line change
@@ -1,14 +1,18 @@
11
# Genomics Workflows on AWS - CDK code
22

3-
Contained herein is a CDK application for creating AWS resources for working with large-scale biomedical data - e.g. genomics.
3+
Contained herein is a CDK application for creating AWS resources for working
4+
with large-scale biomedical data - e.g. genomics.
45

5-
In order to deploy this CDK application, you'll need an environment with AWS CLI access and AWS CDK installed. A quick
6-
way yo get an environment for running this application is to launch [AWS Cloud9](https://aws.amazon.com/cloud9/).
6+
In order to deploy this CDK application, you'll need an environment with AWS
7+
CLI access and AWS CDK installed. A quick way to get an environment for running
8+
this application is to launch [AWS Cloud9](https://aws.amazon.com/cloud9/).
79

8-
AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code
9-
with just a browser. It includes a code editor, debugger, and terminal. Cloud9 comes prepackaged with essential
10-
tools for popular programming languages, including JavaScript, Python, PHP, and more, so you don’t need to install
11-
files or configure your development machine to start new projects.
10+
AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets
11+
you write, run, and debug your code with just a browser. It includes a code
12+
editor, debugger, and terminal. Cloud9 comes prepackaged with essential
13+
tools for popular programming languages, including JavaScript, Python, PHP, and
14+
more, so you don’t need to install files or configure your development machine
15+
to start new projects.
1216

1317

1418
## Download
@@ -20,41 +24,70 @@ git clone https://github.com/aws-samples/aws-genomics-workflows.git
2024

2125
## Configure
2226

23-
This CDK application requires an S3 bucket and a VPC. The application can create them as part of the deployment or
24-
you could configure the application to use your own S3 bucket and/or existing VPC.
25-
26-
After cloning the repo, open, update, and save the application configuration file - `app.config.json`.
27-
28-
**accountID** - Your [AWS account id](https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html).
29-
**region** - The [AWS region](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html)
30-
you want to use for the deployment (e.g., us-east-1, us-west-2, etc.).
31-
**S3.existingBucket** - If you want to use an existing bucket, set this value to true, otherwise set it to false to
32-
create a new bucket.
27+
This CDK application requires an S3 bucket and a VPC. The application can
28+
create them as part of the deployment or you could configure the application to
29+
use your own S3 bucket and/or existing VPC.
30+
31+
After cloning the repo, open, update, and save the application configuration
32+
file - `app.config.json`.
33+
34+
**accountID** - Your
35+
[AWS account id](https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html).
36+
**region** - The
37+
[AWS region](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html)
38+
you want to use for the deployment (e.g., us-east-1, us-west-2, etc.).
39+
**projectName** - A name for the project that will be used as a prefix for the
40+
CDK stacks and constructs.
41+
**tags** - A list of name/value pairs to use as tags for the AWS resources
42+
created by this app.
43+
**S3.existingBucket** - If you want to use an existing bucket, set this value
44+
to true, otherwise set it to false to create a new bucket.
3345
**S3.bucketName** - The bucket name to use or create.
34-
**VPC.createVPC** - If you want to create a new VPC, set this to true, otherwise set to false.
35-
**VPC.existingVPCName** - If you set the createVPC option to false, you must provide a valid VPC name to use in the
36-
same region of the deployment.
37-
**VPC.maxAZs** - The amount of availability zones to use when creating a new VPC.
38-
**VPC.cidr** - The [CIDR block](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing) for the new VPC.
39-
**VPC.cidrMask** - The [CIDR block subnet mask](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#Subnet_masks)
46+
**VPC.createVPC** - If you want to create a new VPC, set this to true,
47+
otherwise set to false.
48+
**VPC.VPCName** - The VPC name to use or create.
49+
**VPC.maxAZs** - The number of availability zones to use when creating a new
50+
VPC.
51+
**VPC.cidr** - The
52+
[CIDR block](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing) for
53+
the new VPC.
54+
**VPC.cidrMask** - The
55+
[CIDR block subnet mask](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#Subnet_masks)
4056
for the new VPC.
41-
**batch.defaultVolumeSize** - The default EBS volume size in GiB to be attached to the EC2 instance under AWS Batch.
42-
**batch.spotMaxVCPUs** - The limit on vcpus when using [spot instances](https://aws.amazon.com/ec2/spot/).
57+
**batch.defaultVolumeSize** - The default EBS volume size in GiB to be attached
58+
to the EC2 instance under AWS Batch.
59+
**batch.spotMaxVCPUs** - The limit on vCPUs when using
60+
[spot instances](https://aws.amazon.com/ec2/spot/).
4361
**batch.onDemendMaxVCPUs** - The limit on vCPUs when using on-demand instances.
44-
**batch.instanceTypes** - The [EC2 instance types](https://aws.amazon.com/ec2/instance-types/) to use in AWS Batch.
45-
**stepFunctions.launchDemoPipeline** - If set to true, the application will deploy a demo pipeline using step fuinctions.
46-
**stepFunctions.jobDefinitions** - List of parametrs for the demo application bioinformatics tools.
62+
**batch.instanceTypes** - The
63+
[EC2 instance types](https://aws.amazon.com/ec2/instance-types/) to use in
64+
AWS Batch.
65+
**workflows** - A list of workflows that you would like to launch. There are
66+
demo workflows under the `lib/workflows` directory. To add a workflow, update
67+
the code in the `lib/aws-genomics-cdk-stack.ts` file. Look for the workflows
68+
section.
69+
4770
```
4871
{
4972
"accountID": "111111111111",
5073
"region": "us-west-2",
74+
"projectName": "genomics",
75+
"tags": [{
76+
"name": "Environment",
77+
"value": "production"
78+
},
79+
{
80+
"name": "Project",
81+
"value": "genomics-pipeline"
82+
}
83+
],
5184
"S3": {
5285
"existingBucket": true,
53-
"bucketName": ""
86+
"bucketName": "YOUR-BUCKET-NAME"
5487
},
5588
"VPC": {
5689
"createVPC": true,
57-
"existingVPCName": "",
90+
"VPCName": "genomics-vpc",
5891
"maxAZs": 2,
5992
"cidr": "10.0.0.0/16",
6093
"cidrMask": 24
@@ -79,44 +112,27 @@ for the new VPC.
79112
"c5.24xlarge"
80113
]
81114
},
82-
"stepFunctions": {
83-
"launchDemoPipeline": true,
84-
"jobDefinitions": {
85-
"fastqc": {
86-
"repository": "genomics/fastqc",
87-
"memoryLimit": 8000,
88-
"vcpus": 4,
89-
"spot": true,
90-
"retryAttempts":1,
91-
"timeout": 600
92-
},
93-
"minimap2": {
94-
"repository": "genomics/minimap2",
95-
"memoryLimit": 16000,
96-
"vcpus": 8,
97-
"spot": true,
98-
"retryAttempts":1,
99-
"timeout": 3600
100-
}
101-
}
102-
}
115+
"workflows": [{
116+
"name": "variantCalling",
117+
"spot": true
118+
}]
103119
}
104120
```
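
The configuration keys described above can be captured in a TypeScript type so that mistakes surface before deployment. The following is a minimal sketch and not code from this repository: the `AppConfig` interface and the `validateConfig` helper are hypothetical names, only a subset of the keys is modelled, and the checks themselves (a 12-digit account ID, a non-empty bucket name, and so on) are assumptions about what a reasonable guard might look like.

```typescript
// Hypothetical types mirroring a subset of app.config.json (illustrative only).
interface Tag {
  name: string;
  value: string;
}

interface AppConfig {
  accountID: string;
  region: string;
  projectName: string;
  tags: Tag[];
  S3: { existingBucket: boolean; bucketName: string };
  VPC: { createVPC: boolean; VPCName: string; maxAZs: number; cidr: string; cidrMask: number };
  workflows: { name: string; spot: boolean }[];
}

// Returns a list of human-readable problems; an empty list means every check passed.
function validateConfig(config: AppConfig): string[] {
  const problems: string[] = [];
  if (!/^\d{12}$/.test(config.accountID)) {
    problems.push("accountID must be a 12-digit string");
  }
  if (config.projectName.trim() === "") {
    problems.push("projectName must not be empty");
  }
  if (config.S3.bucketName.trim() === "") {
    problems.push("S3.bucketName must not be empty");
  }
  if (!Number.isInteger(config.VPC.maxAZs) || config.VPC.maxAZs < 1) {
    problems.push("VPC.maxAZs must be a positive integer");
  }
  for (const tag of config.tags) {
    if (tag.name.trim() === "") {
      problems.push("every tag needs a non-empty name");
    }
  }
  return problems;
}

// Example values copied from the sample configuration above.
const example: AppConfig = {
  accountID: "111111111111",
  region: "us-west-2",
  projectName: "genomics",
  tags: [
    { name: "Environment", value: "production" },
    { name: "Project", value: "genomics-pipeline" },
  ],
  S3: { existingBucket: true, bucketName: "YOUR-BUCKET-NAME" },
  VPC: { createVPC: true, VPCName: "genomics-vpc", maxAZs: 2, cidr: "10.0.0.0/16", cidrMask: 24 },
  workflows: [{ name: "variantCalling", spot: true }],
};

console.log(validateConfig(example));
```

Returning a list of problems rather than throwing on the first one lets a user fix the whole file in a single pass.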
105121

106122
## Deploy
107123

108-
To deploy the CDK application, use the command line and make sure you are in the root folder of the CDK application.
109-
(`src/aws-genomics-cdk`).
124+
To deploy the CDK application, use the command line and make sure you are in
125+
the root folder of the CDK application (`src/aws-genomics-cdk`).
110126
First, install the necessary Node.js modules:
111127
```
112128
npm install
113129
```
114130

115131
Then deploy the application.
116132
```
117-
# The "--require-approval never" parameter will skip the question to approve specific resouce creation,
118-
# such as IAM roles. You can remove this parameter if you want to be prompted to approve creating these
119-
# resources.
133+
# The "--require-approval never" parameter will skip the question to approve
134+
# specific resource creation, such as IAM roles. You can remove this parameter
135+
# if you want to be prompted to approve creating these resources.
120136
cdk deploy --all --require-approval never
121137
```
122138

@@ -129,7 +145,7 @@ cdk deploy --all --require-approval never
129145
| `lib/vpc/vpc-stack.ts` | An optional stack that will launch a VPC |
130146
| `lib/batch/batch-stack.ts` | An AWS Batch stack with 2 compute environments (spot and on demand) and 2 queues (default and high priority) |
131147
| `lib/batch/batch-iam-stack.ts` | An IAM stack with roles and policies required for running AWS Batch |
132-
| `lid/step-fuinctions/genomics-state-machine-stack.ts` | A step function demo of running a pipeline |
148+
| `lib/workflows` | A folder containing pipeline stacks |
133149

134150

135151
## Constructs
@@ -139,6 +155,6 @@ cdk deploy --all --require-approval never
139155
| `lib/batch/batch-compute-environmnet-construct.ts` | A construct for creating an [AWS Batch compute environment](https://docs.aws.amazon.com/batch/latest/userguide/compute_environments.html) |
140156
| `lib/batch/job-queue-construct.ts` | A construct for creating an [AWS Batch job queue](https://docs.aws.amazon.com/batch/latest/userguide/job_queues.html) |
141157
| `lib/batch/launch-template-construct.ts` | A construct for creating an [EC2 launch template](https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchTemplates.html) |
142-
| `lib/step-functions/genomics-task-construct.ts` | A construct for creating a step function task that submits a batch job |
143-
| `lib/step-functions/job-definition-construct.ts` | A construct for creating an [AWS Batch job definition](https://docs.aws.amazon.com/batch/latest/userguide/job_definitions.html) to be used as a task in step functions |
158+
| `lib/workflows/genomics-task-construct.ts` | A construct for creating a step function task that submits a batch job |
159+
| `lib/workflows/job-definition-construct.ts` | A construct for creating an [AWS Batch job definition](https://docs.aws.amazon.com/batch/latest/userguide/job_definitions.html) to be used as a task in step functions |
144160

Lines changed: 17 additions & 24 deletions
@@ -1,13 +1,23 @@
11
{
22
"accountID": "111111111111",
33
"region": "us-west-2",
4+
"projectName": "genomics",
5+
"tags": [{
6+
"name": "Environment",
7+
"value": "production"
8+
},
9+
{
10+
"name": "Project",
11+
"value": "genomics-pipeline"
12+
}
13+
],
414
"S3": {
515
"existingBucket": true,
6-
"bucketName": ""
16+
"bucketName": "YOUR-BUCKET-NAME"
717
},
818
"VPC": {
919
"createVPC": true,
10-
"existingVPCName": "",
20+
"VPCName": "genomics-vpc",
1121
"maxAZs": 2,
1222
"cidr": "10.0.0.0/16",
1323
"cidrMask": 24
@@ -32,25 +42,8 @@
3242
"c5.24xlarge"
3343
]
3444
},
35-
"stepFunctions": {
36-
"launchDemoPipeline": true,
37-
"jobDefinitions": {
38-
"fastqc": {
39-
"repository": "genomics/fastqc",
40-
"memoryLimit": 8000,
41-
"vcpus": 4,
42-
"spot": true,
43-
"retryAttempts":1,
44-
"timeout": 600
45-
},
46-
"minimap2": {
47-
"repository": "genomics/minimap2",
48-
"memoryLimit": 16000,
49-
"vcpus": 8,
50-
"spot": true,
51-
"retryAttempts":1,
52-
"timeout": 3600
53-
}
54-
}
55-
}
56-
}
45+
"workflows": [{
46+
"name": "variantCalling",
47+
"spot": true
48+
}]
49+
}

src/aws-genomics-cdk/assets/launch_template_user_data.txt

Lines changed: 67 additions & 2 deletions
@@ -4,15 +4,72 @@ Content-Type: multipart/mixed; boundary="==BOUNDARY=="
44
--==BOUNDARY==
55
Content-Type: text/cloud-config; charset="us-ascii"
66

7+
#cloud-config
8+
repo_update: true
9+
repo_upgrade: security
10+
711
packages:
812
- jq
913
- btrfs-progs
1014
- sed
1115
- git
1216
- amazon-ssm-agent
1317
- unzip
18+
- amazon-cloudwatch-agent
19+
20+
write_files:
21+
- permissions: '0644'
22+
path: /opt/aws/amazon-cloudwatch-agent/etc/config.json
23+
content: |
24+
{
25+
"agent": {
26+
"logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
27+
},
28+
"logs": {
29+
"logs_collected": {
30+
"files": {
31+
"collect_list": [
32+
{
33+
"file_path": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
34+
"log_group_name": "/aws/ecs/container-instance/${Namespace}",
35+
"log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/amazon-cloudwatch-agent.log"
36+
},
37+
{
38+
"file_path": "/var/log/cloud-init.log",
39+
"log_group_name": "/aws/ecs/container-instance/${Namespace}",
40+
"log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/cloud-init.log"
41+
},
42+
{
43+
"file_path": "/var/log/cloud-init-output.log",
44+
"log_group_name": "/aws/ecs/container-instance/${Namespace}",
45+
"log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/cloud-init-output.log"
46+
},
47+
{
48+
"file_path": "/var/log/ecs/ecs-init.log",
49+
"log_group_name": "/aws/ecs/container-instance/${Namespace}",
50+
"log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/ecs-init.log"
51+
},
52+
{
53+
"file_path": "/var/log/ecs/ecs-agent.log",
54+
"log_group_name": "/aws/ecs/container-instance/${Namespace}",
55+
"log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/ecs-agent.log"
56+
},
57+
{
58+
"file_path": "/var/log/ecs/ecs-volume-plugin.log",
59+
"log_group_name": "/aws/ecs/container-instance/${Namespace}",
60+
"log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/ecs-volume-plugin.log"
61+
}
62+
]
63+
}
64+
}
65+
}
66+
}
1467

1568
runcmd:
69+
70+
# start the amazon-cloudwatch-agent
71+
- /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/etc/config.json
72+
1673
# install aws-cli v2 and copy the static binary in an easy to find location for bind-mounts into containers
1774
- curl -s "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "/tmp/awscliv2.zip"
1875
- unzip -q /tmp/awscliv2.zip -d /tmp
@@ -24,11 +81,19 @@ runcmd:
2481
- mkdir -p /opt/aws-cli/bin
2582
- cp -a $(dirname $(find /usr/local/aws-cli -name 'aws' -type f))/. /opt/aws-cli/bin/
2683

84+
# set environment variables for provisioning
85+
- export GWFCORE_NAMESPACE=${Namespace}
86+
- export INSTALLED_ARTIFACTS_S3_ROOT_URL=$(aws ssm get-parameter --name /gwfcore/${Namespace}/installed-artifacts/s3-root-url --query 'Parameter.Value' --output text)
2787

2888
# enable ecs spot instance draining
2989
- echo ECS_ENABLE_SPOT_INSTANCE_DRAINING=true >> /etc/ecs/ecs.config
3090

31-
- systemctl enable amazon-ssm-agent
32-
- systemctl start amazon-ssm-agent
91+
# pull docker images only if missing
92+
- echo ECS_IMAGE_PULL_BEHAVIOR=prefer-cached >> /etc/ecs/ecs.config
93+
94+
- cd /opt
95+
- aws s3 sync $INSTALLED_ARTIFACTS_S3_ROOT_URL/ecs-additions ./ecs-additions
96+
- chmod a+x /opt/ecs-additions/provision.sh
97+
- /opt/ecs-additions/provision.sh
3398

3499
--==BOUNDARY==--
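
The user data above mixes two kinds of placeholder: `${Namespace}`, which has to be filled in before the instance boots, and `{instance_id}`, which the CloudWatch agent resolves at runtime. The sketch below shows one way the first kind could be substituted with plain string replacement. This is an assumption for illustration: `renderUserData` is a hypothetical helper, and the repository may instead rely on CloudFormation `Fn::Sub` or another mechanism.

```typescript
// Replace every ${Name} placeholder with its supplied value. Single-brace tokens
// such as {instance_id}, and shell expansions such as $(...) or $VAR, do not match
// the pattern and are left untouched.
function renderUserData(template: string, values: Record<string, string>): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) => {
    const value = values[key];
    if (value === undefined) {
      // Fail loudly instead of shipping a half-rendered template to an instance.
      throw new Error("No value supplied for placeholder " + match);
    }
    return value;
  });
}

// Two lines taken from the CloudWatch agent configuration above.
const snippet =
  '"log_group_name": "/aws/ecs/container-instance/${Namespace}",\n' +
  '"log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/cloud-init.log"';

console.log(renderUserData(snippet, { Namespace: "genomics" }));
```

Throwing on an unknown placeholder is a deliberate choice in this sketch: a missing namespace would otherwise produce log groups and SSM parameter paths containing the literal text `${Namespace}`.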
Lines changed: 17 additions & 7 deletions
@@ -1,13 +1,23 @@
11
#!/usr/bin/env node
2-
import 'source-map-support/register';
3-
import * as cdk from '@aws-cdk/core';
4-
import { AwsGenomicsCdkStack } from '../lib/aws-genomics-cdk-stack';
2+
import "source-map-support/register";
3+
import * as cdk from "@aws-cdk/core";
4+
import { AwsGenomicsCdkStack } from "../lib/aws-genomics-cdk-stack";
55
import * as config from "../app.config.json";
66

77
const env = {
8-
account: process.env.CDK_DEFAULT_ACCOUNT ?? config.accountID,
9-
region: process.env.CDK_DEFAULT_REGION ?? config.region
10-
}
8+
account: process.env.CDK_DEFAULT_ACCOUNT ?? config.accountID,
9+
region: process.env.CDK_DEFAULT_REGION ?? config.region,
10+
};
1111

1212
const app = new cdk.App();
13-
new AwsGenomicsCdkStack(app, 'AwsGenomicsCdkStack', {env: env});
13+
const genomicsStack = new AwsGenomicsCdkStack(
14+
app,
15+
`${config.projectName}CdkStack`,
16+
{
17+
env: env,
18+
}
19+
);
20+
21+
for (let i = 0; i < config.tags.length; i++) {
22+
cdk.Tags.of(genomicsStack).add(config.tags[i].name, config.tags[i].value);
23+
}
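
The entry point above does three things: it resolves the deployment environment, derives the stack name from `projectName`, and applies each configured tag. The sketch below restates that logic as small functions that run without the CDK installed. The names `resolveEnv`, `stackId`, and `applyTags` are hypothetical, and the `add` callback stands in for `cdk.Tags.of(genomicsStack).add`.

```typescript
// Minimal stand-in for the parts of app.config.json used by the entry point.
interface Tag {
  name: string;
  value: string;
}

interface EnvConfig {
  accountID: string;
  region: string;
  projectName: string;
  tags: Tag[];
}

// Prefer the CDK-provided environment variables and fall back to the config file.
// The ?? operator only falls back on null or undefined, so an empty string wins.
function resolveEnv(vars: Record<string, string | undefined>, config: EnvConfig) {
  return {
    account: vars["CDK_DEFAULT_ACCOUNT"] ?? config.accountID,
    region: vars["CDK_DEFAULT_REGION"] ?? config.region,
  };
}

// Stack name used in the diff above: the project name followed by "CdkStack".
function stackId(config: EnvConfig): string {
  return `${config.projectName}CdkStack`;
}

// Hand each tag to a callback; in the real app the callback would add it to the stack.
function applyTags(tags: Tag[], add: (name: string, value: string) => void): void {
  for (const tag of tags) {
    add(tag.name, tag.value);
  }
}

const sampleConfig: EnvConfig = {
  accountID: "111111111111",
  region: "us-west-2",
  projectName: "genomics",
  tags: [
    { name: "Environment", value: "production" },
    { name: "Project", value: "genomics-pipeline" },
  ],
};

// Only the region is set in this pretend environment, so the account comes from the config.
const resolved = resolveEnv({ CDK_DEFAULT_REGION: "eu-west-1" }, sampleConfig);
const applied: string[] = [];
applyTags(sampleConfig.tags, (name, value) => applied.push(name + "=" + value));

console.log(stackId(sampleConfig), resolved, applied);
```

Passing the environment in as a plain object, rather than reading `process.env` inside the function, keeps the fallback behaviour easy to exercise in isolation.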
