This repository was archived by the owner on Aug 9, 2023. It is now read-only.

Commit c5c43ee

Merge pull request #157 from aws-samples/develop/cdk-constructs
Develop/cdk constructs
2 parents 043728e + aafd09c commit c5c43ee


44 files changed (+10163, -0 lines)

src/aws-genomics-cdk/.gitignore

Lines changed: 9 additions & 0 deletions

*.js
!jest.config.js
*.d.ts
node_modules

# CDK asset staging directory
.cdk.staging
cdk.out
cdk.context.json

src/aws-genomics-cdk/.npmignore

Lines changed: 6 additions & 0 deletions

*.ts
!*.d.ts

# CDK asset staging directory
.cdk.staging
cdk.out

src/aws-genomics-cdk/README.md

Lines changed: 160 additions & 0 deletions

# Genomics Workflows on AWS - CDK code

Contained herein is a CDK application for creating AWS resources for working with large-scale biomedical data - e.g. genomics.

In order to deploy this CDK application, you'll need an environment with AWS CLI access and AWS CDK installed. A quick way to get an environment for running this application is to launch [AWS Cloud9](https://aws.amazon.com/cloud9/).

AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code with just a browser. It includes a code editor, debugger, and terminal. Cloud9 comes prepackaged with essential tools for popular programming languages, including JavaScript, Python, PHP, and more, so you don't need to install files or configure your development machine to start new projects.

## Download

Clone the repo to your local environment / Cloud9 environment.
```
git clone https://github.com/aws-samples/aws-genomics-workflows.git
```

## Configure

This CDK application requires an S3 bucket and a VPC. The application can create them as part of the deployment, or you can configure it to use your own S3 bucket and/or an existing VPC.

After cloning the repo, open, update, and save the application configuration file - `app.config.json`.

**accountID** - Your [AWS account id](https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html).
**region** - The [AWS region](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) you want to use for the deployment (e.g., us-east-1, us-west-2, etc.).
**projectName** - A name for the project that will be used as a prefix for the CDK stacks and constructs.
**tags** - A list of key/value strings to use as tags for the AWS resources created by this app.
**S3.existingBucket** - Set this to true to use an existing bucket; otherwise, set it to false to create a new bucket.
**S3.bucketName** - The bucket name to use or create.
**VPC.createVPC** - Set this to true to create a new VPC; otherwise, set it to false.
**VPC.VPCName** - The VPC name to use or create.
**VPC.maxAZs** - The number of Availability Zones to use when creating a new VPC.
**VPC.cidr** - The [CIDR block](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing) for the new VPC.
**VPC.cidrMask** - The [CIDR block subnet mask](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#Subnet_masks) for the new VPC.
**batch.defaultVolumeSize** - The default EBS volume size, in GiB, attached to the EC2 instances launched by AWS Batch.
**batch.spotMaxVCPUs** - The vCPU limit when using [Spot Instances](https://aws.amazon.com/ec2/spot/).
**batch.onDemendMaxVCPUs** - The vCPU limit when using On-Demand Instances.
**batch.instanceTypes** - The [EC2 instance types](https://aws.amazon.com/ec2/instance-types/) to use in AWS Batch.
**workflows** - A list of workflows that you would like to launch. There are demo workflows under the `lib/workflows` directory. To add a workflow, update the code in the `lib/aws-genomics-cdk-stack.ts` file (look for the workflows section).

```
{
  "accountID": "111111111111",
  "region": "us-west-2",
  "projectName": "genomics",
  "tags": [{
      "name": "Environment",
      "value": "production"
    },
    {
      "name": "Project",
      "value": "genomics-pipeline"
    }
  ],
  "S3": {
    "existingBucket": true,
    "bucketName": "YOUR-BUCKET-NAME"
  },
  "VPC": {
    "createVPC": true,
    "VPCName": "genomics-vpc",
    "maxAZs": 2,
    "cidr": "10.0.0.0/16",
    "cidrMask": 24
  },
  "batch": {
    "defaultVolumeSize": 100,
    "spotMaxVCPUs": 128,
    "onDemendMaxVCPUs": 128,
    "instanceTypes": [
      "c4.large",
      "c4.xlarge",
      "c4.2xlarge",
      "c4.4xlarge",
      "c4.8xlarge",
      "c5.large",
      "c5.xlarge",
      "c5.2xlarge",
      "c5.4xlarge",
      "c5.9xlarge",
      "c5.12xlarge",
      "c5.18xlarge",
      "c5.24xlarge"
    ]
  },
  "workflows": [{
    "name": "variantCalling",
    "spot": true
  }]
}
```

## Deploy

To deploy the CDK application, use the command line and make sure you are in the root folder of the CDK application (`src/aws-genomics-cdk`).

First, install the necessary Node.js modules:
```
npm install
```

Then deploy the application:
```
# The "--require-approval never" parameter skips the prompt to approve the
# creation of specific resources, such as IAM roles. Remove this parameter
# if you want to be prompted before these resources are created.
cdk deploy --all --require-approval never
```

## Stacks

| File | Description |
| :--- | :---------- |
| `lib/aws-genomics-cdk-stack.ts` | The main stack that initializes the rest of the stacks |
| `lib/vpc/vpc-stack.ts` | An optional stack that will launch a VPC |
| `lib/batch/batch-stack.ts` | An AWS Batch stack with 2 compute environments (Spot and On-Demand) and 2 queues (default and high priority) |
| `lib/batch/batch-iam-stack.ts` | An IAM stack with roles and policies required for running AWS Batch |
| `lib/workflows` | A folder containing pipeline stacks |

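As a rough orientation to how these stacks fit together, the sketch below shows the usual CDK composition pattern: a VPC stack exposes its VPC, and the Batch stack consumes it. The class names, props, and the security group stand-in are illustrative only, not the repository's actual stack APIs.

```typescript
// Hypothetical sketch of stack composition (CDK v1 style); not the repo's actual API.
import * as cdk from "@aws-cdk/core";
import * as ec2 from "@aws-cdk/aws-ec2";

class ExampleVpcStack extends cdk.Stack {
  public readonly vpc: ec2.Vpc;

  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    // Mirrors the VPC.* settings in app.config.json.
    this.vpc = new ec2.Vpc(this, "Vpc", { maxAzs: 2, cidr: "10.0.0.0/16" });
  }
}

class ExampleBatchStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, vpc: ec2.IVpc, props?: cdk.StackProps) {
    super(scope, id, props);
    // Compute environments, job queues, and IAM roles would be created here,
    // parameterized by the VPC produced by the VPC stack.
    new ec2.SecurityGroup(this, "BatchSecurityGroup", { vpc });
  }
}

const app = new cdk.App();
const vpcStack = new ExampleVpcStack(app, "genomicsVpcStack");
new ExampleBatchStack(app, "genomicsBatchStack", vpcStack.vpc);
```
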
## Constructs

| File | Description |
| :--- | :---------- |
| `lib/batch/batch-compute-environmnet-construct.ts` | A construct for creating an [AWS Batch compute environment](https://docs.aws.amazon.com/batch/latest/userguide/compute_environments.html) |
| `lib/batch/job-queue-construct.ts` | A construct for creating an [AWS Batch job queue](https://docs.aws.amazon.com/batch/latest/userguide/job_queues.html) |
| `lib/batch/launch-template-construct.ts` | A construct for creating an [EC2 launch template](https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchTemplates.html) |
| `lib/workflows/genomics-task-construct.ts` | A construct for creating a Step Functions task that submits an AWS Batch job |
| `lib/workflows/job-definition-construct.ts` | A construct for creating an [AWS Batch job definition](https://docs.aws.amazon.com/batch/latest/userguide/job_definitions.html) to be used as a task in Step Functions |

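The constructs listed above follow the standard CDK pattern of wrapping a group of resources behind a reusable class that stacks instantiate with project-specific props. The sketch below only illustrates that pattern; the class, props, and the SQS stand-in resource are not part of this repository.

```typescript
// Hypothetical construct sketch (CDK v1 style); names and resources are illustrative.
import * as cdk from "@aws-cdk/core";
import * as sqs from "@aws-cdk/aws-sqs"; // stand-in resource for the example

export interface ExampleConstructProps {
  readonly projectName: string;
}

export class ExampleConstruct extends cdk.Construct {
  public readonly queue: sqs.Queue;

  constructor(scope: cdk.Construct, id: string, props: ExampleConstructProps) {
    super(scope, id);

    // A construct bundles resources behind a single unit that stacks create
    // with project-specific props (here, a name prefix from app.config.json).
    this.queue = new sqs.Queue(this, "Queue", {
      queueName: `${props.projectName}-example-queue`,
    });
  }
}
```
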
Lines changed: 49 additions & 0 deletions

{
  "accountID": "111111111111",
  "region": "us-west-2",
  "projectName": "genomics",
  "tags": [{
      "name": "Environment",
      "value": "production"
    },
    {
      "name": "Project",
      "value": "genomics-pipeline"
    }
  ],
  "S3": {
    "existingBucket": true,
    "bucketName": "YOUR-BUCKET-NAME"
  },
  "VPC": {
    "createVPC": true,
    "VPCName": "genomics-vpc",
    "maxAZs": 2,
    "cidr": "10.0.0.0/16",
    "cidrMask": 24
  },
  "batch": {
    "defaultVolumeSize": 100,
    "spotMaxVCPUs": 128,
    "onDemendMaxVCPUs": 128,
    "instanceTypes": [
      "c4.large",
      "c4.xlarge",
      "c4.2xlarge",
      "c4.4xlarge",
      "c4.8xlarge",
      "c5.large",
      "c5.xlarge",
      "c5.2xlarge",
      "c5.4xlarge",
      "c5.9xlarge",
      "c5.12xlarge",
      "c5.18xlarge",
      "c5.24xlarge"
    ]
  },
  "workflows": [{
    "name": "variantCalling",
    "spot": true
  }]
}
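
For reference, the shape of this configuration file can be described with a TypeScript interface like the one below. This is a hypothetical typing added here to document the expected fields; it is not part of the repository.

```typescript
// Hypothetical typing of app.config.json; field names mirror the file above.
export interface AppConfig {
  accountID: string;
  region: string;
  projectName: string;
  tags: { name: string; value: string }[];
  S3: { existingBucket: boolean; bucketName: string };
  VPC: {
    createVPC: boolean;
    VPCName: string;
    maxAZs: number;
    cidr: string;
    cidrMask: number;
  };
  batch: {
    defaultVolumeSize: number;
    spotMaxVCPUs: number;
    onDemendMaxVCPUs: number; // key spelling matches the config file
    instanceTypes: string[];
  };
  workflows: { name: string; spot: boolean }[];
}
```
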
Lines changed: 33 additions & 0 deletions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": [
        "s3:Delete*",
        "s3:PutBucket*"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKET_NAME"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket*"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKET_NAME"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "arn:aws:s3:::BUCKET_NAME/*"
      ]
    }
  ]
}
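
The BUCKET_NAME placeholder suggests this policy is used as a template, with the configured bucket name substituted in before it is attached. A minimal sketch of how that substitution could be done from the CDK app follows; the file path and helper function are assumptions, not the repository's actual code.

```typescript
// Hypothetical helper: render the bucket policy template above for a given bucket.
import * as fs from "fs";
import * as iam from "@aws-cdk/aws-iam";

export function bucketPolicyDocument(bucketName: string): iam.PolicyDocument {
  // The template path is an assumption for this sketch.
  const template = fs.readFileSync("lib/iam/policies/bucket-policy.json", "utf8");
  // Replace every BUCKET_NAME placeholder with the configured bucket name.
  const rendered = template.replace(/BUCKET_NAME/g, bucketName);
  return iam.PolicyDocument.fromJson(JSON.parse(rendered));
}
```
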
Lines changed: 99 additions & 0 deletions

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==BOUNDARY=="

--==BOUNDARY==
Content-Type: text/cloud-config; charset="us-ascii"

#cloud-config
repo_update: true
repo_upgrade: security

packages:
- jq
- btrfs-progs
- sed
- git
- amazon-ssm-agent
- unzip
- amazon-cloudwatch-agent

write_files:
- permissions: '0644'
  path: /opt/aws/amazon-cloudwatch-agent/etc/config.json
  content: |
    {
      "agent": {
        "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
      },
      "logs": {
        "logs_collected": {
          "files": {
            "collect_list": [
              {
                "file_path": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
                "log_group_name": "/aws/ecs/container-instance/${Namespace}",
                "log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/amazon-cloudwatch-agent.log"
              },
              {
                "file_path": "/var/log/cloud-init.log",
                "log_group_name": "/aws/ecs/container-instance/${Namespace}",
                "log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/cloud-init.log"
              },
              {
                "file_path": "/var/log/cloud-init-output.log",
                "log_group_name": "/aws/ecs/container-instance/${Namespace}",
                "log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/cloud-init-output.log"
              },
              {
                "file_path": "/var/log/ecs/ecs-init.log",
                "log_group_name": "/aws/ecs/container-instance/${Namespace}",
                "log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/ecs-init.log"
              },
              {
                "file_path": "/var/log/ecs/ecs-agent.log",
                "log_group_name": "/aws/ecs/container-instance/${Namespace}",
                "log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/ecs-agent.log"
              },
              {
                "file_path": "/var/log/ecs/ecs-volume-plugin.log",
                "log_group_name": "/aws/ecs/container-instance/${Namespace}",
                "log_stream_name": "/aws/ecs/container-instance/${Namespace}/{instance_id}/ecs-volume-plugin.log"
              }
            ]
          }
        }
      }
    }

runcmd:

# start the amazon-cloudwatch-agent
- /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/etc/config.json

# install aws-cli v2 and copy the static binary in an easy to find location for bind-mounts into containers
- curl -s "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "/tmp/awscliv2.zip"
- unzip -q /tmp/awscliv2.zip -d /tmp
- /tmp/aws/install -b /usr/bin

# check that the aws-cli was actually installed. if not shutdown (terminate) the instance
- command -v aws || shutdown -P now

- mkdir -p /opt/aws-cli/bin
- cp -a $(dirname $(find /usr/local/aws-cli -name 'aws' -type f))/. /opt/aws-cli/bin/

# set environment variables for provisioning
- export GWFCORE_NAMESPACE=${Namespace}
- export INSTALLED_ARTIFACTS_S3_ROOT_URL=$(aws ssm get-parameter --name /gwfcore/${Namespace}/installed-artifacts/s3-root-url --query 'Parameter.Value' --output text)

# enable ecs spot instance draining
- echo ECS_ENABLE_SPOT_INSTANCE_DRAINING=true >> /etc/ecs/ecs.config

# pull docker images only if missing
- echo ECS_IMAGE_PULL_BEHAVIOR=prefer-cached >> /etc/ecs/ecs.config

- cd /opt
- aws s3 sync $INSTALLED_ARTIFACTS_S3_ROOT_URL/ecs-additions ./ecs-additions
- chmod a+x /opt/ecs-additions/provision.sh
- /opt/ecs-additions/provision.sh

--==BOUNDARY==--
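
This MIME document is EC2 user data for the Batch container instances, and the ${Namespace} tokens look like CloudFormation Fn::Sub placeholders. A hedged sketch of how such user data might be wired into an EC2 launch template from the CDK follows; the file path, construct id, and namespace handling are assumptions, not the repository's actual code.

```typescript
// Hypothetical sketch: attach the user data above to an EC2 launch template.
import * as fs from "fs";
import * as cdk from "@aws-cdk/core";
import * as ec2 from "@aws-cdk/aws-ec2";

export function genomicsLaunchTemplate(
  scope: cdk.Construct,
  namespace: string
): ec2.CfnLaunchTemplate {
  // The user-data file path is an assumption for this sketch.
  const raw = fs.readFileSync("lib/batch/launch-template-user-data.txt", "utf8");
  // Resolve ${Namespace} at deploy time and base64-encode, as EC2 requires.
  const userData = cdk.Fn.base64(cdk.Fn.sub(raw, { Namespace: namespace }));

  return new ec2.CfnLaunchTemplate(scope, "GenomicsLaunchTemplate", {
    launchTemplateData: { userData },
  });
}
```
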
Lines changed: 23 additions & 0 deletions

#!/usr/bin/env node
import "source-map-support/register";
import * as cdk from "@aws-cdk/core";
import { AwsGenomicsCdkStack } from "../lib/aws-genomics-cdk-stack";
// Importing JSON requires "resolveJsonModule": true in tsconfig.json.
import * as config from "../app.config.json";

// Prefer the account/region resolved by the CDK CLI; fall back to app.config.json.
const env = {
  account: process.env.CDK_DEFAULT_ACCOUNT ?? config.accountID,
  region: process.env.CDK_DEFAULT_REGION ?? config.region,
};

const app = new cdk.App();
const genomicsStack = new AwsGenomicsCdkStack(
  app,
  `${config.projectName}CdkStack`,
  {
    env: env,
  }
);

// Apply the tags from app.config.json to every resource in the stack.
for (let i = 0; i < config.tags.length; i++) {
  cdk.Tags.of(genomicsStack).add(config.tags[i].name, config.tags[i].value);
}
