|
1 | | -# Welcome to your CDK TypeScript project! |
| 1 | +# Genomics Workflows on AWS - CDK code |
2 | 2 |
|
3 | | -This is a blank project for TypeScript development with CDK. |
| 3 | +This is a CDK application that creates AWS resources for working with large-scale biomedical data, such as genomics data. |
4 | 4 |
|
5 | | -The `cdk.json` file tells the CDK Toolkit how to execute your app. |
| 5 | +In order to deploy this CDK application, you'll need an environment with AWS CLI access and AWS CDK installed. A quick |
| 6 | +way to get an environment for running this application is to launch [AWS Cloud9](https://aws.amazon.com/cloud9/). |
6 | 7 |
|
7 | | -## Useful commands |
| 8 | +AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code |
| 9 | +with just a browser. It includes a code editor, debugger, and terminal. Cloud9 comes prepackaged with essential |
| 10 | +tools for popular programming languages, including JavaScript, Python, PHP, and more, so you don’t need to install |
| 11 | +files or configure your development machine to start new projects. |
| 12 | + |
| 13 | + |
| 14 | +## Download |
| 15 | + |
| 16 | +Clone the repo to your local environment / Cloud9 environment. |
| 17 | +``` |
| 18 | +git clone https://github.com/aws-samples/aws-genomics-workflows.git |
| 19 | +``` |
| 20 | + |
| 21 | +## Configure |
| 22 | + |
| 23 | +This CDK application requires an S3 bucket and a VPC. The application can create them as part of the deployment or |
| 24 | +you could configure the application to use your own S3 bucket and/or existing VPC. |
| 25 | + |
| 26 | +After cloning the repo, open, update, and save the application configuration file - `app.config.json`. |
| 27 | + |
| 28 | +**accountID** - Your [AWS account id](https://docs.aws.amazon.com/IAM/latest/UserGuide/console_account-alias.html). |
| 29 | +**region** - The [AWS region](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) |
| 30 | +you want to use for the deployment (e.g., us-east-1, us-west-2, etc.). |
| 31 | +**S3.existingBucket** - If you want to use an existing bucket, set this value to true, otherwise set it to false to |
| 32 | +create a new bucket. |
| 33 | +**S3.bucketName** - The bucket name to use or create. |
| 34 | +**VPC.createVPC** - If you want to create a new VPC, set this to true, otherwise set to false. |
| 35 | +**VPC.existingVPCName** - If you set the createVPC option to false, you must provide a valid VPC name to use in the |
| 36 | +same region of the deployment. |
| 37 | +**VPC.maxAZs** - The number of Availability Zones to use when creating a new VPC. |
| 38 | +**VPC.cidr** - The [CIDR block](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing) for the new VPC. |
| 39 | +**VPC.cidrMask** - The [CIDR block subnet mask](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#Subnet_masks) |
| 40 | +for the new VPC. |
| 41 | +**batch.defaultVolumeSize** - The default EBS volume size in GiB to be attached to the EC2 instance under AWS Batch. |
| 42 | +**batch.spotMaxVCPUs** - The maximum number of vCPUs when using [Spot Instances](https://aws.amazon.com/ec2/spot/). |
| 43 | +**batch.onDemendMaxVCPUs** - The maximum number of vCPUs when using On-Demand Instances. |
| 44 | +**batch.instanceTypes** - The [EC2 instance types](https://aws.amazon.com/ec2/instance-types/) to use in AWS Batch. |
| 45 | +**stepFunctions.launchDemoPipeline** - If set to true, the application will deploy a demo pipeline using AWS Step Functions. |
| 46 | +**stepFunctions.jobDefinitions** - List of parameters for the demo application bioinformatics tools. |
| 47 | +``` |
| 48 | +{ |
| 49 | + "accountID": "111111111111", |
| 50 | + "region": "us-west-2", |
| 51 | + "S3": { |
| 52 | + "existingBucket": true, |
| 53 | + "bucketName": "" |
| 54 | + }, |
| 55 | + "VPC": { |
| 56 | + "createVPC": true, |
| 57 | + "existingVPCName": "", |
| 58 | + "maxAZs": 2, |
| 59 | + "cidr": "10.0.0.0/16", |
| 60 | + "cidrMask": 24 |
| 61 | + }, |
| 62 | + "batch": { |
| 63 | + "defaultVolumeSize": 100, |
| 64 | + "spotMaxVCPUs": 128, |
| 65 | + "onDemendMaxVCPUs": 128, |
| 66 | + "instanceTypes": [ |
| 67 | + "c4.large", |
| 68 | + "c4.xlarge", |
| 69 | + "c4.2xlarge", |
| 70 | + "c4.4xlarge", |
| 71 | + "c4.8xlarge", |
| 72 | + "c5.large", |
| 73 | + "c5.xlarge", |
| 74 | + "c5.2xlarge", |
| 75 | + "c5.4xlarge", |
| 76 | + "c5.9xlarge", |
| 77 | + "c5.12xlarge", |
| 78 | + "c5.18xlarge", |
| 79 | + "c5.24xlarge" |
| 80 | + ] |
| 81 | + }, |
| 82 | + "stepFunctions": { |
| 83 | + "launchDemoPipeline": true, |
| 84 | + "jobDefinitions": { |
| 85 | + "fastqc": { |
| 86 | + "repository": "genomics/fastqc", |
| 87 | + "memoryLimit": 8000, |
| 88 | + "vcpus": 4, |
| 89 | + "spot": true, |
| 90 | + "retryAttempts":1, |
| 91 | + "timeout": 600 |
| 92 | + }, |
| 93 | + "minimap2": { |
| 94 | + "repository": "genomics/minimap2", |
| 95 | + "memoryLimit": 16000, |
| 96 | + "vcpus": 8, |
| 97 | + "spot": true, |
| 98 | + "retryAttempts":1, |
| 99 | + "timeout": 3600 |
| 100 | + } |
| 101 | + } |
| 102 | + } |
| 103 | +} |
| 104 | +``` |
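Before deploying, it can help to sanity-check the configuration values. The following is a minimal sketch, not part of the repository: the `AppConfig` interface, `validateConfig`, and `subnetCapacity` are hypothetical names, and the rules they enforce (a 12-digit account ID, a non-empty bucket name, a VPC name when `createVPC` is false) are assumptions drawn from the parameter descriptions above rather than checks the CDK application is known to perform.

```typescript
// Hypothetical shape of app.config.json, limited to the fields checked below.
interface AppConfig {
  accountID: string;
  region: string;
  S3: { existingBucket: boolean; bucketName: string };
  VPC: {
    createVPC: boolean;
    existingVPCName: string;
    maxAZs: number;
    cidr: string;
    cidrMask: number;
  };
}

// Returns a list of human-readable problems; an empty list means none were found.
function validateConfig(config: AppConfig): string[] {
  const problems: string[] = [];
  // AWS account IDs are 12-digit numbers.
  if (!/^\d{12}$/.test(config.accountID)) {
    problems.push("accountID must be a 12-digit string");
  }
  // Assumption: a bucket name is needed whether the bucket is reused or created.
  if (config.S3.bucketName.trim() === "") {
    problems.push("S3.bucketName must not be empty");
  }
  if (!config.VPC.createVPC && config.VPC.existingVPCName.trim() === "") {
    problems.push("VPC.existingVPCName is required when VPC.createVPC is false");
  }
  if (config.VPC.createVPC) {
    // The subnet mask cannot be shorter than the prefix of the VPC block itself.
    const prefix = Number(config.VPC.cidr.split("/")[1]);
    if (!Number.isInteger(prefix) || prefix > config.VPC.cidrMask) {
      problems.push("VPC.cidr must include a prefix no longer than VPC.cidrMask");
    }
  }
  return problems;
}

// Number of /cidrMask subnets that fit inside the VPC CIDR block.
function subnetCapacity(cidr: string, cidrMask: number): number {
  const prefix = Number(cidr.split("/")[1]);
  return 2 ** (cidrMask - prefix);
}
```

With the sample values above, `subnetCapacity("10.0.0.0/16", 24)` evaluates to 256, and `validateConfig` would report the empty `bucketName` as the only problem.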
| 105 | + |
| 106 | +## Deploy |
| 107 | + |
| 108 | +To deploy the CDK application, use the command line and make sure you are in the root folder of the CDK application |
| 109 | +(`src/aws-genomics-cdk`). |
| 110 | +First install the necessary Node.js modules: |
| 111 | +``` |
| 112 | +npm install |
| 113 | +``` |
| 114 | + |
| 115 | +Then deploy the application. |
| 116 | +``` |
| 117 | +# The "--require-approval never" parameter will skip the question to approve specific resource creation, |
| 118 | +# such as IAM roles. You can remove this parameter if you want to be prompted to approve creating these |
| 119 | +# resources. |
| 120 | +cdk deploy --all --require-approval never |
| 121 | +``` |
| 122 | + |
| 123 | + |
| 124 | +## Stacks |
| 125 | + |
| 126 | +| File | Description | |
| 127 | +| :--- | :---------- | |
| 128 | +| `lib/aws-genomics-cdk-stack.ts` | The main stack that initializes the rest of the stacks | |
| 129 | +| `lib/vpc/vpc-stack.ts` | An optional stack that will launch a VPC | |
| 130 | +| `lib/batch/batch-stack.ts` | An AWS Batch stack with 2 compute environments (spot and on-demand) and 2 queues (default and high priority) | |
| 131 | +| `lib/batch/batch-iam-stack.ts` | An IAM stack with roles and policies required for running AWS Batch | |
| 132 | +| `lib/step-functions/genomics-state-machine-stack.ts` | A Step Functions demo that runs a pipeline | |
| 133 | + |
| 134 | + |
| 135 | +## Constructs |
| 136 | + |
| 137 | +| File | Description | |
| 138 | +| :--- | :---------- | |
| 139 | +| `lib/batch/batch-compute-environmnet-construct.ts` | A construct for creating an [AWS Batch compute environment](https://docs.aws.amazon.com/batch/latest/userguide/compute_environments.html) | |
| 140 | +| `lib/batch/job-queue-construct.ts` | A construct for creating an [AWS Batch job queue](https://docs.aws.amazon.com/batch/latest/userguide/job_queues.html) | |
| 141 | +| `lib/batch/launch-template-construct.ts` | A construct for creating an [EC2 launch template](https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchTemplates.html) | |
| 142 | +| `lib/step-functions/genomics-task-construct.ts` | A construct for creating a step function task that submits a batch job | |
| 143 | +| `lib/step-functions/job-definition-construct.ts` | A construct for creating an [AWS Batch job definition](https://docs.aws.amazon.com/batch/latest/userguide/job_definitions.html) to be used as a task in step functions | |
8 | 144 |
|
9 | | - * `npm run build` compile typescript to js |
10 | | - * `npm run watch` watch for changes and compile |
11 | | - * `npm run test` perform the jest unit tests |
12 | | - * `cdk deploy` deploy this stack to your default AWS account/region |
13 | | - * `cdk diff` compare deployed stack with current state |
14 | | - * `cdk synth` emits the synthesized CloudFormation template |
|