Commit 92a53eb

author
EC2 Default User
committed
updating install instructions.
1 parent 7342485 commit 92a53eb

1 file changed

+19
-21
lines changed

README.md

Lines changed: 19 additions & 21 deletions
@@ -2,47 +2,45 @@
22

33
Companion code for upcoming AWS blogpost on enrolling chembl and opentargets into a data lake on AWS<br/>
44

5-
<br/>
5+
<h2 id='HPG9CAlPR6i'>To install this in your own AWS account:</h2>
66

7-
To install this in your own AWS account:<br/>
7+
Your local machine needs the AWS CLI installed, along with IAM permissions set up (through an IAM role or a .aws/credentials file). I like to use Cloud9 as my IDE because it comes with both of those already set up.<br/>
88

99
<br/>
1010

11-
<div style="" data-section-style='6' class=""><ul id='HPG9CAdvuXu'><li id='HPG9CAsuX2r' class='' value='1'>Clone this repo
12-
13-
<br/></li></ul></div><pre id='HPG9CAKfUT3'>git clone <a href="https://github.com/paulu-aws/chembl-opentargets-data-lake-example.git">https://github.com/paulu-aws/chembl-opentargets-data-lake-example.git</a></pre>
11+
Run the following commands:<br/>
1412

15-
<div style="" data-section-style='6' class="list-numbering-continue"><ul id='HPG9CAPCYqe'><li id='HPG9CASYAsv' class='' value='1'>Install the CDK dependencies
13+
<pre id='HPG9CAKfUT3'>git clone https://github.com/paulu-aws/chembl-opentargets-data-lake-example.git<br>cd chembl-opentargets-data-lake-example<br>./InstallCdkDependencies.sh<br>./DeployChemblOpenTargetsEnv.sh</pre>
1614

17-
<br/></li></ul></div><pre id='HPG9CAwKcSI'>./InstallCdkDependencies.sh</pre>
15+
Wait for Chembl and OpenTargets to be ‘staged’ into the baseline stack.<br/>
1816

19-
<div style="" data-section-style='6' class="list-numbering-continue"><ul id='HPG9CAHILzl'><li id='HPG9CAK6MvP' class='' value='1'>Deploy the CDK Stacks
17+
<br/>
2018

21-
<br/></li></ul></div><pre id='HPG9CAeahes'>./DeployChemblOpenTargetsEnv.sh</pre>
19+
The ‘baseline stack’ in the CDK application spins up a VPC with an S3 bucket (for OpenTargets) and an RDS Postgres instance (for Chembl). It also spins up a small helper EC2 instance that downloads those assets from <a href="http://OpenTargets.org">OpenTargets.org</a> and EMBL-EBI and stages them in their ‘raw’ form.<br/>
2220

23-
<div class="list-numbering-restart-at" data-section-style='6' style="--indent0: 4"><ul id='HPG9CAS1C4M'><li id='HPG9CA0az5T' class='parent' value='1'>Wait for Chembl and OpenTargets to be ‘staged’ into the baseline stack.
21+
<br/>
2422

25-
<br/></li><ul><li id='HPG9CAimAzR' class=''>The ‘baseline stack’ in the CDK application spins up a VPC with an S3 bucket (for OpenTargets) and an RDS Postgres instance (for Chembl). It also spins up a little helper EC2 instance that stages those assets in their ‘raw’ form<a href="http://OpenTargets.org"> OpenTargets.org</a> and EMBL-EBI into your account.
23+
Go to Systems Manager in the AWS console, and then the ‘Run Command’ section. You will see the currently running command documents. <br/>
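The same check can be scripted instead of done in the console. The following is a minimal sketch, assuming Python with boto3 is available; the helper name `list_in_progress_commands` is made up for illustration and is not part of this repo. It pages through the Systems Manager `list_commands` API, filtered to in-progress commands.

```python
from typing import Any, Dict, List


def list_in_progress_commands(ssm_client: Any) -> List[Dict[str, str]]:
    """Summarize Run Command invocations that are still in progress.

    `ssm_client` is expected to behave like boto3.client("ssm").
    This helper is a hypothetical illustration, not code from the repo.
    """
    summaries: List[Dict[str, str]] = []
    kwargs: Dict[str, Any] = {
        "Filters": [{"key": "Status", "value": "InProgress"}]
    }
    while True:
        page = ssm_client.list_commands(**kwargs)
        for command in page.get("Commands", []):
            summaries.append(
                {
                    "command_id": command["CommandId"],
                    "document_name": command["DocumentName"],
                    "status": command["Status"],
                }
            )
        # Follow pagination until the service stops returning a token.
        token = page.get("NextToken")
        if not token:
            return summaries
        kwargs["NextToken"] = token
```

With credentials configured, this could be called as `list_in_progress_commands(boto3.client("ssm"))`. An empty list would suggest the staging commands have finished (or have not started yet).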
2624

27-
<br/></li><li id='HPG9CA9PXT3' class=''>Go to Systems Manager in the AWS console, and then the ‘Run Command’ section. You will see the currently running command documents. 
25+
<div data-section-style='11' style='max-width:168%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/x4lfduQeC3Ww-DyK8loIAg?a=6aMBuWAgnWaZ5pQaJndaM06ob734VpmiCI5xfguyPaca' id='HPG9CA9WNsB' alt='' width='1276' height='612'></img></div>It takes about an hour for Chembl to build. If you get impatient and want to see the progress in real time, go to ‘Session Manager’ in the Systems Manager console, click the ‘Start session’ button, choose the ‘ChembDbImportInstance’ radio button, and click the ‘Start Session’ button.<br/>
2826

29-
<br/></li><li id='HPG9CA9WNsB' class=''><span data-section-style='11' style='max-width:168%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/x4lfduQeC3Ww-DyK8loIAg?a=6aMBuWAgnWaZ5pQaJndaM06ob734VpmiCI5xfguyPaca' id='HPG9CA9WNsB' alt='' width='1276' height='612'></img></span></li><li id='HPG9CAQFK7w' class=''>It takes about an hour for Chembl to build. If you get impatient and want to see the progress in real time, go to ‘Session Manager’ in the Systems Manager console, click the ‘Start session’ button, choose the ‘ChembDbImportInstance’ radio button, and click the ‘Start Session’ button.
27+
<div data-section-style='11' style='max-width:155%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/Fj7sA3VuIuvdPOHl017Xcg?a=EYFlHaKY8weEGFezDR4ld3sEhBMWl88afFdDjJQ15H8a' id='HPG9CADqhgF' alt='' width='1242' height='666'></img></div>That will open an SSM session window. Run the following command to tail the log output.<br/>
3028

31-
<br/></li><li id='HPG9CADqhgF' class=''><span data-section-style='11' style='max-width:155%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/Fj7sA3VuIuvdPOHl017Xcg?a=EYFlHaKY8weEGFezDR4ld3sEhBMWl88afFdDjJQ15H8a' id='HPG9CADqhgF' alt='' width='1242' height='666'></img></span></li><li id='HPG9CAByHqU' class=''>That will open a SSM session window run the following command
29+
<pre id='HPG9CAuziva'>tail -f /home/ssm-user/progressLog</pre>
3230

33-
<br/></li></ul></ul></div><pre id='HPG9CAuziva'> <code>tail -f progressLog</code></pre>
31+
<div data-section-style='11' class='tall' style='max-width:147%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/rMcRhjzUcIGQVYeBFxup4Q?a=2NRscRrktD9kLK7rDqqD9bO3aXtTYttCeaEWLwDXVgIa' id='HPG9CAgo8Yy' alt='' width='1115' height='1030'></img></div><br/>
3432

35-
<div style="" data-section-style='6' class=""><ul id='HPG9CAaYWQI'><li id='HPG9CAgo8Yy' class='' value='1'><span data-section-style='11' class='tall' style='max-width:147%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/rMcRhjzUcIGQVYeBFxup4Q?a=2NRscRrktD9kLK7rDqqD9bO3aXtTYttCeaEWLwDXVgIa' id='HPG9CAgo8Yy' alt='' width='1115' height='1030'></img></span></li></ul></div><br/>
33+
<h2 id='HPG9CAe1Pmp'>Enroll Chembl and OpenTargets into the data lake</h2>
3634

37-
<div style="" data-section-style='6' class="list-numbering-continue"><ul id='HPG9CA8jP0B'><li id='HPG9CA6hIcf' class='' value='1'>Once the database has finished importing, go to Glue in the AWS console, and then the “Workflows” section
35+
Once the database has finished importing, go to Glue in the AWS console, and then the “Workflows” section<br/>
3836

39-
<br/></li><li id='HPG9CADnepH' class=''><span data-section-style='11' style='max-width:147%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/K0liqaLzOGNHdODU_fN_MA?a=GQQahtSxVQNvaU6AkEjATwCE0WJglr630LH3bZcngB0a' id='HPG9CADnepH' alt='' width='1177' height='631'></img></span></li><li id='HPG9CApeYdR' class=''>Select the openTargetsDataLakeEnrollment workflow, and click ‘Actions’, then 'Run'
37+
<div data-section-style='11' style='max-width:147%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/K0liqaLzOGNHdODU_fN_MA?a=GQQahtSxVQNvaU6AkEjATwCE0WJglr630LH3bZcngB0a' id='HPG9CADnepH' alt='' width='1177' height='631'></img></div>Select the openTargetsDataLakeEnrollment workflow, click ‘Actions’, then ‘Run’.<br/>
4038

41-
<br/></li><li id='HPG9CAgkuAH' class=''><span data-section-style='11' style='max-width:147%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/UV0-ZlwmK_KF9L9MfaUgfA?a=97k7vof4qlurzy3zSsmPVhomgCpRUJfREq8UCNZSzt4a' id='HPG9CAgkuAH' alt='' width='1177' height='631'></img></span></li><li id='HPG9CA1tRSd' class=''>Do the same for the chemblDataLakeEnrollmentWorkflow
39+
<div data-section-style='11' style='max-width:147%'><img src='https://quip-amazon.com/blob/HPG9AAwumxR/UV0-ZlwmK_KF9L9MfaUgfA?a=97k7vof4qlurzy3zSsmPVhomgCpRUJfREq8UCNZSzt4a' id='HPG9CAgkuAH' alt='' width='1177' height='631'></img></div>Do the same for the chemblDataLakeEnrollmentWorkflow<br/>
4240

43-
<br/></li><li id='HPG9CA4pCXY' class=''>Wait for the workflows to finish.
41+
Wait for the workflows to finish.<br/>
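Starting a workflow and waiting for it can also be scripted. The sketch below assumes Python with boto3; `run_workflow_and_wait` is a hypothetical helper, not something shipped in this repo. It starts a Glue workflow run and polls its status until Glue reports a terminal state.

```python
import time
from typing import Any, Callable

# Workflow run statuses after which Glue does no further work on the run.
TERMINAL_STATUSES = {"COMPLETED", "STOPPED", "ERROR"}


def run_workflow_and_wait(
    glue_client: Any,
    workflow_name: str,
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start a Glue workflow run and block until it reaches a terminal status.

    `glue_client` is expected to behave like boto3.client("glue").
    Returns the final status string. Hypothetical helper for illustration.
    """
    run_id = glue_client.start_workflow_run(Name=workflow_name)["RunId"]
    while True:
        run = glue_client.get_workflow_run(Name=workflow_name, RunId=run_id)
        status = run["Run"]["Status"]
        if status in TERMINAL_STATUSES:
            return status
        # Still RUNNING or STOPPING: wait before asking again.
        sleep(poll_seconds)
```

For example, `run_workflow_and_wait(boto3.client("glue"), "chemblDataLakeEnrollmentWorkflow")` would return `"COMPLETED"` on success. The workflow name here is copied from the instructions above and is an assumption; confirm the exact names in the Glue console. Note that a run can report `COMPLETED` even if individual jobs inside it failed, so checking the run details in the console is still worthwhile.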
4442

45-
<br/></li></ul></div><br/>
43+
<br/>
4644

4745
You can now query opentargets and chembl data through Athena!<br/>
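For querying from code rather than the Athena console, here is a minimal sketch assuming Python with boto3. The helper `run_athena_query` is hypothetical, and the database name, table name, and S3 output location in the usage note are placeholders, since the README does not state them.

```python
import time
from typing import Any, Callable, List


def run_athena_query(
    athena_client: Any,
    sql: str,
    database: str,
    output_location: str,
    poll_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Run a query in Athena and return the first page of results as rows of strings.

    `athena_client` is expected to behave like boto3.client("athena").
    For SELECT statements the first returned row holds the column names.
    Hypothetical helper for illustration.
    """
    query_id = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until Athena reports a terminal state for the query.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        sleep(poll_seconds)

    # Only the first page is fetched; larger results need NextToken handling.
    result = athena_client.get_query_results(QueryExecutionId=query_id)
    return [
        [column.get("VarCharValue", "") for column in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

A call might look like `run_athena_query(boto3.client("athena"), "SELECT * FROM some_table LIMIT 10", "some_database", "s3://your-results-bucket/athena/")`, where `some_table`, `some_database`, and the bucket are placeholders to replace with the names the Glue workflows created in your account.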
4846
