ckennedy-nmdp edited this page Sep 10, 2014 · 5 revisions

This tutorial assumes you have an AWS account registered with the NMDP. TODO: Jeremy needs to fill in some of this, or I can link to his documentation.

Get the code

git clone [email protected]:/parallel_genomic.git

This creates a local clone (working copy) of the GitHub repository, which contains several shell scripts for parallel execution of pipeline components.

View the sample data

Public sample data from the Sequence Read Archive (SRA) are provided here:
/mnt/common/data/incoming/nmdp/Proposed_Hackathon_Dataset/DRP000941/

These are phased NGS data for six HLA loci, published by Hosomichi et al. (2013). There are 73 files, which must be decompressed from SRA format to FASTQ before processing; the SRA Toolkit provides tools for this purpose. The decompressed data are also provided in fastq/.
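
If you want to regenerate the FASTQ files yourself rather than use the copies in fastq/, the conversion might look like the sketch below. It is only an illustration, not part of the parallel_genomic scripts: it assumes the archives end in .sra, that fastq-dump from the SRA Toolkit is on your PATH, and that paired reads should be split into separate files. The function name convert_sra_dir and the FASTQ_DUMP override are made up for this example.

```shell
#!/bin/sh
# Sketch: convert every .sra file under a directory to FASTQ.
# Assumptions (not from this tutorial): inputs end in .sra, the SRA Toolkit's
# fastq-dump is installed, and reads should be split per mate (--split-files).

# convert_sra_dir IN_DIR OUT_DIR  (hypothetical helper name)
convert_sra_dir() {
    in_dir=$1
    out_dir=$2
    # FASTQ_DUMP can be overridden, e.g. FASTQ_DUMP=echo for a dry run.
    tool=${FASTQ_DUMP:-fastq-dump}

    mkdir -p "$out_dir" || return 1

    # find descends into nested run directories and runs the tool once per file.
    # Note: with "-exec ... \;" a failed conversion does not change find's exit status.
    find "$in_dir" -type f -name '*.sra' \
        -exec "$tool" --split-files --outdir "$out_dir" {} \;
}

# Example call (output directory is an arbitrary choice):
# convert_sra_dir /mnt/common/data/incoming/nmdp/Proposed_Hackathon_Dataset/DRP000941 "$HOME/fastq"
```

Running it first with FASTQ_DUMP=echo prints the commands without converting anything, which is a cheap way to check that all 73 files are picked up.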

Run the pipeline

Interpret and validate the results

DaSH
