Small Example Dataset

This dataset was derived from real methyl-seq data (ENA PRJNA730913).

Processing this dataset takes about 10 minutes.


Key contents


Running the pipeline

Please first edit config.yml to specify the location of your indexed hg19 genome ('paths:genome').
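
As a quick sanity check before running the pipeline, the sketch below prints the genome location stored in a config file. It assumes that 'paths:genome' means a `genome` key nested under a top-level `paths` key, with an unquoted value on the same line; the actual layout of config.yml may differ. The function name `genome_path` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: print the value of "genome" nested under "paths".
# Assumes a simple YAML layout such as:
#   paths:
#     genome: /some/dir/hg19
genome_path() {
  awk '
    # An unindented, non-comment line starts a new top-level section.
    /^[^[:space:]#]/ { in_paths = ($0 ~ /^paths:/) }
    # Inside "paths", take the first indented "genome:" entry.
    in_paths && /^[[:space:]]+genome:/ {
      sub(/^[[:space:]]+genome:[[:space:]]*/, "")   # drop the key
      sub(/[[:space:]]+#.*$/, "")                   # drop a trailing comment
      print
      exit
    }
  ' "$1"
}
```

For example, `genome_path config.yml` prints the configured location, and empty output suggests that the entry has not been set yet.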

The entire pipeline can then be run on the dataset with the following command:

bash run-pipeline.sh

Recreate the dataset

The dataset has already been created, so there is no need to recreate it. All outputs are contained in the data/ folder.

bash scripts/create-dataset.sh PATH/TO/ORIGINAL/DATA PATH/TO/GENOME/INDEX
  • PATH/TO/ORIGINAL/DATA: The location where the original dataset should be (or has been) downloaded and aligned.
  • PATH/TO/GENOME/INDEX: The location where the human genome (hg19) should be (or has been) downloaded and indexed.