e11bio/pool-analysis

pool-analysis

Code for processing and analyzing the Nanopore data of viral pools. Currently, this data is generated by Plasmidsaurus' AAV sequencing service and their Premium PCR sequencing service.

Context

This code analyzes library balance in Nanopore sequencing runs of either AAV genomes or the plasmids used to create them, as part of a standard QC pipeline. It is used at steps 2 and 3 of this general pipeline.

[Figure: pipeline diagram]

Input and output files

We take as input:

  1. A FASTQ file with a set of Nanopore reads corresponding to a library of variants. The reads are derived from either a pool of extracted viral genomes (where the capsid is digested and the Nanopore adapters are ligated to the AAV ITRs) or a pool of linearized plasmids (typically the input plasmids used for AAV synthesis).
  2. A set of FASTA files that describe all of the expected variant sequences to align against. We align to known references because that is much easier than calling each read de novo.

We generate:

  1. A list of reads that map to each reference sequence, and the corresponding counts of aligned reads.
  2. A list of variants that contain a mutation relative to the expected input sequence (by default, only variants observed at least twice are reported).
  3. A description of, e.g., the number of truncated products that are not full-length AAV products (work in progress).
  4. Eventually, a description of how many genomes are in the correct orientation (specifically in the case of recombination cassettes).
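As a rough illustration of outputs 1 and 2, the sketch below counts primary alignments per reference from SAM text and keeps only variants seen at least twice. It is not the repository's implementation: the function names, the toy read and reference names, and the choice to skip unmapped, secondary, and supplementary alignments are all assumptions made for this example.

```python
from collections import Counter

# SAM flag bits, as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_reads_per_reference(sam_lines):
    """Count primary alignments per reference name from SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, ref_name = int(fields[1]), fields[2]
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary alignment
        counts[ref_name] += 1
    return counts


def library_balance(counts):
    """Convert raw counts to the fraction of aligned reads per reference."""
    total = sum(counts.values())
    return {ref: n / total for ref, n in counts.items()} if total else {}


def recurrent_variants(observed, min_count=2):
    """Keep variants observed at least `min_count` times."""
    tally = Counter(observed)
    return {variant: n for variant, n in tally.items() if n >= min_count}


# Toy SAM records (hypothetical read and reference names).
example = [
    "@SQ\tSN:variant_A\tLN:4700",
    "read1\t0\tvariant_A\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read2\t16\tvariant_A\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read3\t0\tvariant_B\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
    "read1\t2048\tvariant_B\t5\t60\t4M6H\t*\t0\t0\tACGT\t*",
]
counts = count_reads_per_reference(example)
print(dict(counts))
print(library_balance(counts))
```

Here the unmapped read and the supplementary alignment are ignored, so each read contributes to at most one reference. A real pipeline would more likely read BAM files through samtools or a library rather than parse SAM text by hand.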

Installation

These notes may be incomplete.

Required python packages

Make a virtual environment with conda and use pip for the installations.

conda create -n pool_analysis
conda activate pool_analysis
conda install pip
pip install -r requirements.txt

Required system tools

On macOS, assuming you have Homebrew installed:

brew install minimap2 samtools bcftools seqkit

On Ubuntu (or WSL):

sudo apt install samtools bcftools seqkit

On Ubuntu, install minimap2 separately by following the minimap2 installation instructions.
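After installing, a quick way to confirm that the four command-line tools named above are reachable is to look them up on the PATH. The helper below is a minimal sketch for that purpose; `missing_tools` and `REQUIRED_TOOLS` are names invented here, not part of this repository.

```python
import shutil

# Tools named in the installation notes above.
REQUIRED_TOOLS = ["minimap2", "samtools", "bcftools", "seqkit"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

This only checks that an executable with the expected name exists; it does not verify versions or that the tool runs correctly.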

Test minimap2 install

You can test that minimap2 runs correctly using the test files in this directory. The command should complete without errors and write SAM output to test.sam.

minimap2 -a test/MT-human.fa test/MT-orang.fa > test.sam
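To inspect the resulting test.sam without opening it by hand, one option is to count its header lines and alignment records. The function below is an illustrative sketch (the name `summarize_sam` is made up for this example) and relies only on the SAM convention that header lines begin with `@`.

```python
import os


def summarize_sam(path):
    """Return (header_lines, alignment_records) for a SAM text file."""
    headers = records = 0
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines
            if line.startswith("@"):
                headers += 1  # header lines such as @SQ or @PG
            else:
                records += 1  # one alignment record per line
    return headers, records


if __name__ == "__main__" and os.path.exists("test.sam"):
    n_headers, n_records = summarize_sam("test.sam")
    print(f"{n_headers} header lines, {n_records} alignment records")
```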

IGV (for visualization of results)

Install the IGV desktop application, following the instructions for your system.
