@@ -367,15 +367,48 @@ $ samtools flagstat NA19914.chr22.bam
367367
368368```
369369
370+ ## Aligned files in IGV
371+
372+ - Once our bam files have been * indexed* we can view them in IGV
373+ - This is ** highly recommended**
374+ - Check-out [ our colleagues' course] ( http://mrccsc.github.io/IGV_course/ ) for more details
375+
376+ ![ igv] ( ../images/igv_screenshot1.png )
377+
378+
379+
380+
381+
382+
370383## Post-processing of aligned files
371384
372385- Marking of PCR duplicates
373386 + PCR amplification errors can cause some sequences to be over-represented
374387 + Chances of any two sequences aligning to the same position are * unlikely*
375388 + Caveat: obviously this depends on amount of the genome you are capturing
389+
390+ ``` {r echo=FALSE,cache=TRUE}
391+ suppressPackageStartupMessages(library(GenomicAlignments))
392+ mybam <- "HG00096.chr22.bam"
393+ dupReads <- readGAlignments(file=mybam,param=ScanBamParam(scanBamFlag(isDuplicate = TRUE)))
394+ nodupReads <- readGAlignments(file=mybam,param=ScanBamParam(scanBamFlag(isDuplicate = FALSE)))
395+ suppressPackageStartupMessages(library(ggbio))
396+ gr1 <- GRanges("22",IRanges(16448767,16448866))
397+ gr1 <- flank(gr1, 10,both=TRUE)
398+ dupReads <- dupReads[dupReads %over% gr1]
399+ pcrDuplicate <- start(dupReads)==16448767 & end(dupReads) == 16448866
400+ mcols(dupReads)$pcrDuplicate <- pcrDuplicate
401+ autoplot(dupReads,aes(fill=pcrDuplicate)) + scale_fill_manual(values = c("black","red"))
402+ ```
403+
404+
405+ ## Post-processing of aligned files
406+
407+ - PCR duplicates
376408 + Such reads are * marked* but not usually removed from the data
377409 + Most downstream methods will ignore such reads
378410 + Typically, [ *** picard*** ] ( http://broadinstitute.github.io/picard/ ) is used
411+
379412- Sorting
380413 + Reads can be sorted according to genomic position
381414 + [ *** samtools*** ] ( http://www.htslib.org/ )
@@ -385,13 +418,6 @@ $ samtools flagstat NA19914.chr22.bam
385418
386419
387420
388- ## Aligned files in IGV
389-
390- - Once our bam files have been * indexed* we can view them in IGV
391- - This is ** highly recommended**
392-
393- ![ igv] ( ../images/igv_screenshot1.png )
394-
395421
396422## Other misc. format
397423
0 commit comments