Skip to content

Commit fff5d1d

Browse files
committed
add PCR plot
1 parent a8a7321 commit fff5d1d

File tree

2 files changed

+695
-7
lines changed

2 files changed

+695
-7
lines changed

Day1/ngs-intro.Rmd

Lines changed: 33 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -367,15 +367,48 @@ $ samtools flagstat NA19914.chr22.bam
367367
368368
```
369369

370+
## Aligned files in IGV
371+
372+
- Once our bam files have been *indexed* we can view them in IGV
373+
- This is **highly recommended**
374+
- Check-out [our colleagues' course](http://mrccsc.github.io/IGV_course/) for more details
375+
376+
![igv](../images/igv_screenshot1.png)
377+
378+
379+
380+
381+
382+
370383
## Post-processing of aligned files
371384

372385
- Marking of PCR duplicates
373386
+ PCR amplification errors can cause some sequences to be over-represented
374387
+ Chances of any two sequences aligning to the same position are *unlikely*
375388
+ Caveat: obviously this depends on amount of the genome you are capturing
389+
390+
```{r echo=FALSE,cache=TRUE}
391+
suppressPackageStartupMessages(library(GenomicAlignments))
392+
mybam <- "HG00096.chr22.bam"
393+
dupReads <- readGAlignments(file=mybam,param=ScanBamParam(scanBamFlag(isDuplicate = TRUE)))
394+
nodupReads <- readGAlignments(file=mybam,param=ScanBamParam(scanBamFlag(isDuplicate = FALSE)))
395+
suppressPackageStartupMessages(library(ggbio))
396+
gr1 <- GRanges("22",IRanges(16448767,16448866))
397+
gr1 <- flank(gr1, 10,both=TRUE)
398+
dupReads <- dupReads[dupReads %over% gr1]
399+
pcrDuplicate <- start(dupReads)==16448767 & end(dupReads) == 16448866
400+
mcols(dupReads)$pcrDuplicate <- pcrDuplicate
401+
autoplot(dupReads,aes(fill=pcrDuplicate)) + scale_fill_manual(values = c("black","red"))
402+
```
403+
404+
405+
## Post-processing of aligned files
406+
407+
- PCR duplicates
376408
+ Such reads are *marked* but not usually removed from the data
377409
+ Most downstream methods will ignore such reads
378410
+ Typically, [***picard***](http://broadinstitute.github.io/picard/) is used
411+
379412
- Sorting
380413
+ Reads can be sorted according to genomic position
381414
+ [***samtools***](http://www.htslib.org/)
@@ -385,13 +418,6 @@ $ samtools flagstat NA19914.chr22.bam
385418

386419

387420

388-
## Aligned files in IGV
389-
390-
- Once our bam files have been *indexed* we can view them in IGV
391-
- This is **highly recommended**
392-
393-
![igv](../images/igv_screenshot1.png)
394-
395421

396422
## Other misc. format
397423

Day1/ngs-intro.html

Lines changed: 662 additions & 0 deletions
Large diffs are not rendered by default.

0 commit comments

Comments
 (0)