-
Notifications
You must be signed in to change notification settings - Fork 63
Description
When I aligned a pair of fastqs using BWA-MEM2, Bowtie2, or SNAP, and an enforced interval size of 0-500, I got similar results. When I aligned with BWA-MEM2 or Bowtie2 and an enforced interval size of 50-500, the results didn't change much; very few on-target alignments had intervals <50 bp. But when I aligned with SNAP and an enforced (using -fs) interval size of 50-500, many fewer reads aligned. Furthermore, here is an example of a read pair that no longer aligned with a minimum interval of 50 bp, despite the fact that it's a 120 bp properly paired interval:
MN01688:12:000H3MG5N:1:23106:10622:10676 99 chr1 215878739 70 119M = 215878739 120 AAAAATCATAGTCACCTTCTCTTACCTCAAATTAGGTCCATTTGGCTTGGATGGTGGTTGCCAAGAAATCACAACATATGATTCACTTAGTGGAATCACAGACAATGGGCCAACATTCT FFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFFFFFFFFAFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF LB:Z:lb MC:Z:120M MD:Z:119 PG:Z:SNAP RG:Z:rg PL:Z:ILLUMINA NM:i:0 SM:Z:sm MQ:i:70 UQ:i:0 QS:i:4389 PU:Z:pu ms:i:4389
MN01688:12:000H3MG5N:1:23106:10622:10676 147 chr1 215878739 70 120M = 215878739 -120 AAAAATCATAGTCACCTTCTCTTACCTCAAATTAGGTCCATTTGGCTTGGATGGTGGTTGCCAAGAAATCACAACATATGATTCACTTAGTGGAATCACAGACAATGGGCCAACATTCTG FFFFFFFFFFFFFFFFFFFF=FFFFFFFFFFAFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF LB:Z:lb MC:Z:119M MD:Z:120 PG:Z:SNAP RG:Z:rg PL:Z:ILLUMINA NM:i:0 SM:Z:sm MQ:i:70 UQ:i:0 QS:i:4388 PU:Z:pu ms:i:4388
I also recapitulated this on the NA12878 data NIST7035_TAAGGCGA; the alignment rate fell from 97% to 59%. And it only seems to happen with the -fs option.
samtools flagstat reports:
-s 0 500:
39731146 + 0 in total (QC-passed reads + QC-failed reads)
39731146 + 0 primary
0 + 0 secondary
0 + 0 supplementary
3114693 + 0 duplicates
3114693 + 0 primary duplicates
39079698 + 0 mapped (98.36% : N/A)
39079698 + 0 primary mapped (98.36% : N/A)
39731146 + 0 paired in sequencing
19865573 + 0 read1
19865573 + 0 read2
38521349 + 0 properly paired (96.96% : N/A)
38615594 + 0 with itself and mate mapped
464104 + 0 singletons (1.17% : N/A)
19728 + 0 with mate mapped to a different chr
14333 + 0 with mate mapped to a different chr (mapQ>=5)
-s 0 500 -fs (similar results):
39731146 + 0 in total (QC-passed reads + QC-failed reads)
39731146 + 0 primary
0 + 0 secondary
0 + 0 supplementary
2899310 + 0 duplicates
2899310 + 0 primary duplicates
38627431 + 0 mapped (97.22% : N/A)
38627431 + 0 primary mapped (97.22% : N/A)
39731146 + 0 paired in sequencing
19865573 + 0 read1
19865573 + 0 read2
38533349 + 0 properly paired (96.99% : N/A)
38556972 + 0 with itself and mate mapped
70459 + 0 singletons (0.18% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
-s 50 500 (similar results):
39731146 + 0 in total (QC-passed reads + QC-failed reads)
39731146 + 0 primary
0 + 0 secondary
0 + 0 supplementary
3144369 + 0 duplicates
3144369 + 0 primary duplicates
39074551 + 0 mapped (98.35% : N/A)
39074551 + 0 primary mapped (98.35% : N/A)
39731146 + 0 paired in sequencing
19865573 + 0 read1
19865573 + 0 read2
23181753 + 0 properly paired (58.35% : N/A)
38607952 + 0 with itself and mate mapped
466599 + 0 singletons (1.17% : N/A)
333828 + 0 with mate mapped to a different chr
47673 + 0 with mate mapped to a different chr (mapQ>=5)
-s 50 500 -fs (abnormally low alignment rate):
39731146 + 0 in total (QC-passed reads + QC-failed reads)
39731146 + 0 primary
0 + 0 secondary
0 + 0 supplementary
1658040 + 0 duplicates
1658040 + 0 primary duplicates
23583000 + 0 mapped (59.36% : N/A)
23583000 + 0 primary mapped (59.36% : N/A)
39731146 + 0 paired in sequencing
19865573 + 0 read1
19865573 + 0 read2
23344882 + 0 properly paired (58.76% : N/A)
23512538 + 0 with itself and mate mapped
70462 + 0 singletons (0.18% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
For comparison:
Bowtie2 --very-fast-local -I 0 --no-discordant (default):
39731146 + 0 in total (QC-passed reads + QC-failed reads)
39731146 + 0 primary
0 + 0 secondary
0 + 0 supplementary
2509954 + 0 duplicates
2509954 + 0 primary duplicates
39685464 + 0 mapped (99.89% : N/A)
39685464 + 0 primary mapped (99.89% : N/A)
39731146 + 0 paired in sequencing
19865573 + 0 read1
19865573 + 0 read2
39514079 + 0 properly paired (99.45% : N/A)
39653236 + 0 with itself and mate mapped
32228 + 0 singletons (0.08% : N/A)
13340 + 0 with mate mapped to a different chr
8318 + 0 with mate mapped to a different chr (mapQ>=5)
Bowtie2 --very-fast-local -I 50 --no-discordant:
39731146 + 0 in total (QC-passed reads + QC-failed reads)
39731146 + 0 primary
0 + 0 secondary
0 + 0 supplementary
2510277 + 0 duplicates
2510277 + 0 primary duplicates
39685267 + 0 mapped (99.88% : N/A)
39685267 + 0 primary mapped (99.88% : N/A)
39731146 + 0 paired in sequencing
19865573 + 0 read1
19865573 + 0 read2
39382600 + 0 properly paired (99.12% : N/A)
39652842 + 0 with itself and mate mapped
32425 + 0 singletons (0.08% : N/A)
19408 + 0 with mate mapped to a different chr
8593 + 0 with mate mapped to a different chr (mapQ>=5)