-
Notifications
You must be signed in to change notification settings - Fork 165
Description
Cross-posted from cutadapt github because the author hasn't responded to other posts in a while...my issue is that for my ITS sequences, I am wanting to use cutadapt to remove primers but I cannot remove them entirely from my reverse reads no matter what I do.
Here is my code and output (I do have to run cutadapt on the terminal because I have had issues with it on qiime2 on our cluster and on R, but the flags and primers would essentially be identical):
# Primers
FWD <- "CTTGGTCATTTAGAGGAAGTAA"
REV <- "GCTGCGTTCTTCATCGATGC"
> FWD.orients <- allOrients(FWD); FWD.orients
Forward Complement Reverse RevComp
"CTTGGTCATTTAGAGGAAGTAA" "GAACCAGTAAATCTCCTTCATT" "AATGAAGGAGATTTACTGGTTC" "TTACTTCCTCTAAATGACCAAG"
> REV.orients <- allOrients(REV); REV.orients
Forward Complement Reverse RevComp
"GCTGCGTTCTTCATCGATGC" "CGACGCAAGAAGTAGCTACG" "CGTAGCTACTTCTTGCGTCG" "GCATCGATGAAGAACGCAGC"
# Check primers in original uncut/unfiltered files; not filtering because I lose way too many reads
rbind(FWD.ForwardReads = sapply(FWD.orients, primerHits, fn = fnFs[[1]]),
FWD.ReverseReads = sapply(FWD.orients,primerHits, fn = fnRs[[1]]),
REV.ForwardReads = sapply(REV.orients, primerHits,fn = fnFs[[1]]),
REV.ReverseReads = sapply(REV.orients, primerHits, fn = fnRs[[1]]))
# Output
Forward Complement Reverse RevComp
FWD.ForwardReads 27211 0 0 0
FWD.ReverseReads 1720 1715 1707 18415
REV.ForwardReads 0 0 0 26660
REV.ReverseReads 26766 1826 1728 1744
FWD <- FWD.orients[["Forward"]]; FWD # keep the same as is
REV <- REV.orients[["RevComp"]]; REV # because it matches to the revcomp
> REV.orients <- allOrients(REV); REV.orients
Forward Complement Reverse RevComp
"GCTGCGTTCTTCATCGATGC" "CGACGCAAGAAGTAGCTACG" "CGTAGCTACTTCTTGCGTCG" "GCATCGATGAAGAACGCAGC"
I then run cutadapt in terminal and normally this has never caused an issue with my single-end data for ITS, but this time I am working with paired-end and have issues with removing primers for reverse reads.
for r1 in *_R1.fastq; do r2=${r1/_R1.fastq/_R2.fastq}; sample=${r1%%_R1.fastq}; /home/chauk/.local/bin/cutadapt \
-a CTTGGTCATTTAGAGGAAGTAA...GCATCGATGAAGAACGCAGC -A GCTGCGTTCTTCATCGATGC...TACTTCCTCTAAATGACCAAG \
-n 2 -o original/${sample}_R1_cut1.fastq -p original/${sample}_R2_cut1.fastq $r1 $r2; done
Then I go back into R and run my sanity check which gives me the following output:
Forward Complement Reverse RevComp
FWD.ForwardReads 0 0 0 0
FWD.ReverseReads 1720 1715 1707 1518
REV.ForwardReads 0 0 0 0
REV.ReverseReads 1834 1826 1728 1744
Now, I've tried running the -A flag with different orientations for the reverse primer, and even did multiple rounds of primer removal but it doesn't change the fact that these original primers are still showing up in every orientation only in the reverse reads (even though a good chunk were removed from the reverse reads in the RevComp orientation and in the reverse reads in the Forward orientation.
I have done multiple rounds after seeing that the sequencing company has added adaptors to the primers and did it 4 times both for the forward and reverse primers - HOWEVER, it still works completely fine for forward reads where just using the actual primer sequence I can remove it entirely. Below are the additional adaptor+primer sequences used:
Forward1 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTTGGTCATTTAGAGGAAGTAA
Forward2 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTTCTTGGTCATTTAGAGGAAGTAA
Forward3 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCTTGGTCATTTAGAGGAAGTAA
Forward4 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTCAACTTGGTCATTTAGAGGAAGTAA
Reverse1 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGCTGCGTTCTTCATCGATGC
Reverse2 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTGCTGCGTTCTTCATCGATGC
Reverse3 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACGCTGCGTTCTTCATCGATGC
Reverse4 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCATGCTGCGTTCTTCATCGATGC
I am at a loss at what to do with this. Any advice would be appreciated!