Skip to content

cannot remove primers on reverse reads using cutadapt #2172

@kdbchau

Description

@kdbchau

Cross-posted from cutadapt github because the author hasn't responded to other posts in a while...my issue is that for my ITS sequences, I am wanting to use cutadapt to remove primers but I cannot remove them entirely from my reverse reads no matter what I do.

Here is my code and output (I do have to run cutadapt on the terminal because I have had issues with it on qiime2 on our cluster and on R, but the flags and primers would essentially be identical):

# Primers
FWD <- "CTTGGTCATTTAGAGGAAGTAA"
REV <- "GCTGCGTTCTTCATCGATGC"

> FWD.orients <- allOrients(FWD); FWD.orients
                 Forward               Complement                  Reverse                  RevComp 
"CTTGGTCATTTAGAGGAAGTAA" "GAACCAGTAAATCTCCTTCATT" "AATGAAGGAGATTTACTGGTTC" "TTACTTCCTCTAAATGACCAAG" 

> REV.orients <- allOrients(REV); REV.orients
               Forward             Complement                Reverse                RevComp 
"GCTGCGTTCTTCATCGATGC" "CGACGCAAGAAGTAGCTACG" "CGTAGCTACTTCTTGCGTCG" "GCATCGATGAAGAACGCAGC"

# Check primers in original uncut/unfiltered files; not filtering because I lose way too many reads
rbind(FWD.ForwardReads = sapply(FWD.orients, primerHits, fn = fnFs[[1]]),
      FWD.ReverseReads = sapply(FWD.orients,primerHits, fn = fnRs[[1]]),
      REV.ForwardReads = sapply(REV.orients, primerHits,fn = fnFs[[1]]),
      REV.ReverseReads = sapply(REV.orients, primerHits, fn = fnRs[[1]]))

# Output
                 Forward Complement Reverse RevComp
FWD.ForwardReads   27211          0       0       0
FWD.ReverseReads    1720       1715    1707   18415
REV.ForwardReads       0          0       0   26660
REV.ReverseReads   26766       1826    1728    1744

FWD <- FWD.orients[["Forward"]]; FWD  # keep the same as is
REV <- REV.orients[["RevComp"]]; REV  # because it matches to the revcomp

> REV.orients <- allOrients(REV); REV.orients
               Forward             Complement                Reverse                RevComp 
"GCTGCGTTCTTCATCGATGC" "CGACGCAAGAAGTAGCTACG" "CGTAGCTACTTCTTGCGTCG" "GCATCGATGAAGAACGCAGC"

I then run cutadapt in terminal and normally this has never caused an issue with my single-end data for ITS, but this time I am working with paired-end and have issues with removing primers for reverse reads.

for r1 in *_R1.fastq; do r2=${r1/_R1.fastq/_R2.fastq}; sample=${r1%%_R1.fastq}; /home/chauk/.local/bin/cutadapt \
-a CTTGGTCATTTAGAGGAAGTAA...GCATCGATGAAGAACGCAGC -A GCTGCGTTCTTCATCGATGC...TACTTCCTCTAAATGACCAAG \
-n 2 -o original/${sample}_R1_cut1.fastq -p original/${sample}_R2_cut1.fastq $r1 $r2; done

Then I go back into R and run my sanity check which gives me the following output:

                 Forward Complement Reverse RevComp
FWD.ForwardReads       0          0       0       0
FWD.ReverseReads    1720       1715    1707    1518
REV.ForwardReads       0          0       0       0
REV.ReverseReads    1834       1826    1728    1744

Now, I've tried running the -A flag with different orientations for the reverse primer, and even did multiple rounds of primer removal but it doesn't change the fact that these original primers are still showing up in every orientation only in the reverse reads (even though a good chunk were removed from the reverse reads in the RevComp orientation and in the reverse reads in the Forward orientation.

I have done multiple rounds after seeing that the sequencing company has added adaptors to the primers and did it 4 times both for the forward and reverse primers - HOWEVER, it still works completely fine for forward reads where just using the actual primer sequence I can remove it entirely. Below are the additional adaptor+primer sequences used:

Forward1 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTTGGTCATTTAGAGGAAGTAA
Forward2 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTTCTTGGTCATTTAGAGGAAGTAA
Forward3 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCTTGGTCATTTAGAGGAAGTAA
Forward4 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTCAACTTGGTCATTTAGAGGAAGTAA

Reverse1 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGCTGCGTTCTTCATCGATGC
Reverse2 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTGCTGCGTTCTTCATCGATGC
Reverse3 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACGCTGCGTTCTTCATCGATGC
Reverse4 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCATGCTGCGTTCTTCATCGATGC

I am at a loss at what to do with this. Any advice would be appreciated!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions