Commit 6b3bbc0

Author: Vicent Marti
Add support for re2 if it's available
1 parent 28776de commit 6b3bbc0

3 files changed: +36 -3 lines changed

lib/email_reply_parser.rb

Lines changed: 10 additions & 3 deletions
@@ -127,7 +127,14 @@ def read(text)

   private
     EMPTY = "".freeze
-    SIG_REGEX = /(--|__|\w-$)|(^(\w+\s*){1,3} #{"Sent from my".reverse}$)/n
+    SIGNATURE = '(?m)(--|__|\w-$)|(^(\w+\s*){1,3} ym morf tneS$)'
+
+    begin
+      require 're2'
+      SIG_REGEX = RE2::Regexp.new(SIGNATURE)
+    rescue LoadError
+      SIG_REGEX = Regexp.new(SIGNATURE)
+    end

     ### Line-by-Line Parsing

@@ -139,7 +146,7 @@ def read(text)
     # Returns nothing.
     def scan_line(line)
       line.chomp!("\n")
-      line.lstrip! unless line =~ SIG_REGEX
+      line.lstrip! unless SIG_REGEX.match(line)

       # We're looking for leading `>`'s to see if this line is part of a
       # quoted Fragment.
@@ -148,7 +155,7 @@ def scan_line(line)
       # Mark the current Fragment as a signature if the current line is empty
       # and the Fragment starts with a common signature indicator.
       if @fragment && line == EMPTY
-        if @fragment.lines.last =~ SIG_REGEX
+        if SIG_REGEX.match @fragment.lines.last
           @fragment.signature = true
           finish_fragment
         end
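
The fallback in this file's first hunk can be read as an "optional dependency" pattern: try to load the faster engine, and fall back to the built-in one if the gem is missing. Below is a minimal sketch of that idea in isolation. The helper name `build_signature_regex` and the constants `SIGNATURE_SOURCE` and `SIG` are hypothetical and do not appear in the commit. The sketch assumes both engines accept the same pattern string and expose a `match` method, as the diff relies on.

```ruby
# Same pattern string as the commit's SIGNATURE constant. The parser scans
# reversed lines, which is why "Sent from my" appears backwards.
SIGNATURE_SOURCE = '(?m)(--|__|\w-$)|(^(\w+\s*){1,3} ym morf tneS$)'

# Hypothetical helper (not in the commit): builds the regex with re2 when the
# gem can be loaded, otherwise with Ruby's built-in Regexp.
def build_signature_regex(source)
  require 're2'                 # raises LoadError if the gem is not installed
  RE2::Regexp.new(source)
rescue LoadError
  Regexp.new(source)            # fall back to the standard engine
end

SIG = build_signature_regex(SIGNATURE_SOURCE)

# Both engines return a truthy match object on success.
puts "signature marker detected" if SIG.match("Sent from my iPhone".reverse)
```

Wrapping the choice in a method keeps the constant assignment in one place, whereas the commit assigns `SIG_REGEX` in both branches. Either form behaves the same for callers that only use `match`.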

test/email_reply_parser_test.rb

Lines changed: 6 additions & 0 deletions
@@ -156,6 +156,12 @@ def test_one_is_not_on
     assert_match /^On Oct 1, 2012/, reply.fragments[1].to_s
   end

+  def test_pathological_emails
+    t0 = Time.now
+    reply = email("pathological")
+    assert (Time.now - t0) < 1
+  end
+
   def email(name)
     body = IO.read EMAIL_FIXTURE_PATH.join("#{name}.txt").to_s
     EmailReplyParser.read body
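
The new test times the parse with `Time.now`, which follows the wall clock and can shift if the system time is adjusted mid-run. As a hedged alternative, the sketch below measures elapsed time on the monotonic clock via `Process.clock_gettime`. The helper `elapsed_seconds`, the constant `PATHOLOGICAL_PATTERN`, and the sample input are hypothetical. The input is a short stand-in for the long sequence lines in the fixture, not the fixture itself.

```ruby
# Hypothetical helper (not in the commit): runs the block and returns the
# elapsed time in seconds, measured on the monotonic clock.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Same pattern string as the commit's SIGNATURE constant, compiled with the
# built-in engine only.
PATHOLOGICAL_PATTERN = Regexp.new('(?m)(--|__|\w-$)|(^(\w+\s*){1,3} ym morf tneS$)')

# Assumed sample input: a long unbroken run of word characters.
line = "ACGT" * 50

seconds = elapsed_seconds { PATHOLOGICAL_PATTERN.match(line) }
puts format("match attempt took %.6f s", seconds)
```

In a test, the returned value could replace the `Time.now - t0` subtraction while keeping the same one-second threshold.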

test/emails/pathological.txt

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
I think you're onto something. I will try to fix the problem as soon as I
get back to a computer.
On Dec 8, 2013 2:10 PM, "John Sullivan" <[email protected]> wrote:

> I think your code is shortening the reference sequence you return to be
> the same size as the query sequence, and we end up losing data. Here's some
> debugging output from me putzing around...
>
> name: gi|253409428|ref|GQ227366.1| Influenza A virus (A/pika/Qinghai/BI/2007(H5N1)) segment 1 polymerase PB2 (PB2) gene, complete cds
> score: 39.0
>
> organism.sequence: ATGGAGAGAATAAAGGAATTAAGAGATCTAATGTCACAGTCCCGCACTCGCGAGATACTAACAAAGACCACTGTGGACCATATGGCCATAATCAAGAAATACACATCAGGAAGACAAGAGAAGAACCCTGCTCTCAGAATGAAATGGATGATGGCAATGAAATATCCAATCACAGCGGACAAGAGAATAATAGAGATGATTCCTGAAAGGAATGAACAAGGACAGACACTCTGGAGCAAGACAAATGATGCTGGATCGGACAGGGTGATGGTGTCTCCCCTAGCTGTAACTTGGTGGAATAGGAATGGGCCGACGACAAGTACAGTTCATTATCCAAAGGTTTACAAAACATACTTTGAGAAGGTTGAAAGGTTAAAACATGGAACCTTCGGTCCCGTTCATTTCCGAAACCAAGTTAAAATACGCCGCCGAGTTGATACAAATCCTGGCCATGCAGATCTCAGTGCTAAAGAAGCACAAGATGTCATCATGGAGGTCGTTTTCCCAAATGAAGTGGGAGCTAGAATATTGACTTCAGAGTCACAGTTGACAATAACGAAAGAGAAAAAAGAAGAGCTCCAAGATTGTAAGATTGCTCCCTTAATGGTTGCATACATGTTGGAAAGGGAACTGGTCCGCAAAACCAGATTCCTACCAGTAGCAGGCGGAACAAGCAGTGTGTACATTGAGGTATTGCATTTGACTCAAGGAACCTGCTGGGCACAGATGTACACTCCAGGCGGAGAAGTAAGAAATGACGATGTTGACCAGAGTTTGATCATTGCTGCCAGAAACATTGTTAGGAGAGCAACGGTATCAGCGGATCCACTGGCATCACTGCTGGAGATGTGTCACAGCACACAAATTGGTGGGATAAGGATGGTGGACATCCTTAGGCAAACTCCAACTGAGGAACAAGCTGTGGATATATGCAAAGCAGCAATGGGTCTGAGGATTAGTTCATCCTTTAGCTTTGGAG
> GCTTCACTTTCAAAAGAACAAGTGGATCATCCGCCACGAAGGAAGAGGAAGTGCTTACAGGCAACCTCCAAACATTGAAAATAAGAGTACATGAGGGGTATGAGGAGTTCACAATGGTTGGGCAGAGGGCAACAGCTATCCTGAGGAAAGCAACTAGAAGGCTGATTCAGTTGATAGTAAGTGGAAGAAACGAACAATCAATCGCTGAGGCAATCATTGTAGCAATGGTGTTCTCACAGGAGGATCGCATGATAAAAGCAGTCCGAGGCGATCTGAATTTCGTAAACAGAGCAAACCAAAGATTAAACCCCATGCATCAACTCCTGAGACATTTTCAAAAGGACGCAAAAGTGCTATTTCAGAATTGGGGAACTGAGCCAATTGATAATGTCATGGGGATGATCGGAATATTACCTGACATGACTCCCAGCACAGAAACGTCACTGAGAGGAGTGAGAGTTAGTAAAATGGGAGTAGATGAGTATTCCAGCACTGAGAGAGTAGTTGTAAGCATTGACCGCTTCTTAAGGGTTCGAGACCAGCGGGGGAACGTACTCTTATCTCCCGAAGAGGTCAGCGAAACCCAGGGAACAGAGAAGTTGACAATAACATATTCATCATCAATGATGTGGGAAATCAACGGTCCTGAGTCAGTGCTTGTTAACACTTACCAATGGATCATTAGAAACTGGGAGACCGTGAAAATTCAGTGGTCTCAGGACCCCACGATGTTGTACAATAAGATGGAGTTTGAACCGTTCCAATCCTTGGTACCTAAAGCTGCCAGAGGTCAATACAGTGGATTTGTGAGAACATTATTCCAACAAATGCGTGACGTACTGGGGACATTTGATACTGTCCAGATAATAAAGCTGCTACCATTTGCAGCAGCCCCACCGAAGCAGAGCAGAATGCAGTTTTCTTCTCTAACTGTGAATGTGAGAGGCTCAGGAATGAGAATACTCATAAGGGGCAATTCCCCTGTGTTCAACTACAA
> TAAGGCAACCCAAAGACTTACCGTTCTTGGAAAGGACGCAGGTGCATTAACAGAGGATCCAGATGAGGGGACAGCCGGAGTGGAATCTGCAGTACTGAGGGGGTTCCTAATTCTAGGCAAGGAGGACAAAAGATATGGACCAGCATTGAGCATCAATGAACTGAGCAATCTTGCAAAAGGGGAGAAAGCTAATGTGCTGATAGGGCAAGGAGACGTGGTGTTGGTAATGAAACGGAAACGGGACTCTAGCATACTTACTGACAGCCAGACAGCGACCAAAAGAATTCGGATGGCCATCAATTAGTGTCGAATTGTTTAAAAACGACCTTGTTTCTACT
> reference_alignment: ________________________________________________
>
> query: AGCGAAAGCAGGTCAAATATATTCAATATGGAGAGAATAAAAGAATTAAG
>
> query_alignment: GCGAAAGCAGGTCAAATATATTCAATATGGAGAGAATAAAAGAATTAAG
>
