3 files changed, +36 -3 lines changed
@@ -127,7 +127,14 @@ def read(text)
     private
     EMPTY = "".freeze
-    SIG_REGEX = /(--|__|\w-$)|(^(\w+\s*){1,3} #{"Sent from my".reverse}$)/n
+    SIGNATURE = '(?m)(--|__|\w-$)|(^(\w+\s*){1,3} ym morf tneS$)'
+
+    begin
+      require 're2'
+      SIG_REGEX = RE2::Regexp.new(SIGNATURE)
+    rescue LoadError
+      SIG_REGEX = Regexp.new(SIGNATURE)
+    end

     ### Line-by-Line Parsing

@@ -139,7 +146,7 @@ def read(text)
     # Returns nothing.
     def scan_line(line)
       line.chomp!("\n")
-      line.lstrip! unless line =~ SIG_REGEX
+      line.lstrip! unless SIG_REGEX.match(line)

       # We're looking for leading `>`'s to see if this line is part of a
       # quoted Fragment.
@@ -148,7 +155,7 @@ def scan_line(line)
       # Mark the current Fragment as a signature if the current line is empty
       # and the Fragment starts with a common signature indicator.
       if @fragment && line == EMPTY
-        if @fragment.lines.last =~ SIG_REGEX
+        if SIG_REGEX.match @fragment.lines.last
           @fragment.signature = true
           finish_fragment
         end
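
The `begin`/`rescue LoadError` block in this change uses RE2 when the gem is installed and falls back to Ruby's built-in engine otherwise. Below is a minimal standalone sketch of that optional-dependency pattern. The names `SIGNATURE_SOURCE`, `SIGNATURE_MATCHER`, and `signature_line?` are hypothetical and not part of the commit, and the sample inputs are assumed to be already reversed, as the `"Sent from my".reverse` in the old regex suggests the parser expects.

```ruby
# Hypothetical sketch: pick a regex engine at load time.
# The pattern string is copied from the diff; everything else is illustrative.
SIGNATURE_SOURCE = '(?m)(--|__|\w-$)|(^(\w+\s*){1,3} ym morf tneS$)'

SIGNATURE_MATCHER =
  begin
    require 're2'                      # optional gem; may not be installed
    RE2::Regexp.new(SIGNATURE_SOURCE)
  rescue LoadError
    Regexp.new(SIGNATURE_SOURCE)       # built-in engine as the fallback
  end

# Both RE2::Regexp and Regexp respond to #match, so callers do not need to
# know which engine was selected. Assumes the line is already reversed.
def signature_line?(reversed_line)
  SIGNATURE_MATCHER.match(reversed_line) ? true : false
end

signature_line?("Sent from my iPhone".reverse)  # signature-like line
signature_line?("Hello there".reverse)          # ordinary line
```

Calling `#match` instead of `=~` is what lets either engine sit behind the same constant, which is why the diff rewrites both call sites.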
@@ -156,6 +156,12 @@ def test_one_is_not_on
     assert_match /^On Oct 1, 2012/, reply.fragments[1].to_s
   end

+  def test_pathological_emails
+    t0 = Time.now
+    reply = email("pathological")
+    assert (Time.now - t0) < 1
+  end
+
   def email(name)
     body = IO.read EMAIL_FIXTURE_PATH.join("#{name}.txt").to_s
     EmailReplyParser.read body
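
The new test measures elapsed time with `Time.now`, which follows the wall clock and can shift if the system clock is adjusted mid-run. A hedged alternative is sketched below using the monotonic clock from `Process.clock_gettime`. The helper name `elapsed_seconds` is hypothetical and not part of the commit.

```ruby
# Hypothetical helper: time a block with the monotonic clock, which is
# unaffected by wall-clock adjustments.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# In the test this could read (assuming the same `email` helper):
#   assert elapsed_seconds { email("pathological") } < 1
duration = elapsed_seconds { "x" * 1_000 }
```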
1+ I think you're onto something. I will try to fix the problem as soon as I
2+ get back to a computer.
3+ On Dec 8, 2013 2:10 PM, "John Sullivan" <[email protected]> wrote:
4+
5+ > I think your code is shortening the reference sequence you return to be
6+ > the same size as the query sequence, and we end up losing data. Here's some
7+ > debugging output from me putzing around...
8+ >
9+ > name: gi|253409428|ref|GQ227366.1| Influenza A virus (A/pika/Qinghai/BI/2007(H5N1)) segment 1 polymerase PB2 (PB2) gene, complete cds
10+ > score: 39.0
11+ >
12+ > organism.sequence: ATGGAGAGAATAAAGGAATTAAGAGATCTAATGTCACAGTCCCGCACTCGCGAGATACTAACAAAGACCACTGTGGACCATATGGCCATAATCAAGAAATACACATCAGGAAGACAAGAGAAGAACCCTGCTCTCAGAATGAAATGGATGATGGCAATGAAATATCCAATCACAGCGGACAAGAGAATAATAGAGATGATTCCTGAAAGGAATGAACAAGGACAGACACTCTGGAGCAAGACAAATGATGCTGGATCGGACAGGGTGATGGTGTCTCCCCTAGCTGTAACTTGGTGGAATAGGAATGGGCCGACGACAAGTACAGTTCATTATCCAAAGGTTTACAAAACATACTTTGAGAAGGTTGAAAGGTTAAAACATGGAACCTTCGGTCCCGTTCATTTCCGAAACCAAGTTAAAATACGCCGCCGAGTTGATACAAATCCTGGCCATGCAGATCTCAGTGCTAAAGAAGCACAAGATGTCATCATGGAGGTCGTTTTCCCAAATGAAGTGGGAGCTAGAATATTGACTTCAGAGTCACAGTTGACAATAACGAAAGAGAAAAAAGAAGAGCTCCAAGATTGTAAGATTGCTCCCTTAATGGTTGCATACATGTTGGAAAGGGAACTGGTCCGCAAAACCAGATTCCTACCAGTAGCAGGCGGAACAAGCAGTGTGTACATTGAGGTATTGCATTTGACTCAAGGAACCTGCTGGGCACAGATGTACACTCCAGGCGGAGAAGTAAGAAATGACGATGTTGACCAGAGTTTGATCATTGCTGCCAGAAACATTGTTAGGAGAGCAACGGTATCAGCGGATCCACTGGCATCACTGCTGGAGATGTGTCACAGCACACAAATTGGTGGGATAAGGATGGTGGACATCCTTAGGCAAACTCCAACTGAGGAACAAGCTGTGGATATATGCAAAGCAGCAATGGGTCTGAGGATTAGTTCATCCTTTAGCTTTGGAG
13+ > GCTTCACTTTCAAAAGAACAAGTGGATCATCCGCCACGAAGGAAGAGGAAGTGCTTACAGGCAACCTCCAAACATTGAAAATAAGAGTACATGAGGGGTATGAGGAGTTCACAATGGTTGGGCAGAGGGCAACAGCTATCCTGAGGAAAGCAACTAGAAGGCTGATTCAGTTGATAGTAAGTGGAAGAAACGAACAATCAATCGCTGAGGCAATCATTGTAGCAATGGTGTTCTCACAGGAGGATCGCATGATAAAAGCAGTCCGAGGCGATCTGAATTTCGTAAACAGAGCAAACCAAAGATTAAACCCCATGCATCAACTCCTGAGACATTTTCAAAAGGACGCAAAAGTGCTATTTCAGAATTGGGGAACTGAGCCAATTGATAATGTCATGGGGATGATCGGAATATTACCTGACATGACTCCCAGCACAGAAACGTCACTGAGAGGAGTGAGAGTTAGTAAAATGGGAGTAGATGAGTATTCCAGCACTGAGAGAGTAGTTGTAAGCATTGACCGCTTCTTAAGGGTTCGAGACCAGCGGGGGAACGTACTCTTATCTCCCGAAGAGGTCAGCGAAACCCAGGGAACAGAGAAGTTGACAATAACATATTCATCATCAATGATGTGGGAAATCAACGGTCCTGAGTCAGTGCTTGTTAACACTTACCAATGGATCATTAGAAACTGGGAGACCGTGAAAATTCAGTGGTCTCAGGACCCCACGATGTTGTACAATAAGATGGAGTTTGAACCGTTCCAATCCTTGGTACCTAAAGCTGCCAGAGGTCAATACAGTGGATTTGTGAGAACATTATTCCAACAAATGCGTGACGTACTGGGGACATTTGATACTGTCCAGATAATAAAGCTGCTACCATTTGCAGCAGCCCCACCGAAGCAGAGCAGAATGCAGTTTTCTTCTCTAACTGTGAATGTGAGAGGCTCAGGAATGAGAATACTCATAAGGGGCAATTCCCCTGTGTTCAACTACAA
14+ > TAAGGCAACCCAAAGACTTACCGTTCTTGGAAAGGACGCAGGTGCATTAACAGAGGATCCAGATGAGGGGACAGCCGGAGTGGAATCTGCAGTACTGAGGGGGTTCCTAATTCTAGGCAAGGAGGACAAAAGATATGGACCAGCATTGAGCATCAATGAACTGAGCAATCTTGCAAAAGGGGAGAAAGCTAATGTGCTGATAGGGCAAGGAGACGTGGTGTTGGTAATGAAACGGAAACGGGACTCTAGCATACTTACTGACAGCCAGACAGCGACCAAAAGAATTCGGATGGCCATCAATTAGTGTCGAATTGTTTAAAAACGACCTTGTTTCTACT
15+ > reference_alignment: ________________________________________________
16+ >
17+ > query: AGCGAAAGCAGGTCAAATATATTCAATATGGAGAGAATAAAAGAATTAAG
18+ >
19+ > query_alignment: GCGAAAGCAGGTCAAATATATTCAATATGGAGAGAATAAAAGAATTAAG
20+ >