All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning.
- Adding German edge cases where "Von, Gesendet, An" appears in a different order.
- Adding `ExtendedEmailReplyParser.extract_text_or_html`, which falls back to the html part if the text part is missing. `ExtendedEmailReplyParser.parse` uses `extract_text_or_html`. This is good, because html-only mails do not just return `nil`. But be aware that `ExtendedEmailReplyParser.parse` may include html tags this way. If this causes you trouble, use `ExtendedEmailReplyParser.parse(ExtendedEmailReplyParser.extract_text(message_or_path))` instead, which only uses the text part. We might correct the behavior in the future in order to strip the html tags during the parsing.
- `Mail::Message#extract_html` extracts the html part as a counterpart for `Mail::Message#extract_text`. This is useful when an email has no text part.
- `Parsers::HtmlMails#parse` removes quotes indicated by `<div name="quote"></div>`.
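
The text-or-html fallback described above can be pictured with a small stand-alone sketch. `SketchMessage` and `sketch_extract_text_or_html` are hypothetical names invented for illustration; they are not the gem's classes or its implementation, and the sketch assumes a missing text part shows up as `nil`.

```ruby
# Hypothetical stand-in for a parsed email; NOT the gem's Mail::Message.
SketchMessage = Struct.new(:text_part, :html_part)

# Assumed behaviour: prefer the text part, fall back to the html part
# when the text part is missing (nil).
def sketch_extract_text_or_html(message)
  message.text_part || message.html_part
end

html_only = SketchMessage.new(nil, "<p>Hello</p>")
both      = SketchMessage.new("Hello", "<p>Hello</p>")

# The html-only mail yields markup, which is why tags can leak into the result.
puts sketch_extract_text_or_html(html_only)
puts sketch_extract_text_or_html(both)
```

The first call returns the raw html, which illustrates why the parsed result may contain tags when no text part exists.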
- Fixing issue "undefined `body_in_utf8` for `nil`".
- `ExtendedEmailReplyParser#extract_text(message)` as an alternative to `message.extract_text` in order to indicate where the method comes from. When using `message.extract_text`, one is led to look for the definition in `Mail::Message` directly.
- `Parsers::Base#hide_everything_after(expressions)` is useful when email clients do not quote the previous conversation. This parser method hides everything introduced by a series of expressions, e.g. `hide_everything_after %w(From: Sent: To:)`.
- `Parsers::Base#except_in_visible_block_quotes`. Within this block, `hide_everything_after` is not applied. This is useful when a quote is already marked as to be shown.
- German parser `Parsers::I18nDe`, which removes previous conversation by searching for the phrases "Gesendet: Von: An:" and "Am ... schrieb ...:".
- Support for i18n-ed header lines. The github parser only knows "On ... wrote". Since this is needed when the github parser runs, specify additional regexes in the class header of the parsers using `add_quote_header_regex`, for example: `add_quote_header_regex '^Am .* schrieb.*$'`.
- The German parser adds the regex for quote headers like "Am ... schrieb ...:".
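
As a rough illustration of registering extra quote header regexes in a class header, here is a minimal sketch. `SketchHeaderBase`, `SketchGermanParser`, `quote_header?` and the seeded English regex are assumptions made for this example; only the `add_quote_header_regex '^Am .* schrieb.*$'` call mirrors the entry above, and the gem's real implementation may differ.

```ruby
# Sketch only: hypothetical classes, not the gem's implementation.
class SketchHeaderBase
  # Shared registry, seeded with an assumed English "On ... wrote:" pattern.
  QUOTE_HEADER_REGEXES = [/^On .* wrote:$/]

  # Class-level macro: subclasses call this in their class body.
  def self.add_quote_header_regex(source)
    QUOTE_HEADER_REGEXES << Regexp.new(source)
  end

  # True if the line matches any registered quote header regex.
  def self.quote_header?(line)
    QUOTE_HEADER_REGEXES.any? { |regex| regex.match?(line) }
  end
end

class SketchGermanParser < SketchHeaderBase
  # Registered once, when the class body is evaluated.
  add_quote_header_regex '^Am .* schrieb.*$'
end

puts SketchHeaderBase.quote_header?("Am 12.03.2016 um 10:15 schrieb Max Mustermann:")
puts SketchHeaderBase.quote_header?("Best regards")
```

Keeping the registry on the base class means a regex added by one parser is visible to every other parser, which matches the stated need that the github parser sees the additional headers when it runs.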
- Remove empty lines between quote lines:

      > Hi,
      > how are you doing?
      > Cheers

  rather than

      > Hi,

      > how are you doing?

      > Cheers

- English parser `Parsers::I18nEn`, which removes previous conversation by searching for the phrases "From: Sent: To".
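
The idea of cutting off an unquoted previous conversation at a series of expressions such as "From: Sent: To:" can be sketched in plain Ruby. `sketch_hide_everything_after` is a hypothetical helper written for this illustration; it truncates the text, whereas the gem's `hide_everything_after` hides content within its own parsing model, so treat this as an approximation of the behaviour rather than the actual code.

```ruby
# Sketch only: cut a body at the first place where all expressions occur in order.
def sketch_hide_everything_after(text, expressions)
  # Escape each expression and allow any text, including newlines, between them.
  source  = expressions.map { |expression| Regexp.escape(expression) }.join(".*?")
  pattern = Regexp.new(source, Regexp::MULTILINE)

  index = text.index(pattern)
  index ? text[0...index].rstrip : text
end

body = <<~MAIL
  Thanks, that works for me.

  From: Alice
  Sent: Monday
  To: Bob
  Subject: Re: Meeting

  Can we meet on Tuesday?
MAIL

puts sketch_hide_everything_after(body, %w(From: Sent: To:))
```

Text without the full series of expressions is returned unchanged, so a reply that merely mentions "From:" once is not truncated.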
- `ExtendedEmailReplyParser.read "/path/to/email.eml"` returns the corresponding `Mail::Message` object.
- `ExtendedEmailReplyParser.read("/path/to/email.eml").extract_text` returns the email body text in utf-8. See also: http://stackoverflow.com/a/15818886/2066546
- Methods for parsing emails: `ExtendedEmailReplyParser.parse("/path/to/email.eml")`, `ExtendedEmailReplyParser.parse(message)`, `ExtendedEmailReplyParser.parse(body_text)`, `Mail::Message#parse`.
- Allowing custom parsers to be chained. A parser inherits from `ExtendedEmailReplyParser::Parsers::Base` and needs to implement a `parse` method. The parser is automatically chained along with the others when calling `ExtendedEmailReplyParser.parse` or `Mail::Message#parse`.
- Wrapping the original github/email_reply_parser in `ExtendedEmailReplyParser::Parsers::Github`.
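
One common way to get "inherit from a base class, implement `parse`, and be chained automatically" in Ruby is an `inherited` hook. The sketch below shows that pattern with hypothetical names (`SketchChainBase`, `parse_all`, and the two example parsers); whether the gem uses this exact mechanism is an assumption.

```ruby
# Sketch only: hypothetical classes illustrating automatic parser chaining.
class SketchChainBase
  # Registry of parser classes in the order they were defined.
  def self.parsers
    @parsers ||= []
  end

  # Ruby calls this hook whenever a subclass is defined.
  def self.inherited(subclass)
    super
    SketchChainBase.parsers << subclass
  end

  # Feed the text through every registered parser, one after another.
  def self.parse_all(text)
    SketchChainBase.parsers.inject(text) do |current, parser|
      parser.new(current).parse
    end
  end

  attr_reader :text

  def initialize(text)
    @text = text
  end
end

# Drops a conventional "-- " signature block.
class SketchSignatureParser < SketchChainBase
  def parse
    text.sub(/^-- \n.*\z/m, "").rstrip
  end
end

# Trims surrounding whitespace.
class SketchWhitespaceParser < SketchChainBase
  def parse
    text.strip
  end
end

puts SketchChainBase.parse_all("  Hello Bob\n-- \nAlice\n")
```

With this design, adding a parser only requires defining the subclass; no central list has to be edited.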