getmail command crashes when it encounters an unknown header encoding. #210

@Asayu123

Problem Summary

  • The getmail command crashes when it encounters an unknown encoding.

Environment

  • django-mailbox==4.7.1 + Python 3.6.3 + macOS Mojave 10.14.6

Description

  • When decoding header bytes, the utils module calls Python's built-in decode method with the encoding name (a string) taken from the header. If Python has no codec registered under that name, decode raises LookupError.
  • Since nothing handles this error, the getmail command crashes and subsequent emails cannot be loaded.
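
To illustrate the point above: the 'replace' error handler only covers bytes that cannot be decoded, not codec names that Python cannot find. A minimal sketch, using a deliberately made-up encoding name ('x-no-such-encoding', an assumption for illustration) so that it behaves the same on any Python version; the helper names are hypothetical too.

```python
import codecs


def is_known_encoding(name):
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def describe_decode(value, encoding):
    """Decode bytes, reporting a LookupError instead of propagating it."""
    try:
        return value.decode(encoding, 'replace')
    except LookupError as exc:
        # 'replace' handles undecodable bytes, not unknown codec names.
        return 'LookupError: {}'.format(exc)


print(is_known_encoding('utf-8'))
print(describe_decode(b'abc', 'x-no-such-encoding'))
```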

Stacktrace

  • The following is the stack trace I got.
  • The exception was triggered by a text attachment with a Japanese file name, which appears to have been created in a Windows environment.
Traceback (most recent call last):
...
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/management/commands/getmail.py", line 24, in handle
    messages = mailbox.get_new_mail()
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 413, in get_new_mail
    msg = self.process_incoming_message(message)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 230, in process_incoming_message
    msg = self._process_message(message)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 368, in _process_message
    message = self._get_dehydrated_message(message, msg)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 258, in _get_dehydrated_message
    self._get_dehydrated_message(part, record)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 284, in _get_dehydrated_message
    filename = utils.convert_header_to_unicode(raw_filename)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/utils.py", line 93, in convert_header_to_unicode
    ) for bytestr, encoding in email.header.decode_header(header)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/utils.py", line 93, in <listcomp>
    ) for bytestr, encoding in email.header.decode_header(header)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/utils.py", line 86, in _decode
    return value.decode(encoding, 'replace')
LookupError: unknown encoding: windows-31j
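
The last frames can be approximated without django-mailbox. In this sketch the sample file name and the header built from it are made up for illustration. It shows that email.header.decode_header hands the charset label back as a plain string without checking that Python has a codec for it, which is why the failure only surfaces later, in value.decode.

```python
import base64
import email.header

# Hypothetical file name (katakana for "test"), encoded as cp932 bytes
# but labeled windows-31j, as in the traceback above.
raw = '\u30c6\u30b9\u30c8'.encode('cp932')
header = '=?windows-31j?B?{}?='.format(base64.b64encode(raw).decode('ascii'))

# decode_header returns (bytes, charset) pairs; the charset is only a label.
parts = email.header.decode_header(header)
bytestr, encoding = parts[0]
print(encoding)
```

The same bytes decode cleanly when the codec is named cp932, so the problem here is the label rather than the data.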

The solution I have in mind

  • Simply wrap the existing decoding call in try / except, like this:
        try:
            return value.decode(encoding, 'replace')
        except LookupError:
            logger.warning(
                'Encountered an unknown encoding. Falling back to default_charset to avoid a crash,'
                ' but the decoded text may be garbled.')
            return value.decode(default_charset, 'replace')
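
A self-contained sketch of this fallback, runnable outside django-mailbox. The function name decode_with_fallback and the default of 'iso8859-1' for default_charset are assumptions made for this example, not the library's actual code.

```python
import logging

logger = logging.getLogger(__name__)


def decode_with_fallback(value, encoding, default_charset='iso8859-1'):
    """Decode bytes, falling back to default_charset for unknown encodings."""
    try:
        return value.decode(encoding, 'replace')
    except LookupError:
        logger.warning(
            'Unknown encoding %r; decoding with %r instead, '
            'so the result may be garbled.', encoding, default_charset)
        return value.decode(default_charset, 'replace')
```

With this in place an unknown label yields possibly garbled text instead of an exception, so processing can continue with the next message.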

Since the above approach changes the current behavior, we might need a setting to control whether this rescue code is used, like this:

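from django.conf import settings
...
mailbox_settings = get_settings()  # renamed so it does not shadow django.conf.settings
...
except LookupError:
    # 'rescue_unknown_encoding' is a placeholder; replace it with an appropriate setting name.
    if mailbox_settings['rescue_unknown_encoding']:
        # Rescue operation: fall back to default_charset as in the snippet above.
        return value.decode(default_charset, 'replace')
    else:
        raise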
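
One way the switch could look, sketched without Django so it runs on its own. The key 'rescue_unknown_encoding', the plain dicts standing in for the result of get_settings(), and the function name are all hypothetical.

```python
def decode_header_bytes(value, encoding, mailbox_settings):
    """Decode bytes; rescue unknown encodings only if the setting allows it."""
    try:
        return value.decode(encoding, 'replace')
    except LookupError:
        if mailbox_settings.get('rescue_unknown_encoding', False):
            return value.decode(mailbox_settings['default_charset'], 'replace')
        # Setting disabled: keep the current behavior and propagate the error.
        raise


# Two example configurations: one keeps today's behavior, one opts in.
strict = {'rescue_unknown_encoding': False, 'default_charset': 'iso8859-1'}
lenient = {'rescue_unknown_encoding': True, 'default_charset': 'iso8859-1'}
```

Defaulting the flag to off would leave existing installations unchanged, and only users who opt in would get the fallback.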

I don't fully understand the design philosophy of this module, so if you have a better idea, I'd be glad to hear it.

Thank you. : )
