Problem Summary
- The getmail command crashes when it encounters an unknown encoding.
Environment
- django-mailbox==4.7.1 + Python 3.6.3 + macOS Mojave 10.14.6
Description
- When decoding header bytes, the utils module calls Python's built-in bytes.decode method with the encoding name (a string) taken from the header. If Python has no codec registered under that name, decode raises LookupError.
- Since nothing handles this error, the getmail command crashes and we are unable to load any subsequent emails.
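The failure mode described above can be reproduced outside django-mailbox with a minimal sketch. The header value and the charset label `x-unknown-charset` are made up for illustration (a deliberately bogus label is used instead of windows-31j, since codec availability may differ between Python versions), and `find_unknown_encodings` is a hypothetical helper, not part of the library. Note that the `'replace'` error handler only covers undecodable bytes; it does not help when the codec name itself cannot be found.

```python
import email.header


def find_unknown_encodings(header):
    """Return the charset labels in `header` that Python cannot look up."""
    unknown = []
    for bytestr, encoding in email.header.decode_header(header):
        if encoding is None:
            # Plain (non-encoded-word) parts carry no charset label.
            continue
        try:
            # Same call shape as in the traceback: decode(encoding, 'replace').
            bytestr.decode(encoding, "replace")
        except LookupError:
            # Raised because no codec is registered under this name.
            unknown.append(encoding)
    return unknown


# Hypothetical RFC 2047 encoded-word with a charset label Python does not know.
header = "=?x-unknown-charset?B?44OG44K544OI?="
print(find_unknown_encodings(header))
```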
Stacktrace
- The stack trace I encountered is shown below.
- The exception was triggered by a text file attachment with a Japanese file name, which appears to have been created in a Windows environment.
Traceback (most recent call last):
  ...
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/management/commands/getmail.py", line 24, in handle
    messages = mailbox.get_new_mail()
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 413, in get_new_mail
    msg = self.process_incoming_message(message)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 230, in process_incoming_message
    msg = self._process_message(message)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 368, in _process_message
    message = self._get_dehydrated_message(message, msg)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 258, in _get_dehydrated_message
    self._get_dehydrated_message(part, record)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/models.py", line 284, in _get_dehydrated_message
    filename = utils.convert_header_to_unicode(raw_filename)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/utils.py", line 93, in convert_header_to_unicode
    ) for bytestr, encoding in email.header.decode_header(header)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/utils.py", line 93, in <listcomp>
    ) for bytestr, encoding in email.header.decode_header(header)
  File "/Users/user/MyProject/venv/lib/python3.6/site-packages/django_mailbox/utils.py", line 86, in _decode
    return value.decode(encoding, 'replace')
LookupError: unknown encoding: windows-31j
Proposed Solution
- Simply wrap the existing decoding call in a try/except, like this:
    try:
        return value.decode(encoding, 'replace')
    except LookupError:
        logger.warning(
            'Encountered an unknown encoding. Falling back to default_charset to avoid a crash,'
            ' but the decoded text may be garbled.')
        return value.decode(default_charset, 'replace')
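For anyone who wants to try the idea in isolation, here is a self-contained sketch of the same fallback. The function name `decode_with_fallback` and the constant `DEFAULT_CHARSET` are hypothetical stand-ins, and the value `iso8859-1` is only an assumption about what default_charset resolves to in a given project.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed stand-in for the module's default_charset value.
DEFAULT_CHARSET = "iso8859-1"


def decode_with_fallback(value, encoding, default_charset=DEFAULT_CHARSET):
    """Decode `value`, falling back to `default_charset` if `encoding` is unknown."""
    try:
        return value.decode(encoding, "replace")
    except LookupError:
        # No codec is registered under `encoding`; avoid crashing the import.
        logger.warning(
            "Unknown encoding %r; falling back to %r. The text may be garbled.",
            encoding, default_charset,
        )
        return value.decode(default_charset, "replace")


# ASCII bytes survive the fallback unchanged; multi-byte text would not.
print(decode_with_fallback(b"report.txt", "x-unknown-charset"))
```

With a single-byte fallback charset, ASCII file names come through intact, while multi-byte names (such as Japanese ones) are decoded without error but will most likely be garbled.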
Since the above approach changes the current behavior, we might need a setting that controls whether this rescue code is used, like this:
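from django.conf import settings
...
settings = get_settings()
...
    except LookupError:
        # 'rescue_unknown_encoding' is a placeholder; it needs an appropriate name.
        if settings['rescue_unknown_encoding']:
            # Rescue operation: fall back to the default charset.
            return value.decode(default_charset, 'replace')
        else:
            raise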
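The opt-in variant could be sketched as follows. A plain dict stands in for whatever get_settings() returns, and the keys `rescue_unknown_encoding` and `default_charset` as well as the function name `decode_header_bytes` are assumptions made for illustration, not existing django-mailbox options.

```python
def decode_header_bytes(value, encoding, settings):
    """Decode `value`; rescue unknown encodings only if the setting allows it."""
    try:
        return value.decode(encoding, "replace")
    except LookupError:
        # Hypothetical flag: defaults to False so current behavior is preserved.
        if settings.get("rescue_unknown_encoding", False):
            return value.decode(settings["default_charset"], "replace")
        raise


# Hypothetical settings dict with the rescue enabled.
enabled = {"rescue_unknown_encoding": True, "default_charset": "iso8859-1"}
print(decode_header_bytes(b"report.txt", "x-unknown-charset", enabled))
```

Defaulting the flag to off keeps the existing crash-on-unknown-encoding behavior for users who do not opt in.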
I don't fully understand the design philosophy of this module, so if you have a better idea, I'd be glad to hear it.
Thank you. :)