
amatsuda
Member

Here's another email importer that imports emails from a raw text dump instead of the converted "S3 data".

The biggest addition in this version over the previous one is that it retrieves In-Reply-To message IDs (which are dropped in the "S3 data"), so we can construct email threads.
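
To illustrate the threading idea, here is a minimal sketch of reading `Message-ID` and `In-Reply-To` from a raw message and grouping replies under their parent. It is not the importer's actual code: the helper names (`header_value`, `message_id`, `in_reply_to`, `build_thread_index`) are hypothetical, and it uses plain string handling rather than whatever mail-parsing library the importer relies on.

```ruby
# Hypothetical helper: return the value of one header from a raw RFC 5322
# message, or nil if the header is absent. Only the header section (the part
# before the first blank line) is inspected.
def header_value(raw, name)
  headers = raw.split(/\r?\n\r?\n/, 2).first.to_s
  # Unfold continuation lines (a line break followed by whitespace).
  unfolded = headers.gsub(/\r?\n[ \t]+/, ' ')
  line = unfolded.each_line.find { |l| l =~ /\A#{Regexp.escape(name)}:/i }
  line && line.sub(/\A[^:]+:/, '').strip
end

# Hypothetical helper: the message's own id, including angle brackets.
def message_id(raw)
  header_value(raw, 'Message-ID')&.[](/<[^>]+>/)
end

# Hypothetical helper: the id of the message this one replies to, if any.
def in_reply_to(raw)
  header_value(raw, 'In-Reply-To')&.[](/<[^>]+>/)
end

# Hypothetical helper: map each parent message id to the ids of its replies.
def build_thread_index(raw_messages)
  raw_messages.each_with_object(Hash.new { |h, k| h[k] = [] }) do |raw, index|
    parent = in_reply_to(raw)
    index[parent] << message_id(raw) if parent
  end
end
```

Messages without an `In-Reply-To` header are treated as thread roots and simply do not appear as children in the index.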

and let's convert all of them to UTF-8
because so far we're only displaying them as HTML, not machine-processing them
e.g. ruby-dev 1553, 2320, 2321, 2322, 4361, etc.
for some ruby-dev mails, e.g. 2320
this still warns "Encoding conversion failed code converter not found (ISO-2022-JP-2 to UTF-8)"
when fetching `from` from mail, but it seems like it's working anyway
…odings: UTF-8 and BINARY (ASCII-8BIT)"
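
For reference, a minimal sketch of one way to normalize message text to UTF-8 while tolerating charsets Ruby cannot transcode, such as the ISO-2022-JP-2 case in the warning above. This is an assumption-laden illustration, not what the importer does: `to_utf8` and `CHARSET_FALLBACKS` are hypothetical names, and mapping ISO-2022-JP-2 to ISO-2022-JP only helps for messages that stay within the ISO-2022-JP subset.

```ruby
# Hypothetical fallback table: declared charsets that Ruby has no converter
# for, mapped to a related charset it can transcode (an assumption; bytes
# outside the fallback charset are replaced with "?").
CHARSET_FALLBACKS = { 'ISO-2022-JP-2' => 'ISO-2022-JP' }.freeze

# Hypothetical helper: return a valid UTF-8 copy of `bytes`, interpreting
# them according to the declared `charset` (which may be nil).
def to_utf8(bytes, charset)
  name = charset.to_s.upcase
  name = CHARSET_FALLBACKS.fetch(name, name)

  # No declared charset, or already UTF-8: just replace invalid bytes.
  if name.empty? || name == 'UTF-8'
    return bytes.dup.force_encoding(Encoding::UTF_8).scrub('?')
  end

  bytes.encode(Encoding::UTF_8, name, invalid: :replace, undef: :replace, replace: '?')
rescue Encoding::ConverterNotFoundError, ArgumentError
  # Unknown or unsupported charset: treat the bytes as UTF-8 and scrub.
  bytes.dup.force_encoding(Encoding::UTF_8).scrub('?')
end
```

Replacing undecodable bytes is lossy, which seems acceptable only because the text is displayed as HTML rather than processed further.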

on ruby-list: 37565, 38116, 43106
on ruby-list: 41850, 43710
e.g. ruby-dev: 93-108
@amatsuda amatsuda merged commit 8a29b98 into ruby:main Oct 21, 2025
2 checks passed
@amatsuda amatsuda deleted the import_from_raw_mail branch October 21, 2025 08:12