MySQL: Try to store 4 byte character in a 3 Byte UTF8-Field #51

@andygrunwald

If you execute:

./mlstats --no-report --db-driver 'mysql' --db-hostname 'localhost' --db-user 'root' --db-password '' --db-name 'mlstats' 'http://lists.typo3.org/pipermail/typo3-ug-denmark/'

you get the following output and traceback:

Analyzing /Users/agrunwald/.mlstats/compressed/lists.typo3.org/pipermail/typo3-ug-denmark/2006-October.txt.gz
Analyzing /Users/agrunwald/.mlstats/compressed/lists.typo3.org/pipermail/typo3-ug-denmark/2006-November.txt.gz
Analyzing /Users/agrunwald/.mlstats/compressed/lists.typo3.org/pipermail/typo3-ug-denmark/2006-December.txt.gz
Analyzing /Users/agrunwald/.mlstats/compressed/lists.typo3.org/pipermail/typo3-ug-denmark/2007-January.txt.gz
Traceback (most recent call last):
  File "./mlstats", line 38, in <module>
    pymlstats.start()
  File "/Users/agrunwald/Development/MailingListStats.git/pymlstats/__init__.py", line 166, in start
    quiet, force, web_user, web_password)
  File "/Users/agrunwald/Development/MailingListStats.git/pymlstats/main.py", line 173, in __init__
    t, s, np = self.__analyze_mailing_list(mailing_list)
  File "/Users/agrunwald/Development/MailingListStats.git/pymlstats/main.py", line 225, in __analyze_mailing_list
    total, stored, non_parsed = self.__analyze_list_of_files(mailing_list, archives_to_analyze)
  File "/Users/agrunwald/Development/MailingListStats.git/pymlstats/main.py", line 393, in __analyze_list_of_files
    mailing_list.location)
  File "/Users/agrunwald/Development/MailingListStats.git/pymlstats/db/session.py", line 154, in store_messages
    self.insert_people(name, email)
  File "/Users/agrunwald/Development/MailingListStats.git/pymlstats/db/session.py", line 81, in insert_people
    self.session.commit()
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 768, in commit
    self.transaction.commit()
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 370, in commit
    self._prepare_impl()
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 350, in _prepare_impl
    self.session.flush()
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 1907, in flush
    self._flush(objects)
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 2025, in _flush
    transaction.rollback(_capture_exception=True)
  File "/Library/Python/2.7/site-packages/sqlalchemy/util/langhelpers.py", line 57, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/session.py", line 1989, in _flush
    flush_context.execute()
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 371, in execute
    rec.execute(self)
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 524, in execute
    uow
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/persistence.py", line 64, in save_obj
    mapper, table, insert)
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/persistence.py", line 568, in _emit_insert_statements
    execute(statement, multiparams)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 727, in execute
    return meth(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 824, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 954, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 1116, in _handle_dbapi_exception
    exc_info
  File "/Library/Python/2.7/site-packages/sqlalchemy/util/compat.py", line 189, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 947, in _execute_context
    context)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/default.py", line 435, in do_execute
    cursor.execute(statement, parameters)
  File "/Library/Python/2.7/site-packages/MySQLdb/cursors.py", line 205, in execute
    self.errorhandler(self, exc, value)
  File "/Library/Python/2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
sqlalchemy.exc.OperationalError: (OperationalError) (1366, "Incorrect string value: '\\xE6\\xF8\\xE5' for column 'email_address' at row 1") 'INSERT INTO people (email_address, name, username, domain_name, top_level_domain) VALUES (%s, %s, %s, %s, %s)' ('none@none.\xe6\xf8\xe5', 'Lars Bonnesen', 'none', 'none.\xe6\xf8\xe5', '\xe6\xf8\xe5')

This is because MySQL's utf8 character set stores at most 3 bytes per character, so 4-byte UTF-8 characters cannot be stored in a column that uses it.
Since MySQL 5.5, 4-byte UTF-8 encoding is supported through the utf8mb4 character set.
For detailed information, see:

Here is a related Django ticket: Use utf8mb4 encoding with MySQL 5.5.
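
To make the 3-byte limit concrete, here is a minimal Python 3 sketch that counts how many bytes each character needs in UTF-8. The helper names (utf8_width, fits_in_3_byte_utf8, is_valid_utf8) are made up for illustration and are not part of mlstats. One observation, offered as an inference from the error text only: the bytes \xE6\xF8\xE5 in the message above are not a valid UTF-8 sequence at all, and they decode to æøå when read as ISO-8859-1, so an encoding mismatch may also be involved here.

```python
def utf8_width(ch):
    """Return the number of bytes one character needs in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_3_byte_utf8(text):
    """Return True if every character needs at most 3 bytes in UTF-8,
    i.e. the text fits into a MySQL column using the 3-byte utf8 charset."""
    return all(utf8_width(ch) <= 3 for ch in text)


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # Characters of increasing UTF-8 width: ASCII, Latin, euro sign, emoji.
    for ch in ["a", "\u00e6", "\u20ac", "\U0001F600"]:
        print(repr(ch), utf8_width(ch))

    # The raw bytes from the error message, checked as UTF-8 and as ISO-8859-1.
    raw = b"\xe6\xf8\xe5"
    print(is_valid_utf8(raw), raw.decode("latin-1"))
```

Only characters outside the Basic Multilingual Plane (for example emoji) need 4 bytes; those are the ones utf8mb4 adds support for.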

Solution:
Use the utf8mb4 character set for MySQL if the MySQL server in use supports it.
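
A rough sketch of what this could look like on the client side, assuming the connection is opened through a SQLAlchemy URL (the traceback shows SQLAlchemy with MySQLdb underneath). The function names build_mysql_url and convert_table_sql are hypothetical and do not exist in mlstats; where the URL is actually assembled in the code base is not shown in this report. The charset query parameter is passed on to the MySQL driver, and existing tables such as people would additionally need to be converted on the server.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, db_name, charset="utf8mb4"):
    """Build a SQLAlchemy-style MySQL URL that requests the given
    connection character set (hypothetical helper, for illustration)."""
    credentials = quote_plus(user)
    if password:
        # Escape special characters such as '@' in the password.
        credentials += ":" + quote_plus(password)
    return "mysql://{0}@{1}/{2}?charset={3}".format(
        credentials, host, db_name, charset)


def convert_table_sql(table):
    """Return the MySQL statement that converts an existing table
    (and its text columns) to utf8mb4."""
    return "ALTER TABLE `{0}` CONVERT TO CHARACTER SET utf8mb4".format(table)


if __name__ == "__main__":
    # Values taken from the command line in this report.
    print(build_mysql_url("root", "", "localhost", "mlstats"))
    print(convert_table_sql("people"))
```

The resulting URL would be handed to sqlalchemy.create_engine. On servers older than MySQL 5.5 the utf8mb4 charset is not available, so a fallback to utf8 would be needed there.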
