Unicode Error #7

@hangerjj

Has anyone seen this error before? The file contains hundreds of thousands of games, and I'm getting a Unicode error when running pgn2data. What I've tried so far is listed below. Before I manually look at the PGN file with Scid, are there any other ideas about what could be causing this?

iconv: a Linux tool to change the encoding, but it fails.

pgn-extract: a command-line tool to clean PGN files. I still get the Unicode error afterwards.

I thought about writing a Python script to change the encoding, but the solutions I found all relied on pandas' read_csv, so I assumed they would not apply here.

OS is Debian 11 Bullseye

Example of the error. The reported position number has varied.

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x94 in position 5478: invalid start byte
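One way to narrow this down is to list exactly which bytes in the file are not valid UTF-8. The sketch below is not part of pgn2data; `find_invalid_utf8` is a hypothetical helper name. For context, byte 0x94 is the right double quotation mark in Windows-1252, so a hit on 0x93/0x94 would suggest (but not prove) that some games were saved in that encoding.

```python
def find_invalid_utf8(data: bytes, limit: int = 20):
    """Return up to `limit` (offset, bad_bytes) pairs that are not valid UTF-8.

    Hypothetical diagnostic helper, not part of pgn2data.
    """
    problems = []
    pos = 0
    while pos < len(data) and len(problems) < limit:
        try:
            # Decode the remainder; success means no further invalid bytes.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as exc:
            # exc.start / exc.end are relative to the slice that was decoded.
            start = pos + exc.start
            end = pos + exc.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning just after the bad sequence
    return problems


# Possible usage (reads the whole file into memory):
#   from pathlib import Path
#   for offset, bad in find_invalid_utf8(Path("multiplegames.pgn").read_bytes()):
#       print(offset, bad)
```

The offsets are byte positions from the start of the file, so they can be compared with the position in the traceback only if pgn2data decodes the file in one piece, which is an assumption here.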

Example of the iconv error. The sequence position has varied.

iconv: illegal input sequence at position 2635110

Example of the code I was using.


from converter.pgn_data import PGNData

pgn_data = PGNData("multiplegames.pgn")
pgn_data.export()
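A possible workaround is to re-encode the file to UTF-8 before handing it to PGNData. This is a minimal sketch under assumptions: `reencode_to_utf8` is a hypothetical helper, and Windows-1252 (`cp1252`) is only a guess at the encoding of the offending lines, based on the 0x94 byte in the error. Each line is tried as UTF-8 first, which helps if the file is a concatenation of games saved in different encodings.

```python
def reencode_to_utf8(src, dst, fallback="cp1252"):
    """Copy src to dst as UTF-8, line by line.

    Lines that are already valid UTF-8 are kept as they are. Any other line is
    decoded with `fallback` (an assumed encoding); bytes the fallback cannot
    map become U+FFFD instead of raising an error.
    Returns the number of lines that needed the fallback.
    """
    fallback_lines = 0
    # newline="" keeps the original line endings untouched on output.
    with open(src, "rb") as fin, \
            open(dst, "w", encoding="utf-8", newline="") as fout:
        for raw in fin:
            try:
                text = raw.decode("utf-8")
            except UnicodeDecodeError:
                text = raw.decode(fallback, errors="replace")
                fallback_lines += 1
            fout.write(text)
    return fallback_lines


# Possible usage (output file name is illustrative):
#   reencode_to_utf8("multiplegames.pgn", "multiplegames_utf8.pgn")
```

The resulting file could then be passed to PGNData in place of the original. One caveat: a single line that mixes valid multi-byte UTF-8 with Windows-1252 bytes would be decoded entirely with the fallback, so accented names on that line may come out garbled.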
