Error message: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 147: invalid continuation byte
I can't train after loading these two datasets. The error persists even after I switch the encoding to "ISO-8859-1" or "latin-1".
After checking the train.txt file of the MRPC dataset, I found that the offending byte corresponds to the character "é". I edited train.txt and test.txt and reran preprocessing to regenerate train.tsv and test.tsv (I also verified that the regenerated files do not contain "é"). Training still fails with the same error.
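For anyone trying to reproduce this, here is a minimal sketch of how the offending bytes could be located directly instead of searching for a specific character. The helper name `find_bad_bytes` and the `context` parameter are made up for illustration and are not part of any library; the sketch only assumes the file is read in binary mode and checked against UTF-8.

```python
def find_bad_bytes(data: bytes, encoding: str = "utf-8", context: int = 20):
    """Return (offset, byte_value, surrounding_bytes) for each undecodable position.

    Hypothetical helper for illustration only.
    """
    problems = []
    start = 0
    while start < len(data):
        try:
            # Try to decode everything from the current position onward.
            data[start:].decode(encoding)
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so shift it back.
            offset = start + err.start
            lo = max(0, offset - context)
            problems.append((offset, data[offset], data[lo:offset + context]))
            # Resume scanning just past the byte that failed.
            start = offset + 1
    return problems


# Usage on a real file (the path is an assumption):
#     with open("train.tsv", "rb") as f:
#         for offset, value, nearby in find_bad_bytes(f.read()):
#             print(offset, hex(value), nearby)
```

Running this on every file the training script reads, not only train.tsv and test.tsv, should show which one still contains byte 0xd5. Since decoding as "latin-1" accepts every byte value and cannot raise this error, a traceback that still mentions the 'utf-8' codec suggests the failing read may come from a different `open()` call or a different (possibly cached) file than the one that was edited.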