This was my command: python photon.py -u "https://en.wikipedia.org/wiki/Tom_Crean_(explorer)" -l 2
and this was the output:
____ __ __
/ __ \/ /_ ____ / /_____ ____
/ /_/ / __ \/ __ \/ __/ __ \/ __ \
/ ____/ / / / /_/ / /_/ /_/ / / / /
/_/ /_/ /_/\____/\__/\____/_/ /_/ v1.3.2
Level 1: 1 URLs
Progress: 1/1
Level 2: 478 URLs
Progress: 478/478
Crawling 1 JavaScript files
Progress: 1/1
Traceback (most recent call last):
  File "photon.py", line 385, in <module>
    writer(datasets, dataset_names, output_dir)
  File "C:\Users\Tejaswa\Documents\GitHub\Photon\core\utils.py", line 85, in writer
    out_file.write(str(joined.encode('utf-8').decode('utf-8')))
  File "C:\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0142' in position 17758: character maps to <undefined>
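The failure seems reproducible outside Photon. Below is a minimal sketch, assuming the output file is opened without an explicit encoding so that Windows falls back to cp1252 (I have not confirmed this in core/utils.py). The sample string and the `try_encode` helper are made up for illustration. Note that `joined.encode('utf-8').decode('utf-8')` just round-trips back to the same str, so the character still has to be encoded by the file's own codec at write time.

```python
# Minimal sketch of the failure, independent of Photon.
# The sample text is hypothetical; it only needs to contain U+0142.
text = "Micha\u0142"  # U+0142 is LATIN SMALL LETTER L WITH STROKE


def try_encode(s, encoding):
    """Return the encoded bytes, or the exception if the codec cannot represent s."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


# cp1252 is a single-byte code page with no slot for U+0142, so this fails.
cp1252_result = try_encode(text, "cp1252")

# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_result = try_encode(text, "utf-8")

# The round trip in the traceback gives back the very same string.
round_trip = text.encode("utf-8").decode("utf-8")

print(type(cp1252_result).__name__)
print(utf8_result)
print(round_trip == text)
```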
I (think I) added a mapping in lib\encodings\cp1252.py by doing:
.
.
'\xff' # 0xFF -> LATIN SMALL LETTER Y WITH DIAERESIS
'\u0142' # 0xFF -> LATIN SMALL LETTER L WITH DIAERESIS
)
### Encoding table
encoding_table=codecs.charmap_build(decoding_table)
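But I doubt this is correct: the table's byte values already max out at \xff, so the extra entry has no byte of its own (and '\u0142' is actually LATIN SMALL LETTER L WITH STROKE, not a diaeresis)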
Is there any parameter to ignore such encoding problems that I can specify with photon itself? Or some underlying file to edit?
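To make the question concrete, here is a sketch of the kind of change I imagine would avoid the error, assuming the writer opens its output file with the platform default encoding. `write_lines` is a hypothetical stand-in, not Photon's actual `writer`, and the file name and sample lines are invented for the demo.

```python
import os
import tempfile


def write_lines(path, lines):
    """Hypothetical stand-in for a writer: join the lines and save them as UTF-8."""
    # Passing encoding="utf-8" avoids the platform default (cp1252 on many
    # Windows setups), which cannot represent characters such as U+0142.
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.write("\n".join(lines))


# Demo with made-up data in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "links.txt")
write_lines(demo_path, ["Micha\u0142", "plain ascii"])

# Read the file back with the same encoding to check nothing was lost.
with open(demo_path, encoding="utf-8") as in_file:
    saved = in_file.read()

print(saved == "Micha\u0142\nplain ascii")
```

If editing the source is not an option, Python 3.7 and later also have a UTF-8 mode (setting the environment variable PYTHONUTF8=1, or running with -X utf8) that makes open() default to UTF-8. I have not tried it with Photon, so treat that as a guess rather than a confirmed fix.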
Thanks