globchardet.py
## really hacky python script that estimates the encoding of every file (with an extension) in the current folder and its subfolders, using chardet and a tqdm progress bar: run at your own risk
## if you're working on a massive number of files, pipe the output to a text file from your shell, e.g.: python globchardet.py >> output.txt
## coincidentally a good example of how to add a tqdm progress bar to your script
## WTFPL by valahraban
import sys
import glob
import chardet
from tqdm import tqdm
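## optional variant, only a sketch and not used by the loop in this script: feed a big file
## to chardet in chunks instead of reading it whole. detect_in_chunks, chunk_size and the
## detector argument are made-up names for illustration; UniversalDetector is chardet's
## incremental detector, assumed to be available from the installed chardet
def detect_in_chunks(path, chunk_size=64 * 1024, detector=None):
    if detector is None:
        ## imported lazily so nothing changes unless this helper is actually called
        from chardet.universaldetector import UniversalDetector
        detector = UniversalDetector()
    with open(path, 'rb') as f:
        ## read fixed-size chunks until EOF, or stop early once the detector is confident
        for chunk in iter(lambda: f.read(chunk_size), b''):
            detector.feed(chunk)
            if detector.done:
                break
    ## close() finalizes the guess; the encoding may still be None
    detector.close()
    return detector.result['encoding']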
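for filename in tqdm(glob.glob("**/*.*", recursive=True)):
    try:
        with open(filename, 'rb') as f:
            rawdata = f.read()
    except OSError:
        ## the pattern also matches folders with a dot in their name; skip anything unreadable
        continue
    ## detect() returns a dict; 'encoding' is None when chardet can't make a guess
    result = chardet.detect(rawdata)
    charenc = result['encoding']
    print("Encoding of " + str(filename) + " is " + str(charenc))
## prompt on stderr so it stays visible (and out of the file) when stdout is piped
print("press return to exit", file=sys.stderr)
input()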