Description
On Windows, if you pass command-line arguments that contain non-ASCII text, Python 2 replaces those characters with question marks.
test2.py
import sys
with open('data.bin', 'wb') as binfile:
    for arg in sys.argv:
        binfile.write(arg)
        binfile.write(' ')
> python test2.py Русский текст
> type data.bin
E:\test2.py ??????? ?????
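The question marks are consistent with a lossy conversion from Unicode to a single-byte code page. The sketch below only imitates that effect in pure Python; `simulate_ansi_argv` is a hypothetical helper, and `cp1252` is an assumed example code page with no Cyrillic letters. The real conversion happens inside Windows before Python sees the arguments.

```python
# -*- coding: utf-8 -*-
# Illustration only: imitate a lossy Unicode -> ANSI code page conversion.
# simulate_ansi_argv is a hypothetical helper, not part of Python or Windows.

def simulate_ansi_argv(text, codepage="cp1252"):
    """Encode text to the given code page, replacing characters that the
    code page cannot represent with '?'."""
    return text.encode(codepage, "replace")

lossy = simulate_ansi_argv(u"Русский текст")
print(lossy)  # on Python 3 this prints b'??????? ?????'
```

Each Cyrillic letter becomes one `?`, while the ASCII space survives, which matches the shape of the output above.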
The solution is to monkeypatch sys.argv on Windows. The following function, based on this recipe, rebuilds sys.argv as a list of UTF-8 encoded strings:
import sys

def win32_utf8_argv():
    """Uses shell32.CommandLineToArgvW to rebuild sys.argv as a list of
    UTF-8 encoded strings.

    Versions 2.x of Python don't support Unicode in sys.argv on
    Windows, with the underlying Windows API instead replacing multi-byte
    characters with '?'.
    """
    from ctypes import POINTER, byref, c_int, windll
    from ctypes.wintypes import LPCWSTR, LPWSTR

    # Both functions use the stdcall convention, hence windll
    GetCommandLineW = windll.kernel32.GetCommandLineW
    GetCommandLineW.argtypes = []
    GetCommandLineW.restype = LPCWSTR

    CommandLineToArgvW = windll.shell32.CommandLineToArgvW
    CommandLineToArgvW.argtypes = [LPCWSTR, POINTER(c_int)]
    CommandLineToArgvW.restype = POINTER(LPWSTR)

    cmd = GetCommandLineW()
    argc = c_int(0)
    argv = CommandLineToArgvW(cmd, byref(argc))
    if argc.value > 0:
        # Remove Python executable and commands if present
        start = argc.value - len(sys.argv)
        return [argv[i].encode('utf-8') for i in
                range(start, argc.value)]
    # Keep the original arguments if the command line could not be parsed
    return sys.argv

if sys.platform == "win32":
    sys.argv = win32_utf8_argv()
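The `start = argc.value - len(sys.argv)` step can be checked without Windows. The full command line includes the interpreter and any interpreter options, while sys.argv starts at the script name, so keeping the last `len(sys.argv)` entries lines the two up. Below is a minimal portable sketch of that alignment; `align_with_sys_argv` and the sample command line are hypothetical and used only for illustration.

```python
# -*- coding: utf-8 -*-
# Portable sketch of the slicing logic used above. align_with_sys_argv is a
# hypothetical helper; full_argv stands in for the CommandLineToArgvW result.

def align_with_sys_argv(full_argv, sys_argv_len):
    """Keep the last sys_argv_len entries of the full command line and
    encode each of them as UTF-8."""
    start = len(full_argv) - sys_argv_len
    return [arg.encode("utf-8") for arg in full_argv[start:]]

# Assumed example: "python -u test2.py Русский текст"
full = [u"python", u"-u", u"test2.py", u"Русский", u"текст"]
result = align_with_sys_argv(full, 3)  # sys.argv would hold 3 entries here

# Decoding recovers the original text, so nothing was lost
print([arg.decode("utf-8") for arg in result])
```

Code that needs Unicode strings rather than UTF-8 bytes can decode each argument with `arg.decode('utf-8')` after the patch is applied.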