can't read non-ascii params from sys.argv on Windows #2

@techtonik


On Windows, if you pass command-line arguments containing non-ASCII text, Python 2 replaces the characters it cannot represent with question marks.

test2.py

import sys

with open('data.bin', 'wb') as binfile:
  for arg in sys.argv:
    binfile.write(arg)
    binfile.write(' ')

> python test2.py Русский текст
> type data.bin
E:\test2.py ??????? ?????
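
The question marks are what a lossy conversion to a narrow code page looks like. The sketch below reproduces the effect without touching `sys.argv`. It assumes the ANSI code page is cp1252, which the report does not state, and uses Python's `'replace'` error handler as a stand-in for the substitution Windows performs:

```python
# -*- coding: utf-8 -*-
# Sketch only: cp1252 is an assumed ANSI code page, not taken from the report.
text = u'Русский текст'

# Cyrillic letters have no mapping in cp1252, so each one becomes '?'.
lossy = text.encode('cp1252', 'replace')

# UTF-8 can represent every character, so the round trip is lossless.
utf8 = text.encode('utf-8')
roundtrip = utf8.decode('utf-8')
```

Here `lossy` contains the same `??????? ?????` pattern as `data.bin` above, while `roundtrip` equals the original text. That is why the workaround encodes the arguments as UTF-8.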

The solution is to monkeypatch sys.argv on Windows. The function below, based on this recipe, rebuilds sys.argv as UTF-8 encoded strings:

def win32_utf8_argv():
    """Uses shell32.GetCommandLineArgvW to get sys.argv as a list of Unicode
    strings.

    Versions 2.x of Python don't support Unicode in sys.argv on
    Windows, with the underlying Windows API instead replacing multi-byte
    characters with '?'.
    """

    from ctypes import POINTER, byref, c_int, windll
    from ctypes.wintypes import LPCWSTR, LPWSTR

    # kernel32 exports use the stdcall convention, so load them via windll
    GetCommandLineW = windll.kernel32.GetCommandLineW
    GetCommandLineW.argtypes = []
    GetCommandLineW.restype = LPCWSTR

    CommandLineToArgvW = windll.shell32.CommandLineToArgvW
    CommandLineToArgvW.argtypes = [LPCWSTR, POINTER(c_int)]
    CommandLineToArgvW.restype = POINTER(LPWSTR)

    cmd = GetCommandLineW()
    argc = c_int(0)
    argv = CommandLineToArgvW(cmd, byref(argc))
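    if argc.value > 0:
        # Skip the interpreter path and its options so the result lines up
        # with the original sys.argv
        start = argc.value - len(sys.argv)
        return [argv[i].encode('utf-8') for i in
                range(start, argc.value)]
    # Nothing was parsed: fall back to the unmodified arguments
    return sys.argv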

if sys.platform == "win32":
    sys.argv = win32_utf8_argv()
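
After the patch, sys.argv holds UTF-8 byte strings rather than text, so code that consumes the arguments has to decode them before comparing or printing. A minimal sketch of that step follows. The helper name `argv_as_text` is hypothetical and is not part of the recipe:

```python
# -*- coding: utf-8 -*-
def argv_as_text(args, encoding='utf-8'):
    """Return a copy of args with byte strings decoded to text.

    Hypothetical helper: entries that are already text pass through unchanged.
    """
    return [a.decode(encoding) if isinstance(a, bytes) else a for a in args]


# Simulated patched argv: one UTF-8 byte string and one value already text.
patched = [u'Русский'.encode('utf-8'), u'текст']
decoded = argv_as_text(patched)
```

Calling `argv_as_text(sys.argv[1:])` right after the monkeypatch would give the rest of the program text values to work with.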
