
subprocess.run("clip", ...) gives me the wrong output in the wrong language #126873

@JulianDominic


Bug report

Bug description:

I wrote the following code to convert a multi-line string into a single-line string.

It works as expected with other strings, but not with the following is_anagram C function.

import subprocess 

def split_newlines(string:str):
    return string.split("\n")

def get_num_tabs(string:str):
    count = 0
    tabs = 0
    for i in range(len(string)):
        if string[i] == ' ':
            count += 1
            if count % 4 == 0:
                tabs += 1
        else:
            break
    return tabs

raw =\
"""\
bool is_anagram(char *s1, char *s2) {
    // If they aren't the same length, they are clearly not anagrams
    if (strlen(s1) != strlen(s2)) {
        return false;
    }

    // Use two arrays to count the number of characters (either ALL lower or ALL upper)
    int arr1[26] = {0};
    int arr2[26] = {0};

    while (*s1 != '\0') {
        if (isalpha(*s1) && islower(*s1)) {
            arr1[*s1 - 'a'] += 1;
        }
        s1++;
    }
    while (*s2 != '\0') {
        if (isalpha(*s2) && islower(*s2)) {
            arr2[*s2 - 'a'] += 1;
        }
        s2++;
    }

    // If they have the same count, then they are the same
    for (int i = 0; i < 26; i++) {
        if (arr1[i] != arr2[i]) {
            return false;
        }
    }
    return true;
}\
"""

s = split_newlines(raw)
lines = [sen.strip() for sen in s]
tabs = [get_num_tabs(w) for w in s]
result = ""
for line, tab in zip(lines, tabs):
    result += r'\t' * tab + line + r'\n'
print(result)
subprocess.run("clip", text=True, input=result)

print(result) gives me the correct result:

bool is_anagram(char *s1, char *s2) {\n\t// If they aren't the same length, they are clearly not anagrams\n\tif (strlen(s1) != strlen(s2)) {\n\t\treturn false;\n\t}\n\n\t// Use two arrays to count the number of characters (either ALL lower or ALL upper)\n\tint arr1[26] = {0};\n\tint arr2[26] = {0};\n\n\twhile (*s1 != '') {\n\t\tif (isalpha(*s1) && islower(*s1)) {\n\t\t\tarr1[*s1 - 'a'] += 1;\n\t\t}\n\t\ts1++;\n\t}\n\twhile (*s2 != '') {\n\t\tif (isalpha(*s2) && islower(*s2)) {\n\t\t\tarr2[*s2 - 'a'] += 1;\n\t\t}\n\t\ts2++;\n\t}\n\n\t// If they have the same count, then they are the same\n\tfor (int i = 0; i < 26; i++) {\n\t\tif (arr1[i] != arr2[i]) {\n\t\t\treturn false;\n\t\t}\n\t}\n\treturn true;\n}\n
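One detail in the printed output may be relevant: the C literal `'\0'` shows up as `''`. This is most likely because `raw` is a regular (non-raw) triple-quoted string, so Python reads `\0` as an escape sequence for the NUL character, which most terminals do not display. The sketch below only illustrates that string behaviour; the variable names are illustrative and not part of the script above, and whether the NUL character is what triggers the clipboard problem is an assumption, not something confirmed here.

```python
# In a regular string literal, \0 is an escape sequence for the NUL character.
regular = """while (*s1 != '\0') {"""
# In a raw string literal, the backslash and the zero stay as two characters.
literal = r"""while (*s1 != '\0') {"""

print("\x00" in regular)            # True
print("\x00" in literal)            # False
print(len(literal) - len(regular))  # 1
```

Declaring `raw` with an `r` prefix would keep `\0` as written in the C source.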

However, what gets copied to my clipboard is Chinese-looking text mixed with other symbols:

潢汯椠彳湡条慲⡭档牡⨠ㅳ‬档牡⨠㉳
屻屮⽴ 晉琠敨⁹牡湥琧琠敨猠浡⁥敬杮桴‬桴祥愠敲挠敬牡祬渠瑯愠慮牧浡屳屮楴⁦猨牴敬⡮ㅳ
㴡猠牴敬⡮㉳⤩笠湜瑜瑜敲畴湲映污敳尻屮絴湜湜瑜⼯唠敳琠潷愠牲祡⁳潴挠畯瑮琠敨渠浵敢⁲景挠慨慲瑣牥⁳攨瑩敨⁲䱁⁌潬敷⁲牯䄠䱌甠灰牥尩屮楴瑮愠牲嬱㘲⁝‽ほ㭽湜瑜湩⁴牡㉲㉛崶㴠笠細尻屮屮睴楨敬⠠猪‱㴡✠✀
屻屮屴楴⁦椨慳灬慨⨨ㅳ
☦椠汳睯牥⨨ㅳ⤩笠湜瑜瑜瑜牡ㅲ⩛ㅳⴠ✠❡⁝㴫ㄠ尻屮屴絴湜瑜瑜ㅳ⬫尻屮絴湜瑜桷汩⁥⨨㉳℠‽'⤧笠湜瑜瑜晩⠠獩污桰⡡猪⤲☠…獩潬敷⡲猪⤲
屻屮屴屴慴牲嬲猪′‭愧崧⬠‽㬱湜瑜瑜屽屮屴獴⬲㬫湜瑜屽屮屮⽴ 晉琠敨⁹慨敶琠敨猠浡⁥潣湵ⱴ琠敨桴祥愠敲琠敨猠浡履屮晴牯⠠湩⁴⁩‽㬰椠㰠㈠㬶椠⬫
屻屮屴楴⁦愨牲嬱嵩℠‽牡㉲楛⥝笠湜瑜瑜瑜敲畴湲映污敳尻屮屴絴湜瑜屽屮牴瑥牵牴敵尻絮湜
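The clipboard text looks consistent with the script's single-byte output being read as UTF-16 little-endian, two bytes per character. The sketch below reproduces the first few garbled characters from the start of the expected text. It is an observation about the bytes, not a confirmed explanation of what `clip` does internally, and the variable names are illustrative.

```python
# Encode the start of the expected text as single-byte ASCII ...
expected_prefix = "bool is_"
# ... then reinterpret each pair of bytes as one UTF-16 little-endian code unit.
garbled_prefix = expected_prefix.encode("ascii").decode("utf-16-le")

print(garbled_prefix)  # 潢汯椠彳

# Reversing the reinterpretation recovers the original text.
print(garbled_prefix.encode("utf-16-le").decode("ascii"))  # bool is_
```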

I have tested this on Python 3.13.0 and Python 3.12.5

I have tested this on both Windows 10 and Windows 11.

I assume this is a Windows-only issue because when I tried to run it on WSL Ubuntu 24.04, it gave me an error: FileNotFoundError: [Errno 2] No such file or directory: 'clip'
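A possible workaround, sketched under assumptions: pass `clip` bytes that are already UTF-16 encoded instead of relying on `text=True`. The helper name `copy_to_clipboard` is hypothetical, and the claim that `clip` handles UTF-16 input with a byte-order mark is commonly reported but not verified in this report. The helper only calls `clip` when it is found on the PATH, so it avoids the FileNotFoundError seen on WSL.

```python
import shutil
import subprocess


def copy_to_clipboard(text: str) -> bytes:
    """Hypothetical helper: encode text as UTF-16 and pipe it to clip if present."""
    # "utf-16" prepends a byte-order mark, which removes any guessing
    # about the encoding on the receiving side (assumed to help clip).
    payload = text.encode("utf-16")

    clip_path = shutil.which("clip")
    if clip_path is not None:
        # input is bytes here, so text=True must not be passed.
        subprocess.run([clip_path], input=payload, check=True)

    # Return the payload so the encoding can be inspected or tested.
    return payload
```

In the script above, this would replace the final `subprocess.run("clip", text=True, input=result)` call with `copy_to_clipboard(result)`.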

I forgot to mention that I do not have any Chinese keyboards installed, only English (United States) US Keyboard and English (Singapore) US Keyboard.

CPython versions tested on:

3.12, 3.13

Operating systems tested on:

Windows
