non-readline PyOS_StdioReadline is buggy with long input when used with PyOS_InputHook #103929

@tacaswell

Bug report

To get this bug you must:

  • not be using the readline-based stdin code path
  • have imported a GUI toolkit that installs a PyOS_InputHook
  • try to enter strings into input that are 99 characters or longer
  • have stdin in line-buffered mode

The source of the problem is that in

while (1) {
    if (PyOS_InputHook != NULL) {
        (void)(PyOS_InputHook)();
    }
    errno = 0;
    clearerr(fp);
    char *p = fgets(buf, len, fp);
    if (p != NULL) {
        return 0; /* No error */
    }
    int err = errno;

which is called from

cpython/Parser/myreadline.c

Lines 304 to 322 in 385d8d2

do {
    size_t incr = (n > 0) ? n + 2 : 100;
    if (incr > INT_MAX) {
        PyMem_RawFree(p);
        PyEval_RestoreThread(tstate);
        PyErr_SetString(PyExc_OverflowError, "input line too long");
        PyEval_SaveThread();
        return NULL;
    }
    pr = (char *)PyMem_RawRealloc(p, n + incr);
    if (pr == NULL) {
        PyMem_RawFree(p);
        PyEval_RestoreThread(tstate);
        PyErr_NoMemory();
        PyEval_SaveThread();
        return NULL;
    }
    p = pr;
    int err = my_fgets(tstate, p + n, (int)incr, sys_stdin);

We get the following sequence of events:

  • user calls input
  • my_fgets calls the input hook which blocks until stdin reports ready to read
  • fgets reads up to the first 99 characters and my_fgets returns
  • if the input is longer than 99 characters (including the newline), then the last character read will not be the newline and the calling loop will call my_fgets again
  • the input hook will be called again, but because there is no new user input (just the remaining characters from the last pass) the input hook blocks
  • if the user hits enter a second time, the input hook returns and fgets reads up to the original newline
  • the second newline is still in the stdin buffer and comes out immediately the next time input is called
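
The sequence above can be reproduced with a toy model in pure Python. This is only a sketch under stated assumptions, not CPython's implementation: FakeStdin and read_line are hypothetical names, and the model assumes that the input hook can only see data still pending on the file descriptor, while fgets is served from an stdio buffer that already holds the rest of the line. The chunk sizes (100 first, then n + 2) mirror the excerpt above.

```python
class FakeStdin:
    """Toy stand-in for a line-buffered stdin (hypothetical, for illustration)."""

    def __init__(self):
        self.fd = ""   # typed text not yet pulled into the stdio buffer
        self.buf = ""  # stdio buffer contents, invisible to the input hook

    def type(self, text):
        self.fd += text

    def hook_would_block(self):
        # Assumption: the hook waits until the descriptor itself is readable.
        return self.fd == ""

    def fgets(self, size):
        # Refill the buffer with one line when it is empty.
        if not self.buf:
            nl = self.fd.find("\n")
            cut = len(self.fd) if nl == -1 else nl + 1
            self.buf, self.fd = self.fd[:cut], self.fd[cut:]
        # Like C fgets: at most size - 1 characters, stopping after a newline.
        limit = size - 1
        nl = self.buf.find("\n")
        end = limit if nl == -1 or nl >= limit else nl + 1
        chunk, self.buf = self.buf[:end], self.buf[end:]
        return chunk


def read_line(stdin, user_actions):
    """Mimic the chunked read loop; return (line, times the hook had to wait)."""
    line = ""
    hook_waits = 0
    while True:
        incr = len(line) + 2 if line else 100
        if stdin.hook_would_block():
            # The hook blocks here until the user types something.
            stdin.type(next(user_actions))
            hook_waits += 1
        line += stdin.fgets(incr)
        if line.endswith("\n"):
            return line, hook_waits


stdin = FakeStdin()
short = read_line(stdin, iter(["a" * 98 + "\n"]))
print(len(short[0]), short[1])   # 99 characters, one wait: works

stdin = FakeStdin()
long = read_line(stdin, iter(["a" * 104 + "\n", "\n"]))
print(len(long[0]), long[1])     # 105 characters, two waits: extra enter needed

leftover = read_line(stdin, iter([]))
print(repr(leftover[0]), leftover[1])  # '\n' with zero waits: the "skipped" input
```

Under the same assumption, a 250-character line also needs only one extra enter in this model: once the second newline is pending on the descriptor, later hook calls return immediately.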

Possible flaws in my understanding:

  • I am not clear why the next input call returns immediately rather than getting stuck in the input hook
  • entries that are multiples of 100 do not require multiple extra enters

This script demonstrates the problem:

import string
import sys
from tkinter import Tk
from tkinter import ttk


def run():
    """
    This sets up a minimal tk application that has enough functionality
    to verify it is "live" and the inputhook is running while waiting for
    user input.
    """
    root = Tk()
    frm = ttk.Frame(root, padding=10)
    frm.grid()
    lbl = ttk.Label(frm, text="push count = 0")
    lbl.grid(column=0, row=0)

    j = 0

    def set_label():
        nonlocal j
        j += 1
        lbl["text"] = f"push count = {j}"

    ttk.Button(frm, text="Push me!", command=set_label).grid(column=1, row=0)

    return root, frm


run()

print("This is a demo of a bug in the non-readline based stdio code\n\n")
print(f"The readline module is not loaded: {'readline' in sys.modules=}")

test_string = (string.ascii_lowercase + string.ascii_uppercase) * 2

print(
    f"""
We are using the test string :

\t{test_string}

as it is easy to eyeball the length (it is 104 characters long in 26
character blocks).

You should see a tk window with a button that says "Push me!" and a counter.
Pushing the button should increment the counter.

Follow the instructions to demonstrate the bug.


"""
)

print(
    f"""
To see a case where it works paste

{test_string[:10]}

into the prompt below.  Before hitting return, try pushing
the button on the UI to verify that the inputhook is running.
"""
)

a = input("paste here >> ")
print(f"You pasted {a}")


print(
    f"""
To see it fail paste the full string

{test_string}

into the prompt below (you will have to hit enter twice)
"""
)

a = input("paste here >> ")
print(f"You pasted {a}")

print("There is still a newline in the buffer, this input will be 'skipped'\n")
a = input("you can not input here >> ")
print(f"we got an empty string!: {a=!r} (also note no new line in stdout)")


print(
    f"""

you can now play with it or ctrl-d to exit.

The longest string that works is (98 letters + new line):

{test_string[:98]}

"""
)

while True:
    print("\n")
    a = input("test input >> ")
    print(f"what you entered: {a=}")

This needs to be run as python demo.py, not pasted into an interactive shell, because the code paths that rely on readline work correctly. Running it as python -uu demo.py also works correctly.

Your environment

  • CPython versions tested on: 3.10.10, 3.11.3, 3.9+
  • Operating system and architecture: (arch) linux x86, OSX

This was originally reported via matplotlib/matplotlib#25756 where you can see my notes as I sorted this out.

Based on the code paths I expect this to not be reproducible on Windows.

I think this bug goes back to at least 717c6f9 so I expect all currently supported versions of Python to be affected.

I will shortly open a PR with a proposed fix.

Labels: 3.10 (only security fixes), 3.11 (only security fixes), 3.12 (only security fixes), type-bug (an unexpected behavior, bug, or error)
