asyncio's Process.communicate() is unsafe to cancel #139373

@uhx

Bug report

Bug description:

The documentation for asyncio.subprocess.Process says:

the communicate() and wait() methods don’t have a timeout parameter: use the wait_for() function;

However, cancelling the communicate() future (which is what wait_for() does on timeout) may lose stdout/stderr data. The documentation does not make this clear, and it is not user-friendly if the user later calls communicate() again to collect the whole output.

With the synchronous API, I would typically write:

import sys
import subprocess


process = subprocess.Popen(
    args=[
        sys.executable,
        '-c',
        'import time; print("first", flush=True); time.sleep(3); print("second")',
    ],
    stdout=subprocess.PIPE)
try:
    [stdout, _] = process.communicate(timeout=1)
except subprocess.TimeoutExpired:
    process.kill()
    [stdout, _] = process.communicate()
    print(f"Timed out, {stdout=}")

output:

Timed out, stdout=b'first\r\n'

This gives me everything the process wrote before it was killed. I think this is a common way to use the subprocess API.

Doing the same with asyncio usually loses output:

import asyncio
import sys

async def main():
    process = await asyncio.create_subprocess_exec(
        sys.executable,
        '-c',
        'import time; print("first", flush=True); time.sleep(3); print("second")',
        stdout=asyncio.subprocess.PIPE,
        )
    try:
        [stdout, _] = await asyncio.wait_for(process.communicate(), timeout=1)
    except asyncio.TimeoutError:
        process.kill()
        [stdout, _] = await process.communicate()
        print(f"Timed out, {stdout=}")


if __name__ == '__main__':
    asyncio.run(main())

output:

Timed out, stdout=b''
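
One possible workaround, sketched here rather than proposed as the fix: run communicate() in its own task and wait on it with asyncio.wait(), which, unlike wait_for(), does not cancel the task when the timeout expires. After killing the process, the same task is awaited to completion, so nothing it has already read is discarded. The helper name communicate_with_timeout and its return shape are made up for illustration.

```python
import asyncio
import sys


async def communicate_with_timeout(process, timeout):
    """Hypothetical helper: returns (stdout, stderr, timed_out)."""
    # Run communicate() in its own task so that a timeout never cancels it.
    task = asyncio.create_task(process.communicate())
    # asyncio.wait() returns when the timeout expires but leaves the task running.
    done, _pending = await asyncio.wait({task}, timeout=timeout)
    timed_out = not done
    if timed_out:
        process.kill()
    # Await the same task, which still holds everything it has read so far.
    stdout, stderr = await task
    return stdout, stderr, timed_out


async def main():
    process = await asyncio.create_subprocess_exec(
        sys.executable,
        '-c',
        'import time; print("first", flush=True); time.sleep(3); print("second")',
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _, timed_out = await communicate_with_timeout(process, timeout=1)
    print(f"{timed_out=}, {stdout=}")
    return stdout, timed_out


if __name__ == '__main__':
    asyncio.run(main())
```

With the child process from the example above, this should report a timeout and still keep the "first" line. Note that if the caller itself is cancelled while waiting, the inner task keeps running and would need its own cleanup.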

I looked through the source code and don't see a clear solution other than adding a timeout parameter to Process.communicate(). Cancellation may occur right between these two calls:

stdin, stdout, stderr = await tasks.gather(stdin, stdout, stderr)
await self.wait()

so the transport buffer has already been drained, but the process has not yet been waited for.
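
Assuming the loss comes from data that has been pulled out of the stream but lives only inside the cancelled call, another option is to bypass communicate() and read process.stdout in chunks into a buffer owned by the caller. The sketch below is illustrative only: drain is a hypothetical helper, and it handles stdout alone. With stdin or stderr piped as well, the streams would need to be serviced concurrently to avoid deadlocks.

```python
import asyncio
import sys


async def drain(stream, buffer):
    # Append each chunk to a caller-owned buffer, so data that was already
    # read survives cancellation of this coroutine.
    while chunk := await stream.read(4096):
        buffer.extend(chunk)


async def main():
    process = await asyncio.create_subprocess_exec(
        sys.executable,
        '-c',
        'import time; print("first", flush=True); time.sleep(3); print("second")',
        stdout=asyncio.subprocess.PIPE,
    )
    collected = bytearray()
    try:
        await asyncio.wait_for(drain(process.stdout, collected), timeout=1)
    except asyncio.TimeoutError:
        process.kill()
        # Resume reading until EOF; earlier chunks are still in `collected`.
        await drain(process.stdout, collected)
    await process.wait()
    print(f"stdout={bytes(collected)!r}")
    return bytes(collected)


if __name__ == '__main__':
    asyncio.run(main())
```

The trade-off is that the caller takes over the work communicate() normally does, in exchange for keeping partial output across a timeout.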

PS: I'm not sure I chose the right issue type, since this could be a bug, a documentation issue, or an enhancement proposal.

CPython versions tested on:

3.11, 3.13

Operating systems tested on:

No response

Labels: stdlib, topic-asyncio, type-bug

Status: Done
