System.Cmd.Utils.pipeBoth subject to pipeline stalls with ghc 7 #7

@nomeata

Hi John,

This issue tracks the bug Joey reported at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=624389, so that it is not forgotten. For your convenience, here is the original report:

For quite a while I have been using missingh's pipeBoth with success; but as
soon as my program was rebuilt with ghc 7, it started stalling when large
quantities of data needed to be passed through the pipe.

Here is a simple test case. It needs to run in a git repository.

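import System.Cmd.Utils (pipeBoth)

-- Ask git for the "blah" attribute of 100000 file names and print each reply.
main :: IO ()
main = do
        as <- checkAttr "blah" $ map show [1..100000]
        mapM_ (putStrLn . show) as

-- Feed the file names to `git check-attr` on stdin and return its output lines.
checkAttr :: String -> [String] -> IO [String]
checkAttr attr files = do
        (_, s) <- pipeBoth "git" params $ unlines files
        return $ lines s
        where
                params = ["check-attr", attr, "--stdin"]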

It queries git for attribute values for 100000 files. With ghc 6, it
should run to completion. With ghc 7, it stalls, deadlocked, after
a varying number of files, under 1000:

select(2, [], [1], NULL, {0, 0})        = 1 (out [1], left {0, 0})
write(1, "\"701: blah: unspecified\"\n", 25"701: blah: unspecified") = 25
select(2, [], [1], NULL, {0, 0})        = 1 (out [1], left {0, 0})
write(1, "\"702: blah: unspecified\"\n", 25"702: blah: unspecified") = 25
select(2, [], [1], NULL, {0, 0})        = 1 (out [1], left {0, 0})
write(1, "\"703: blah: unspecified\"\n", 25"703: blah: unspecified") = 25
select(2, [], [1], NULL, {0, 0})        = 1 (out [1], left {0, 0})
write(1, "\"704: blah: unspecified\"\n", 25"704: blah: unspecified") = 25
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
gettimeofday({1303958431, 174266}, NULL) = 0
select(7, [], [6], NULL, {0, 0})        = 1 (out [6], left {0, 0})
write(6, "345\n15346\n15347\n15348\n15349\n1535"..., 8096

The program is blocked trying to write to git-check-attr, and
git-check-attr is in turn blocked waiting for its output to be read.

I'm skipping over missingh and filing this bug directly on ghc because
I think it's unlikely missingh is at fault. IIRC, pipeBoth works by
sparking off a helper thread, which is used to write input to a command.
Unless it made a bad assumption about that being a safe thing to do,
this must be a bug in GHC?
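
For reference, the helper-thread pattern described above can be sketched with System.Process from the process package. This is only an illustration under stated assumptions: pipeThrough is a hypothetical name, it is not MissingH's pipeBoth, and it is not a confirmed fix for the stall reported here. It shows a writer thread feeding the child's stdin while the calling thread drains the child's stdout, so neither side waits on a full pipe.

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (evaluate)
import System.IO (hClose, hGetContents, hPutStr)
import System.Process
  ( CreateProcess (std_in, std_out)
  , StdStream (CreatePipe)
  , createProcess
  , proc
  , waitForProcess
  )

-- | Hypothetical helper (not part of MissingH): run a command, write
-- 'input' to its stdin from a helper thread, and return all of its stdout.
pipeThrough :: FilePath -> [String] -> String -> IO String
pipeThrough cmd args input = do
  (Just hin, Just hout, _, ph) <-
    createProcess (proc cmd args) { std_in = CreatePipe, std_out = CreatePipe }
  -- Writer thread: push the input, then close stdin so the child sees EOF.
  _ <- forkIO $ hPutStr hin input >> hClose hin
  -- Reader: hGetContents is lazy, so force the whole output before waiting.
  output <- hGetContents hout
  _ <- evaluate (length output)
  _ <- waitForProcess ph
  return output
```

With this sketch, the test case's checkAttr could call pipeThrough "git" ["check-attr", attr, "--stdin"] (unlines files) and take lines of the result.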

FWIW, I have worked around this in my code by forking a process, not a
thread, to do the writing. Which works fine, just a little more
heavyweight than needed. I'm concerned about all the other potential
callers of pipeBoth out there, however.
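
A workaround of the kind described above might look roughly like the following. It is a guess at the shape of such code, not Joey's actual workaround: pipeBothForked is a hypothetical name, and the sketch assumes a POSIX system with the unix package available. A separate process, rather than a thread, writes the input to the command.

```haskell
import Control.Exception (evaluate)
import System.IO (hClose, hGetContents, hPutStr)
import System.Posix.IO
  (closeFd, createPipe, dupTo, fdToHandle, stdInput, stdOutput)
import System.Posix.Process (executeFile, forkProcess, getProcessStatus)

-- | Hypothetical helper: like a pipeBoth that returns only the output,
-- but the input is written by a forked process instead of a thread.
pipeBothForked :: FilePath -> [String] -> String -> IO String
pipeBothForked cmd args input = do
  (inRead, inWrite) <- createPipe    -- carries input to the command
  (outRead, outWrite) <- createPipe  -- carries the command's output back
  -- Child 1: wire the pipes to stdin/stdout and exec the command.
  cmdPid <- forkProcess $ do
    _ <- dupTo inRead stdInput
    _ <- dupTo outWrite stdOutput
    mapM_ closeFd [inRead, inWrite, outRead, outWrite]
    executeFile cmd True args Nothing
  -- Child 2: the writer. It may block on a full pipe without stalling us.
  writerPid <- forkProcess $ do
    mapM_ closeFd [inRead, outRead, outWrite]
    h <- fdToHandle inWrite
    hPutStr h input
    hClose h
  -- Parent: keep only the read end of the output pipe.
  mapM_ closeFd [inRead, inWrite, outWrite]
  hout <- fdToHandle outRead
  output <- hGetContents hout
  _ <- evaluate (length output)      -- force the lazy read to completion
  _ <- getProcessStatus True False cmdPid
  _ <- getProcessStatus True False writerPid
  return output
```

The extra process costs a fork per call, which matches the "more heavyweight than needed" remark above.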
