lib/posix/spawn.rb (4 additions, 0 deletions)

@@ -337,6 +337,10 @@ class MaximumOutputExceeded < StandardError
class TimeoutExceeded < StandardError
end

# Exception raised when output streaming is aborted early.
class Aborted < StandardError
Comment from the author:
This could maybe be CallerAborted or something more specific?

Reply:
I'm fine with either.

end

private

# Turns the various varargs incantations supported by Process::spawn into a
Expand Down
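The new Aborted exception above signals an early abort from a streaming callback. A minimal stdlib-only sketch of the intended control flow (a local stand-in class and a hypothetical `stream_output` helper; the real exception lives in POSIX::Spawn and is raised from the Child read loop):

```ruby
require 'stringio'

# Local stand-in for POSIX::Spawn::Aborted (illustrative only).
class Aborted < StandardError; end

# Hypothetical helper mirroring the streaming contract: each chunk is
# handed to the caller's block, and a truthy return aborts the read early.
def stream_output(io, &block)
  loop do
    chunk = io.readpartial(1024)
    raise Aborted if block.call(chunk)
  end
rescue EOFError
  io.close
end

seen = +''
aborted = false
begin
  stream_output(StringIO.new('x' * 4096)) do |chunk|
    seen << chunk
    seen.bytesize >= 2048 # abort once we've read enough
  end
rescue Aborted
  aborted = true
end
```

The caller rescues Aborted to distinguish an intentional early stop from a normal end-of-stream.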
lib/posix/spawn/child.rb (30 additions, 2 deletions)

@@ -95,6 +95,13 @@ def initialize(*args)
@options[:pgroup] = true
end
@options.delete(:chdir) if @options[:chdir].nil?
@streaming = false
if streams = @options.delete(:streams)
@stdout_block = streams[:stdout]
@stderr_block = streams[:stderr]

@streaming = !!@stdout_block || !!@stderr_block
end
exec! if !@options.delete(:noexec)
end

@@ -244,14 +251,35 @@ def read_and_write(input, stdin, stdout, stderr, timeout=nil, max=nil)

# read from stdout and stderr streams
ready[0].each do |fd|
buf = (fd == stdout) ? @out : @err
chunk = nil
begin
buf << fd.readpartial(BUFSIZE)
chunk = fd.readpartial(BUFSIZE)
Comment:
If I'm understanding this right, the old code would always append directly into the final buffer, whereas this one reads a chunk and then appends that chunk to the buffer. Not knowing anything about how Ruby operates under the hood, is this a potential performance problem? It should just be an extra memcpy in the worst case, but I recall that we've hit bottlenecks on reading into Ruby before. I suspect those were mostly about arrays and not buffers, though (e.g., reading each line into its own buffer can be slow).

Reply from the author:
I could be wrong here (cc @tenderlove) but I'm pretty sure the previous code would actually create a new string (down inside readpartial), then that string would have been appended to buf. Which would require potentially resizing that buffer first, then the memcpy.

This new code just keeps that string as a local var first, so we can later determine where to write it. In the default case we're using a StringIO so the result is essentially the same as before (potential buffer resize then a copy). Though iirc we saw some pretty significant speedups by using an array to keep track of chunks, then calling join at the end when it was all needed. The reason for that is that it avoids reallocating the resulting buffer as we're reading, and instead lets join allocate a buffer exactly the size that's needed, then copy all the chunks into it.

Basically this (the old way):

buffer = ""
buffer << "one,"
buffer << "two,"
buffer << "three"

return buffer

vs this (the array optimized version I just mentioned):

buffer = []
buffer << "one,"
buffer << "two,"
buffer << "three"

# buffer is an array with 3 elements at this point
# and this join call figures out how big all of the strings inside are, then creates a single buffer to append them to.
return buffer.join

Using that approach efficiently may change the API contract here slightly though...
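The two approaches sketched above can be written out as a runnable comparison (chunk contents are illustrative; the real read loop appends BUFSIZE-sized chunks):

```ruby
chunks = ['one,', 'two,', 'three']

# Old approach: append each chunk into one growing String.
# Each << may force a resize of the destination buffer plus a memcpy.
appended = +''
chunks.each { |c| appended << c }

# Array-optimized approach: collect the chunks, then join once.
# Array#join sums the chunk sizes first, so the final buffer is
# allocated exactly once before the chunks are copied in.
collected = []
chunks.each { |c| collected << c }
joined = collected.join
```

Both produce the same string; the difference is only in how many times the destination buffer may be reallocated along the way.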

Reply:
@brianmario Ah, right, that sort of return value optimization would be pretty easy to implement, and would mean we end up with the same number of copies. Though if we're just counting memcpys anyway, I suspect it doesn't matter much either way.

> The reason for that is because it avoids the reallocation of the resulting buffer as we're reading, and instead allows join to allocate a buffer exactly the size that's needed then copying all the chunks in to it.

Interesting. It sounds like appending doesn't grow the buffer aggressively in that case, because you should be able to get amortized constant-time.

Anyway. We're well out of my level for intelligent discussion of Ruby internals. The interesting result is whether reading the output of a spawned cat some-gigabyte-file is measurably any different. Probably not, but it's presumably easy to test.
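The measurement suggested above can be sketched with stdlib IO.popen: time reading the output of a spawned cat in readpartial chunks. A 1 MiB temp file stands in for the gigabyte file, and the 32 KiB BUFSIZE is assumed for illustration:

```ruby
require 'tempfile'
require 'benchmark'

BUFSIZE = 32 * 1024

# A small stand-in for the "gigabyte file" in the comment above.
file = Tempfile.new('spawn-bench')
file.write('a' * (1024 * 1024)) # 1 MiB of data
file.flush

out = +''
elapsed = Benchmark.realtime do
  # Spawn cat and drain its stdout the same way the read loop does.
  IO.popen(['cat', file.path]) do |io|
    begin
      loop { out << io.readpartial(BUFSIZE) }
    rescue EOFError
    end
  end
end
```

Swapping the `out << …` body for a collect-then-join variant and comparing `elapsed` at realistic sizes would answer whether the extra copy is measurable.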

rescue Errno::EAGAIN, Errno::EINTR
rescue EOFError
readers.delete(fd)
fd.close
end

abort = false
if chunk
if fd == stdout
if @streaming && @stdout_block
abort = !!@stdout_block.call(chunk)
else
@out << chunk
end
Comment from the author:
I decided to just drop returning these in an attempt at consistency, since one or both of these ivars are useless if we're streaming.

else
if @streaming && @stderr_block
abort = !!@stderr_block.call(chunk)
else
@err << chunk
end
end
end
Comment from the author:
This whole block is pretty gross, but the alternative may involve being tricky (and less readable?) with Ruby.

Reply:

This might be the "tricky" Ruby you're talking about, but it seems to me that when streaming is not requested, you could set @stdout_block anyway, to

Proc.new do |chunk|
  @out << chunk
  false
end

(like you do for the tests) and do the equivalent for @stderr_block. Then you could avoid the inner conditionals here.

To shrink the code even further, you could set

@blocks = { stdout => @stdout_block, stderr => @stderr_block }

(in which case you wouldn't even need @stdout_block and @stderr_block anymore, but you get the idea) then this whole processing code could become

if @blocks[fd].call(chunk)
  raise Aborted
end
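That suggestion can be sketched end to end with stdlib StringIO objects standing in for the child's pipes (all names here — blocks, out_buf, err_buf — are illustrative, not the library's):

```ruby
require 'stringio'

# Stand-ins for the child's stdout/stderr file descriptors.
stdout = StringIO.new('out data')
stderr = StringIO.new('err data')

# When streaming is not requested, install default blocks that append
# into the capture buffers and never abort (return false).
out_buf = +''
err_buf = +''
blocks = {
  stdout => proc { |chunk| out_buf << chunk; false },
  stderr => proc { |chunk| err_buf << chunk; false },
}

# The read loop then needs no per-stream conditionals at all.
[stdout, stderr].each do |fd|
  begin
    loop do
      chunk = fd.readpartial(4096)
      raise 'Aborted' if blocks[fd].call(chunk)
    end
  rescue EOFError
  end
end
```

The trade-off is one Proc call per chunk even in the non-streaming case, in exchange for a single uniform code path.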


if @streaming && abort
Reply:
I think @streaming && is redundant here, since when @streaming is not set, abort retains its initial value, false.

raise Aborted
Comment from the author:
I don't love raising here, but it enforces proper cleanup (and killing the subprocess) up in the rescue Object handler.

end
end

# keep tabs on the total amount of time we've spent here
test/test_child.rb (59 additions, 0 deletions)

@@ -224,6 +224,65 @@ def test_utf8_input_long
assert p.success?
end

def test_streaming_stdout
stdout_buf = ""
stdout_stream = Proc.new do |chunk|
stdout_buf << chunk

false
end

input = "hello!"
p = Child.new('cat', :input => input, :streams => {
:stdout => stdout_stream
})

assert p.success?
assert_equal input, stdout_buf
end

def test_streaming_stderr
stderr_buf = ""
stderr_stream = Proc.new do |chunk|
stderr_buf << chunk

false
end

p = Child.new('ls', '-?', :streams => {
:stderr => stderr_stream
})

refute p.success?
refute stderr_buf.empty?
end

def test_streaming_stdout_aborted
stdout_stream = Proc.new do |chunk|
true
end

input = "hello!"
assert_raises POSIX::Spawn::Aborted do
p = Child.new('cat', :input => input, :streams => {
:stdout => stdout_stream
})
end
end

def test_streaming_stderr_aborted
stderr_stream = Proc.new do |chunk|
true
end

input = "hello!"
assert_raises POSIX::Spawn::Aborted do
p = Child.new('ls', '-?', :streams => {
:stderr => stderr_stream
})
end
end

Comment:

There are no tests that involve reading more than one BUFSIZE worth of output, or reading from both stdout and stderr. Those might be worthwhile additions.

Reply from @mclark (Jan 3, 2017):

~~We should also add a test for passing in a minimal custom object, just to ensure the interface contract is maintained. May be a good time to use a spy.~~
Edit: no longer applicable as we are using Procs now.
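The multi-BUFSIZE case the comment asks for can be exercised without the gem itself. This stdlib-only sketch pushes more than one BUFSIZE of data through a pipe and collects it in readpartial chunks, the same shape a streaming Child test would take (the 32 KiB BUFSIZE is assumed for illustration; the real constant lives in the library):

```ruby
BUFSIZE = 32 * 1024
total = BUFSIZE * 3 + 123 # deliberately larger than one read

rd, wr = IO.pipe

# Writer thread stands in for the child process producing output.
writer = Thread.new do
  written = 0
  piece = 'x' * 8192
  while written < total
    n = [piece.bytesize, total - written].min
    written += wr.write(piece[0, n])
  end
  wr.close
end

# Reader collects chunks exactly as a streaming callback would see them.
chunks = []
begin
  loop { chunks << rd.readpartial(BUFSIZE) }
rescue EOFError
  rd.close
end
writer.join
```

A real test_child.rb version would assert that the reassembled chunks equal the full input, and a sibling test would stream stdout and stderr simultaneously.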

##
# Assertion Helpers
