Skip to content

Recording time within a Fiber for performance measurement #428

@samdoiron

Description

@samdoiron

When testing performance, it is helpful to be able to measure how much time some code spent doing various types of work. When running only a single Fiber, the difference in Process::CLOCK_THREAD_CPUTIME_ID timestamps tells you the CPU time that the running code consumed. When running multiple Fibers, however, this strategy fails; a call out to I/O will yield to the scheduler, which may then swap to another Fiber that consumes its own CPU time. Once the original Fiber is resumed, the difference in timestamps will include the CPU time for both Fibers. This issue is worse with multiple Fibers than it would be with multiple threads alone because Fibers are not preëmpted, so the added CPU time could be arbitrarily high.

In addition to CPU time, it is often informative to measure the time spent waiting on I/O. Given that the async reactor is (ideally) called each time I/O waiting is performed, Async could provide a measure of I/O wait time in addition to CPU time.

My specific use case for this it to able to run two separate types of requests on a single Falcon web server, and report which of them consumers more CPU time.

An ideal API would look something like this:

start = Async::Instant.now(:microsecond)
# arbitrary code
time_taken = start.elapsed # equivalent to now - start

printf("CPU time: %d µs\n", time_taken.cpu)   # Time spent on the CPU dedicated to this task
printf("I/O: %d µs\n", time_taken.io)         # Time spent waiting on I/O
printf("total: %d µs\n", time_taken.total)    # Total time spent working on the task, subset of wall time

Complexities

One difficulty I can see here is how to handle "child" tasks. For example:

start = Async::Instant.now(:microsecond)
Sync do |parent|
  10.times { parent.async { cpu_heavy_work } }
end
time_taken = clock.elapsed

In a HTTP-server use-case, time_taken should ideally charge the time spent in the child tasks to the parent, since it is "responsible" for them. I'm not sure if the concept of task ownership is a natural one in the system today.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions