Degraded scaling for the FIFO scheduler

As we discussed on Discord, I recently ran a benchmark for parallel map on trees. It makes for an interesting embarrassingly parallel (recursive) task. The core function is `map_par_full`, shown below:
```ocaml
open Picos

type 'a tree = N of 'a * 'a tree * 'a tree | L

(* Run [e] in a fresh fiber; the returned computation holds its result. *)
let async e =
  let p = Computation.create () in
  let fiber = Fiber.create ~forbid:false p in
  Fiber.spawn fiber (fun _ -> Computation.return p @@ e ());
  p

(* Spawn one fiber per node; each node awaits both of its subtrees. *)
let rec map_par_full f t = async @@ fun () -> match t with
  | L -> L
  | N (v, tl, tr) ->
    let task_tl = map_par_full f tl in
    let task_tr = map_par_full f tr in
    N (f v,
       Computation.await task_tl,
       Computation.await task_tr)
```
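
For comparison, a sequential version of the same traversal (no `async`, no fibers) might look like the sketch below. This is not the actual benchmark code: `map_seq`, `build`, and `size` are hypothetical names introduced here for illustration, and `build` assumes a balanced tree, which may not match the shape used in the measurements.

```ocaml
type 'a tree = N of 'a * 'a tree * 'a tree | L

(* Sequential map: same traversal as map_par_full, but without fibers. *)
let rec map_seq f t =
  match t with
  | L -> L
  | N (v, tl, tr) ->
    let v' = f v in
    let tl' = map_seq f tl in
    let tr' = map_seq f tr in
    N (v', tl', tr')

(* Hypothetical helper: a balanced tree with exactly [n] nodes, all labelled [x]. *)
let rec build n x =
  if n <= 0 then L
  else
    let left = (n - 1) / 2 in
    N (x, build left x, build (n - 1 - left) x)

(* Number of nodes in a tree. *)
let rec size = function
  | L -> 0
  | N (_, tl, tr) -> 1 + size tl + size tr
```

With these definitions, `map_seq f (build 10000 x)` applies `f` to each of the 10000 nodes on a single fiber.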

I ran this with Picos' and Moonpool's schedulers. Here are the results for a tree of size `10000` with medium-sized tasks on an Intel Xeon E5-2630 (so, 12 cores * 2 with hyperthreading). Time is normalized to a non-async baseline.

<img width="640" height="480" alt="Image" src="https://github.com/user-attachments/assets/a9a6fe70-e953-4cab-84f4-42dff605c449" />

The three immediate conclusions are that
1. Multithreading is crap on this workload (welp)
2. Work stealing is great, congrats!
3. The scaling for Moonpool's FIFO scheduler is quite poor, and much worse than the multififo scheduler from Picos, which should behave similarly.

cc https://github.com/ocaml-multicore/picos/issues/374 on the Picos bug tracker.
