-
Notifications
You must be signed in to change notification settings - Fork 4
Open
Description
As we discussed on discord, I run a benchmark recently for parallel map on trees. It makes for an interesting embarrassingly parallel (recursive) task. The core function is the map_par_full function below:
open Picos
type 'a tree = N of 'a * 'a tree * 'a tree | L
let async e =
let p = Computation.create () in
let fiber = Fiber.create ~forbid:false p in
Fiber.spawn fiber (fun _ -> Computation.return p @@ e ());
p
let rec map_par_full f t = async @@ fun () -> match t with
| L -> L
| N (v , tl , tr ) ->
let task_tl = map_par_full f tl in
let task_tr = map_par_full f tr in
N (f v,
Computation.await task_tl,
Computation.await task_tr)I run this with Picos' and Moonpool's schedulers. Here are the results for a tree of size 10000 with medium-sized tasks on a Intel Xeon E5-2630 (so, 12 cores * 2 hyperthreading). Time is normalized to a non-async baseline.
The two immediate conclusions are that
- Multithreading is crap on this workload (welp)
- Workstealing is great, congrats !
- The scaling for Moonpool's FIFO scheduler is quite poor, and much worse than the multififo scheduler from picos, which should behave similarly.
cc ocaml-multicore/picos#374 on the Picos bug tracker.
mndrix
Metadata
Metadata
Assignees
Labels
No labels