Adaptive min/max/prod with algo=fast and vectorized input most likely ends up in nested parallelism.
This is because those three functions redirect to their algo=exact versions, which are already parallelized.
This must be avoided.
Possible solutions could be:
- redirect early, before the parallel loop over vectorized input in frollR.c - keeps inner parallelism
- disable nested parallelism using a flag (as done for roll median) passed to frolladaptive.c - keeps outer parallelism
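
To make the two options concrete, here is a minimal sketch, not the actual code from frollR.c or frolladaptive.c. All names (`adaptive_max_exact`, `roll_columns_inner`, `roll_columns_outer`, the `parallel` flag) are hypothetical, and the input is assumed to contain no NaN. It only illustrates where the OpenMP region sits in each variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <math.h>

// Hypothetical stand-in for an exact adaptive rolling max; window size k[i] varies per row.
// 'parallel' is an assumed flag: when false, the if() clause keeps the loop on one thread.
void adaptive_max_exact(const double *x, long n, const int *k, double *ans, bool parallel) {
  #pragma omp parallel for if(parallel)
  for (long i = 0; i < n; i++) {
    assert(k[i] > 0);
    if (i + 1 < k[i]) {        // incomplete window
      ans[i] = NAN;
      continue;
    }
    double w = x[i];
    for (long j = i - k[i] + 1; j < i; j++) {
      if (x[j] > w) w = x[j];
    }
    ans[i] = w;
  }
}

// Option 1: redirect early. The loop over columns stays sequential,
// so only the inner (per-row) parallelism is used.
void roll_columns_inner(const double **x, int ncol, long n, const int *k, double **ans) {
  for (int c = 0; c < ncol; c++) {
    adaptive_max_exact(x[c], n, k, ans[c], true);
  }
}

// Option 2: keep the parallel loop over columns and pass a flag
// that disables the inner parallel region.
void roll_columns_outer(const double **x, int ncol, long n, const int *k, double **ans) {
  #pragma omp parallel for
  for (int c = 0; c < ncol; c++) {
    adaptive_max_exact(x[c], n, k, ans[c], false);
  }
}
```

Both variants should give identical results; which one is faster presumably depends on the number of columns relative to the number of rows, which this sketch does not attempt to measure.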