[WIP] Add IntArray #10
Open
This PR adds an IntArray module, which supports parallel initialisation for integer arrays, as mentioned in #6. It uses @stedolan's prototype, which uses strings as the base for IntArray.
Benchmarks
I have rewritten the parallel benchmarks quicksort_multicore.ml and mergesort_multicore.ml from sandmark to use the Domainslib.IntArray proposed in this PR, along with adding parallel initialisation to them. The 4.10.0+multicore compiler variant was used to build and run the benchmarks.
Quicksort
n = 10_000_000
Observations: The single-core version of IntArray is faster than the Array version, and there is speedup up to 8 cores, after which the IntArray version slows down.
When the number of cores is greater than or equal to 8, there's unusually high GC activity, which I suspect explains the slowdown. More specifically, the overheads seem to be at caml_stw_empty_minor_heap and stw_handler. @stedolan mentioned that strings are not scanned by the GC; I'm not sure if that's in some way related to the increased GC activity.
This is a part of the eventlog for execution on 24 cores. Most other processes look similar to the ones seen here.
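One cheap cross-check on the eventlog observation is to count minor collections around the sort itself, using the standard library's Gc module. The sketch below is only an illustration: with_minor_gc_count is a hypothetical helper name, the workload is a plain Array sort rather than the benchmark code, and it is not confirmed here how the multicore runtime aggregates these counters across domains, so the number should be read as a rough indicator.

```ocaml
(* Sketch: count minor collections that happen while [f] runs.
   [with_minor_gc_count] is a hypothetical helper, not part of this PR. *)
let with_minor_gc_count f =
  let before = (Gc.quick_stat ()).Gc.minor_collections in
  let result = f () in
  let after = (Gc.quick_stat ()).Gc.minor_collections in
  (result, after - before)

let () =
  (* Stand-in workload: sort a descending array of boxed-free ints. *)
  let sorted, minors =
    with_minor_gc_count (fun () ->
        let a = Array.init 100_000 (fun i -> 100_000 - i) in
        Array.sort compare a;
        a)
  in
  Printf.printf "first = %d, minor collections = %d\n" sorted.(0) minors
```

Wrapping the Array and IntArray variants of the same benchmark in such a helper would give two comparable counts per core configuration.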
Mergesort
n = 1_000_000
Observations: Contrary to the quicksort benchmark, there is no surge in GC activity in mergesort, which makes me all the more uncertain about its cause in the quicksort benchmark. The speedup of the IntArray version, taken on its own, is quite close to what would be expected. But there is a huge slowdown on 1 core compared to the Array version. The overheads lie at set and get of IntArray (the others are present in the Array version as well). This is a part of perf report on 1 core.
Would appreciate any insights on these benchmarks.
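For readers unfamiliar with the string-backed approach, here is a minimal sketch of what get and set could look like when integers live in a Bytes.t. It is an assumption-laden illustration, not the code in this PR or @stedolan's prototype: the module name Int_array_sketch, the function names, and the eight-bytes-per-element layout (64-bit platform) are all made up for the example. It shows that every access goes through a bounds check and an Int64 conversion, which is one place to look when get and set dominate a profile.

```ocaml
(* Hypothetical sketch, not this PR's implementation: an int array stored
   in a Bytes.t, eight bytes per element (assumes a 64-bit platform). *)
module Int_array_sketch : sig
  type t
  val create_uninitialised : int -> t
  val length : t -> int
  val get : t -> int -> int
  val set : t -> int -> int -> unit
end = struct
  type t = Bytes.t

  let bytes_per_int = 8

  (* Bytes.create leaves its contents uninitialised, so the elements can be
     filled in later, for instance by several workers on disjoint ranges. *)
  let create_uninitialised n = Bytes.create (n * bytes_per_int)

  let length a = Bytes.length a / bytes_per_int

  (* Both accessors are bounds-checked by Bytes and convert through Int64. *)
  let get a i = Int64.to_int (Bytes.get_int64_ne a (i * bytes_per_int))

  let set a i v = Bytes.set_int64_ne a (i * bytes_per_int) (Int64.of_int v)
end

let () =
  let a = Int_array_sketch.create_uninitialised 4 in
  for i = 0 to Int_array_sketch.length a - 1 do
    Int_array_sketch.set a i (i * i)
  done;
  assert (Int_array_sketch.get a 3 = 9)
```

Whether the real module pays for the same checks and conversions depends on how it is written and compiled, so this is a starting point for investigation rather than a diagnosis.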
To-Do
The module needs more Array functions such as fill, make, map, iter, etc., and possibly some performance tuning. I shall keep adding them to this PR.
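As a rough idea of how the functions listed above could be layered on top of get and set, here is a sequential sketch. Everything in it is an assumption for illustration: the type name int_array, the Bytes-backed layout with eight bytes per element, and the signatures (modelled on the standard Array module) are not taken from this PR.

```ocaml
(* Hypothetical sketch; names and signatures are assumptions, not this
   PR's API. Elements are stored eight bytes each in a Bytes.t. *)
type int_array = Bytes.t

let create n : int_array = Bytes.create (n * 8)
let length (a : int_array) = Bytes.length a / 8
let get (a : int_array) i = Int64.to_int (Bytes.get_int64_ne a (i * 8))
let set (a : int_array) i v = Bytes.set_int64_ne a (i * 8) (Int64.of_int v)

(* fill a ofs len v stores v in elements ofs .. ofs + len - 1. *)
let fill a ofs len v =
  if ofs < 0 || len < 0 || ofs > length a - len then invalid_arg "fill";
  for i = ofs to ofs + len - 1 do
    set a i v
  done

(* make n v allocates n elements and initialises all of them to v. *)
let make n v =
  let a = create n in
  fill a 0 n v;
  a

(* iter f a applies f to each element in index order. *)
let iter f a =
  for i = 0 to length a - 1 do
    f (get a i)
  done

(* map f a builds a fresh array holding f applied to each element of a. *)
let map f a =
  let n = length a in
  let b = create n in
  for i = 0 to n - 1 do
    set b i (f (get a i))
  done;
  b
```

The loops in fill and map touch disjoint indices on each iteration, so they are the natural candidates for a parallel variant later on.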