CUDA.jl Benchmarks
| Benchmark suite | Current: 35efee6 | Previous: a9a687c | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 101257 ns | 102250.5 ns | 0.99 |
| array/accumulate/Float32/dims=1 | 76950 ns | 77483 ns | 0.99 |
| array/accumulate/Float32/dims=1L | 1583804 ns | 1593535 ns | 0.99 |
| array/accumulate/Float32/dims=2 | 143805.5 ns | 144723 ns | 0.99 |
| array/accumulate/Float32/dims=2L | 657210 ns | 661407 ns | 0.99 |
| array/accumulate/Int64/1d | 118672 ns | 120091 ns | 0.99 |
| array/accumulate/Int64/dims=1 | 80367 ns | 81265.5 ns | 0.99 |
| array/accumulate/Int64/dims=1L | 1694731 ns | 1707476 ns | 0.99 |
| array/accumulate/Int64/dims=2 | 155985 ns | 158147 ns | 0.99 |
| array/accumulate/Int64/dims=2L | 961924 ns | 962833.5 ns | 1.00 |
| array/broadcast | 20770 ns | 20861 ns | 1.00 |
| array/construct | 1388.2 ns | 1305.3 ns | 1.06 |
| array/copy | 18725 ns | 18977 ns | 0.99 |
| array/copyto!/cpu_to_gpu | 218074 ns | 218984.5 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 283862 ns | 288110 ns | 0.99 |
| array/copyto!/gpu_to_gpu | 11475 ns | 11568 ns | 0.99 |
| array/iteration/findall/bool | 132338.5 ns | 133871.5 ns | 0.99 |
| array/iteration/findall/int | 149890 ns | 150576 ns | 1.00 |
| array/iteration/findfirst/bool | 81470.5 ns | 82372 ns | 0.99 |
| array/iteration/findfirst/int | 83673 ns | 84621.5 ns | 0.99 |
| array/iteration/findmin/1d | 89238 ns | 85921.5 ns | 1.04 |
| array/iteration/findmin/2d | 117077 ns | 117665 ns | 1.00 |
| array/iteration/logical | 198877 ns | 204118.5 ns | 0.97 |
| array/iteration/scalar | 68356 ns | 69670.5 ns | 0.98 |
| array/permutedims/2d | 52308 ns | 53206.5 ns | 0.98 |
| array/permutedims/3d | 52757.5 ns | 53542.5 ns | 0.99 |
| array/permutedims/4d | 52335 ns | 52933.5 ns | 0.99 |
| array/random/rand/Float32 | 13095 ns | 13218 ns | 0.99 |
| array/random/rand/Int64 | 37392 ns | 34787 ns | 1.07 |
| array/random/rand!/Float32 | 8582.333333333334 ns | 8525.333333333334 ns | 1.01 |
| array/random/rand!/Int64 | 34425 ns | 34352 ns | 1.00 |
| array/random/randn/Float32 | 43847 ns | 38965 ns | 1.13 |
| array/random/randn!/Float32 | 31665 ns | 31554 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 35002 ns | 35525 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=1 | 40238 ns | 40453.5 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=1L | 51798 ns | 52015 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 56726 ns | 56943 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 69040 ns | 69861 ns | 0.99 |
| array/reductions/mapreduce/Int64/1d | 43445 ns | 43677 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1 | 44863 ns | 42176 ns | 1.06 |
| array/reductions/mapreduce/Int64/dims=1L | 87853 ns | 87856 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 59601 ns | 59673 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 85042.5 ns | 84933 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 34990 ns | 35822.5 ns | 0.98 |
| array/reductions/reduce/Float32/dims=1 | 44183 ns | 42636 ns | 1.04 |
| array/reductions/reduce/Float32/dims=1L | 51921.5 ns | 51953 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 56857 ns | 57183 ns | 0.99 |
| array/reductions/reduce/Float32/dims=2L | 69514 ns | 69983 ns | 0.99 |
| array/reductions/reduce/Int64/1d | 43111 ns | 43946 ns | 0.98 |
| array/reductions/reduce/Int64/dims=1 | 44815.5 ns | 52864 ns | 0.85 |
| array/reductions/reduce/Int64/dims=1L | 87592 ns | 88045 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2 | 59547 ns | 59735 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84652 ns | 84891 ns | 1.00 |
| array/reverse/1d | 18693 ns | 18427 ns | 1.01 |
| array/reverse/1dL | 69235 ns | 68976.5 ns | 1.00 |
| array/reverse/1dL_inplace | 66105 ns | 66026 ns | 1.00 |
| array/reverse/1d_inplace | 8618 ns | 10297.333333333334 ns | 0.84 |
| array/reverse/2d | 20891 ns | 20813 ns | 1.00 |
| array/reverse/2dL | 72885 ns | 72948 ns | 1.00 |
| array/reverse/2dL_inplace | 66111 ns | 66137 ns | 1.00 |
| array/reverse/2d_inplace | 10379 ns | 11426 ns | 0.91 |
| array/sorting/1d | 2735855 ns | 2736028 ns | 1.00 |
| array/sorting/2d | 1068748 ns | 1076314.5 ns | 0.99 |
| array/sorting/by | 3304911.5 ns | 3305633 ns | 1.00 |
| cuda/synchronization/context/auto | 1197.6 ns | 1177.1 ns | 1.02 |
| cuda/synchronization/context/blocking | 937.9428571428572 ns | 930.1428571428571 ns | 1.01 |
| cuda/synchronization/context/nonblocking | 8403.2 ns | 7222.2 ns | 1.16 |
| cuda/synchronization/stream/auto | 1037.9 ns | 1023.8235294117648 ns | 1.01 |
| cuda/synchronization/stream/blocking | 788.2277227722773 ns | 829.5588235294117 ns | 0.95 |
| cuda/synchronization/stream/nonblocking | 7547.1 ns | 7804.5 ns | 0.97 |
| integration/byval/reference | 144068 ns | 144003 ns | 1.00 |
| integration/byval/slices=1 | 145857 ns | 145940 ns | 1.00 |
| integration/byval/slices=2 | 284769 ns | 284688 ns | 1.00 |
| integration/byval/slices=3 | 423136.5 ns | 423167.5 ns | 1.00 |
| integration/cudadevrt | 102708 ns | 102604 ns | 1.00 |
| integration/volumerhs | 9430527 ns | 9452453.5 ns | 1.00 |
| kernel/indexing | 13544.5 ns | 13481 ns | 1.00 |
| kernel/indexing_checked | 14227 ns | 14205 ns | 1.00 |
| kernel/launch | 2210.5555555555557 ns | 2129 ns | 1.04 |
| kernel/occupancy | 663.6863354037267 ns | 660.2115384615385 ns | 1.01 |
| kernel/rand | 14679 ns | 18430 ns | 0.80 |
| latency/import | 3805555922 ns | 3804590639.5 ns | 1.00 |
| latency/precompile | 4587970948 ns | 4586818051.5 ns | 1.00 |
| latency/ttfp | 4385584841.5 ns | 4387745362.5 ns | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
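
For anyone reading the Ratio column: the reported values are consistent with current time divided by previous time, rounded to two decimals, so values above 1.00 mean the current commit is slower. The sketch below recomputes the ratio for a few rows copied from the table and flags the ones above a threshold. The helper names and the 1.10 threshold are assumptions made for illustration only, not settings taken from the benchmark workflow.

```python
# Illustrative sketch: recompute the Ratio column for a few rows of the table.
# ASSUMPTION: ratio = current / previous, rounded to two decimals (this matches
# the reported values). The 1.10 threshold is hypothetical, not the workflow's.

REGRESSION_THRESHOLD = 1.10  # hypothetical cut-off for "noticeably slower"


def ratio(current_ns: float, previous_ns: float) -> float:
    """Return current/previous rounded to two decimals; > 1.0 means slower."""
    return round(current_ns / previous_ns, 2)


def flag_regressions(rows: dict, threshold: float = REGRESSION_THRESHOLD) -> list:
    """Return the benchmark names whose ratio exceeds the threshold."""
    return [name for name, (cur, prev) in rows.items() if ratio(cur, prev) > threshold]


# (current ns, previous ns) pairs copied from the table above.
rows = {
    "array/construct": (1388.2, 1305.3),
    "array/reductions/reduce/Int64/dims=1": (44815.5, 52864.0),
    "cuda/synchronization/context/nonblocking": (8403.2, 7222.2),
    "kernel/rand": (14679.0, 18430.0),
}

for name, (cur, prev) in rows.items():
    print(f"{name}: {ratio(cur, prev):.2f}")

print("flagged:", flag_regressions(rows))
```

Timings in the microsecond range tend to be noisy, so a single ratio above any fixed threshold is a prompt to rerun the benchmark rather than proof of a regression.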
Needs JuliaGPU/GPUArrays.jl#700 first
cc @gdalle