Open
maleadt (Member) approved these changes on Apr 1, 2026 and left a comment:
Doesn't it need an additional empty line too?
Contributor
CUDA.jl Benchmarks
| Benchmark suite | Current: 3151c6c | Previous: 0b05451 | Ratio |
|---|---|---|---|
| `latency/precompile` | 44931145637 ns | 4612556023 ns | 9.74 |
| `latency/ttfp` | 12943499837 ns | 4408562143 ns | 2.94 |
| `latency/import` | 3563511009 ns | 3821839641 ns | 0.93 |
| `integration/volumerhs` | 9440699.5 ns | 9448669.5 ns | 1.00 |
| `integration/byval/slices=1` | 146095 ns | 145839 ns | 1.00 |
| `integration/byval/slices=3` | 423324 ns | 423425 ns | 1.00 |
| `integration/byval/reference` | 144115 ns | 144031 ns | 1.00 |
| `integration/byval/slices=2` | 285137 ns | 284559 ns | 1.00 |
| `integration/cudadevrt` | 102657.5 ns | 102661 ns | 1.00 |
| `kernel/indexing` | 13221 ns | 13693 ns | 0.97 |
| `kernel/indexing_checked` | 13979 ns | 14309 ns | 0.98 |
| `kernel/occupancy` | 668.2151898734177 ns | 663.8086419753087 ns | 1.01 |
| `kernel/launch` | 2156.3333333333335 ns | 2210.6666666666665 ns | 0.98 |
| `kernel/rand` | 14582 ns | 16005 ns | 0.91 |
| `array/reverse/1d` | 18785 ns | 18568 ns | 1.01 |
| `array/reverse/2dL_inplace` | 66191 ns | 66179 ns | 1.00 |
| `array/reverse/1dL` | 69282 ns | 69124 ns | 1.00 |
| `array/reverse/2d` | 21586 ns | 21081 ns | 1.02 |
| `array/reverse/1d_inplace` | 10600.666666666666 ns | 8671.333333333334 ns | 1.22 |
| `array/reverse/2d_inplace` | 10583 ns | 10405 ns | 1.02 |
| `array/reverse/2dL` | 73490 ns | 73273 ns | 1.00 |
| `array/reverse/1dL_inplace` | 66213 ns | 66158 ns | 1.00 |
| `array/copy` | 18706 ns | 18801 ns | 0.99 |
| `array/iteration/findall/int` | 149764 ns | 149677 ns | 1.00 |
| `array/iteration/findall/bool` | 132247 ns | 132815 ns | 1.00 |
| `array/iteration/findfirst/int` | 84317 ns | 84077 ns | 1.00 |
| `array/iteration/findfirst/bool` | 82632 ns | 81834 ns | 1.01 |
| `array/iteration/scalar` | 67579 ns | 67842 ns | 1.00 |
| `array/iteration/logical` | 201911 ns | 205668 ns | 0.98 |
| `array/iteration/findmin/1d` | 89632 ns | 88143.5 ns | 1.02 |
| `array/iteration/findmin/2d` | 117317 ns | 117602 ns | 1.00 |
| `array/reductions/reduce/Int64/1d` | 43074 ns | 43879 ns | 0.98 |
| `array/reductions/reduce/Int64/dims=1` | 42319 ns | 42363.5 ns | 1.00 |
| `array/reductions/reduce/Int64/dims=2` | 59769 ns | 59729 ns | 1.00 |
| `array/reductions/reduce/Int64/dims=1L` | 87694 ns | 87783 ns | 1.00 |
| `array/reductions/reduce/Int64/dims=2L` | 84746 ns | 84600.5 ns | 1.00 |
| `array/reductions/reduce/Float32/1d` | 34775.5 ns | 35554 ns | 0.98 |
| `array/reductions/reduce/Float32/dims=1` | 39790 ns | 40471.5 ns | 0.98 |
| `array/reductions/reduce/Float32/dims=2` | 57018 ns | 57348 ns | 0.99 |
| `array/reductions/reduce/Float32/dims=1L` | 51911 ns | 52155 ns | 1.00 |
| `array/reductions/reduce/Float32/dims=2L` | 69921 ns | 70051 ns | 1.00 |
| `array/reductions/mapreduce/Int64/1d` | 42828 ns | 43658 ns | 0.98 |
| `array/reductions/mapreduce/Int64/dims=1` | 42715 ns | 42850 ns | 1.00 |
| `array/reductions/mapreduce/Int64/dims=2` | 59449 ns | 60283 ns | 0.99 |
| `array/reductions/mapreduce/Int64/dims=1L` | 87721 ns | 87856 ns | 1.00 |
| `array/reductions/mapreduce/Int64/dims=2L` | 84538 ns | 84886 ns | 1.00 |
| `array/reductions/mapreduce/Float32/1d` | 34528 ns | 35439 ns | 0.97 |
| `array/reductions/mapreduce/Float32/dims=1` | 40228 ns | 49602 ns | 0.81 |
| `array/reductions/mapreduce/Float32/dims=2` | 56891 ns | 57112 ns | 1.00 |
| `array/reductions/mapreduce/Float32/dims=1L` | 51793 ns | 52066 ns | 0.99 |
| `array/reductions/mapreduce/Float32/dims=2L` | 69287 ns | 70220.5 ns | 0.99 |
| `array/broadcast` | 20728 ns | 20835 ns | 0.99 |
| `array/copyto!/gpu_to_gpu` | 11282 ns | 11409 ns | 0.99 |
| `array/copyto!/cpu_to_gpu` | 215496.5 ns | 217482 ns | 0.99 |
| `array/copyto!/gpu_to_cpu` | 283460.5 ns | 284547 ns | 1.00 |
| `array/accumulate/Int64/1d` | 118849 ns | 119536 ns | 0.99 |
| `array/accumulate/Int64/dims=1` | 79803 ns | 80581.5 ns | 0.99 |
| `array/accumulate/Int64/dims=2` | 155916 ns | 156671 ns | 1.00 |
| `array/accumulate/Int64/dims=1L` | 1694625.5 ns | 1706437 ns | 0.99 |
| `array/accumulate/Int64/dims=2L` | 961590 ns | 962104 ns | 1.00 |
| `array/accumulate/Float32/1d` | 101671 ns | 102057 ns | 1.00 |
| `array/accumulate/Float32/dims=1` | 76866 ns | 77830 ns | 0.99 |
| `array/accumulate/Float32/dims=2` | 143826 ns | 144868 ns | 0.99 |
| `array/accumulate/Float32/dims=1L` | 1585623 ns | 1586085 ns | 1.00 |
| `array/accumulate/Float32/dims=2L` | 657591 ns | 661758 ns | 0.99 |
| `array/construct` | 1291.55 ns | 1349.1 ns | 0.96 |
| `array/random/randn/Float32` | 38465 ns | 42622 ns | 0.90 |
| `array/random/randn!/Float32` | 31529 ns | 31841 ns | 0.99 |
| `array/random/rand!/Int64` | 34185 ns | 34412 ns | 0.99 |
| `array/random/rand!/Float32` | 8498.666666666666 ns | 8601 ns | 0.99 |
| `array/random/rand/Int64` | 37008.5 ns | 37538 ns | 0.99 |
| `array/random/rand/Float32` | 13002.5 ns | 13018 ns | 1.00 |
| `array/permutedims/4d` | 52636.5 ns | 52619.5 ns | 1.00 |
| `array/permutedims/2d` | 52787 ns | 52940 ns | 1.00 |
| `array/permutedims/3d` | 52988 ns | 53310 ns | 0.99 |
| `array/sorting/1d` | 2735841 ns | 2735465.5 ns | 1.00 |
| `array/sorting/by` | 3304400 ns | 3315555 ns | 1.00 |
| `array/sorting/2d` | 1069201 ns | 1072718 ns | 1.00 |
| `cuda/synchronization/stream/auto` | 1036.6470588235295 ns | 1042.1 ns | 0.99 |
| `cuda/synchronization/stream/nonblocking` | 7682 ns | 7867.299999999999 ns | 0.98 |
| `cuda/synchronization/stream/blocking` | 836.8019801980198 ns | 828.4299065420561 ns | 1.01 |
| `cuda/synchronization/context/auto` | 1203 ns | 1198.8 ns | 1.00 |
| `cuda/synchronization/context/nonblocking` | 7482.1 ns | 7295 ns | 1.03 |
| `cuda/synchronization/context/blocking` | 954.5142857142857 ns | 929.6410256410256 ns | 1.03 |
This comment was automatically generated by workflow using github-action-benchmark.
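For readers skimming the numbers: the Ratio column appears to be Current divided by Previous, rounded to two decimals, so values above 1.00 mean the current commit is slower. The sketch below recomputes a few rows from the table and flags those above a cutoff. The helper names and the 1.2 cutoff are hypothetical choices for illustration, not settings taken from this repository's workflow.

```python
# Minimal sketch: recompute the Ratio column for a few rows of the table.
# Assumption: ratio = current / previous, rounded to two decimals.

def ratio(current_ns: float, previous_ns: float) -> float:
    """Return current/previous rounded to two decimals (above 1.0 means slower)."""
    return round(current_ns / previous_ns, 2)

# (current, previous) timings in nanoseconds, copied from the table.
samples = {
    "latency/precompile": (44931145637, 4612556023),
    "latency/ttfp": (12943499837, 4408562143),
    "latency/import": (3563511009, 3821839641),
    "array/reverse/1d_inplace": (10600.666666666666, 8671.333333333334),
}

# Hypothetical cutoff, chosen only for this example.
THRESHOLD = 1.2

def flag_regressions(rows, threshold=THRESHOLD):
    """Return {benchmark name: ratio} for rows whose ratio exceeds the cutoff."""
    flagged = {}
    for name, (current_ns, previous_ns) in rows.items():
        r = ratio(current_ns, previous_ns)
        if r > threshold:
            flagged[name] = r
    return flagged

if __name__ == "__main__":
    for name, r in flag_regressions(samples).items():
        print(f"{name}: {r:.2f}x")
```

With this cutoff, `latency/precompile`, `latency/ttfp`, and `array/reverse/1d_inplace` are flagged, while `latency/import` (0.93) is not.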
No description provided.