@maleadt (Member) commented Jan 2, 2026

No description provided.

@github-actions bot (Contributor) left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: 040bb83 | Previous: 5d9474a | Ratio |
|---|---|---|---|
| latency/precompile | 55311483455.5 ns | 55510377029.5 ns | 1.00 |
| latency/ttfp | 7787049630.5 ns | 7790703567 ns | 1.00 |
| latency/import | 4121978496 ns | 4122189304 ns | 1.00 |
| integration/volumerhs | 9619491.5 ns | 9624973 ns | 1.00 |
| integration/byval/slices=1 | 146989 ns | 147064 ns | 1.00 |
| integration/byval/slices=3 | 426286 ns | 425893 ns | 1.00 |
| integration/byval/reference | 145185 ns | 145082 ns | 1.00 |
| integration/byval/slices=2 | 286472 ns | 286384 ns | 1.00 |
| integration/cudadevrt | 103599 ns | 103602 ns | 1.00 |
| kernel/indexing | 14416 ns | 14225 ns | 1.01 |
| kernel/indexing_checked | 15088 ns | 14969 ns | 1.01 |
| kernel/occupancy | 695 ns | 732.5227272727273 ns | 0.95 |
| kernel/launch | 2184.4444444444443 ns | 2249.4444444444443 ns | 0.97 |
| kernel/rand | 17157 ns | 18642 ns | 0.92 |
| array/reverse/1d | 20141 ns | 19990 ns | 1.01 |
| array/reverse/2dL_inplace | 66786 ns | 66917 ns | 1.00 |
| array/reverse/1dL | 70266 ns | 70158 ns | 1.00 |
| array/reverse/2d | 22157 ns | 21954 ns | 1.01 |
| array/reverse/1d_inplace | 9691 ns | 9677 ns | 1.00 |
| array/reverse/2d_inplace | 13399 ns | 11077 ns | 1.21 |
| array/reverse/2dL | 74049 ns | 74051.5 ns | 1.00 |
| array/reverse/1dL_inplace | 66880 ns | 66880 ns | 1 |
| array/copy | 20646 ns | 20660 ns | 1.00 |
| array/iteration/findall/int | 159068.5 ns | 158373 ns | 1.00 |
| array/iteration/findall/bool | 140988 ns | 140139 ns | 1.01 |
| array/iteration/findfirst/int | 162401 ns | 161271 ns | 1.01 |
| array/iteration/findfirst/bool | 163334 ns | 162049 ns | 1.01 |
| array/iteration/scalar | 73196.5 ns | 72812.5 ns | 1.01 |
| array/iteration/logical | 217862.5 ns | 216894.5 ns | 1.00 |
| array/iteration/findmin/1d | 77358 ns | 50981 ns | 1.52 |
| array/iteration/findmin/2d | 102774.5 ns | 96704 ns | 1.06 |
| array/reductions/reduce/Int64/1d | 43761 ns | 43491 ns | 1.01 |
| array/reductions/reduce/Int64/dims=1 | 44568 ns | 52642.5 ns | 0.85 |
| array/reductions/reduce/Int64/dims=2 | 61547 ns | 61484 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1L | 89171 ns | 88879 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 88487 ns | 87977 ns | 1.01 |
| array/reductions/reduce/Float32/1d | 38763 ns | 37248.5 ns | 1.04 |
| array/reductions/reduce/Float32/dims=1 | 41850 ns | 43278 ns | 0.97 |
| array/reductions/reduce/Float32/dims=2 | 60367 ns | 60066 ns | 1.01 |
| array/reductions/reduce/Float32/dims=1L | 52822 ns | 52282 ns | 1.01 |
| array/reductions/reduce/Float32/dims=2L | 72620 ns | 72365.5 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 43468 ns | 43561 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1 | 47363 ns | 44306 ns | 1.07 |
| array/reductions/mapreduce/Int64/dims=2 | 61811 ns | 61482 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1L | 89185 ns | 89001 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 88305 ns | 88320 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 37708 ns | 38092.5 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=1 | 49064 ns | 41962 ns | 1.17 |
| array/reductions/mapreduce/Float32/dims=2 | 60059 ns | 60039 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 53137 ns | 52636 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=2L | 72562 ns | 72310 ns | 1.00 |
| array/broadcast | 20263 ns | 20127 ns | 1.01 |
| array/copyto!/gpu_to_gpu | 11223 ns | 12738 ns | 0.88 |
| array/copyto!/cpu_to_gpu | 218116 ns | 217857 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 286004 ns | 287088 ns | 1.00 |
| array/accumulate/Int64/1d | 125173.5 ns | 124778 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 84350.5 ns | 83708 ns | 1.01 |
| array/accumulate/Int64/dims=2 | 158993 ns | 158367 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1710838 ns | 1710164 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 967810 ns | 967254 ns | 1.00 |
| array/accumulate/Float32/1d | 109718.5 ns | 109314 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 80980 ns | 80184 ns | 1.01 |
| array/accumulate/Float32/dims=2 | 148109.5 ns | 147922 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1619471.5 ns | 1618786 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 699234 ns | 698724 ns | 1.00 |
| array/construct | 1279 ns | 1295.5 ns | 0.99 |
| array/random/randn/Float32 | 44345 ns | 47861 ns | 0.93 |
| array/random/randn!/Float32 | 25094 ns | 24875 ns | 1.01 |
| array/random/rand!/Int64 | 27280 ns | 27408 ns | 1.00 |
| array/random/rand!/Float32 | 8877.333333333334 ns | 8909.666666666666 ns | 1.00 |
| array/random/rand/Int64 | 30521 ns | 30055 ns | 1.02 |
| array/random/rand/Float32 | 13265 ns | 13184 ns | 1.01 |
| array/permutedims/4d | 55546.5 ns | 55109 ns | 1.01 |
| array/permutedims/2d | 54266 ns | 53832 ns | 1.01 |
| array/permutedims/3d | 55007 ns | 54841 ns | 1.00 |
| array/sorting/1d | 2758819.5 ns | 2757534 ns | 1.00 |
| array/sorting/by | 3345798 ns | 3344541 ns | 1.00 |
| array/sorting/2d | 1082192 ns | 1081521 ns | 1.00 |
| cuda/synchronization/stream/auto | 1066 ns | 1036.5 ns | 1.03 |
| cuda/synchronization/stream/nonblocking | 7184.5 ns | 7410.8 ns | 0.97 |
| cuda/synchronization/stream/blocking | 873.92 ns | 820.6336633663366 ns | 1.06 |
| cuda/synchronization/context/auto | 1182.4 ns | 1154.3 ns | 1.02 |
| cuda/synchronization/context/nonblocking | 7116.6 ns | 7124.4 ns | 1.00 |
| cuda/synchronization/context/blocking | 915.2162162162163 ns | 887.4107142857143 ns | 1.03 |

This comment was automatically generated by workflow using github-action-benchmark.
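
For readers skimming the benchmark table: the Ratio column appears to be Current divided by Previous (for example, 77358 / 50981 ≈ 1.52 for array/iteration/findmin/1d), so values above 1 mean the current commit ran slower. The following is a minimal Python sketch of that arithmetic on three rows copied from the table. The 1.2 cut-off and the helper name `flag_regressions` are illustrative assumptions, not settings taken from this workflow.

```python
# Rows copied from the benchmark table: name -> (current_ns, previous_ns).
ROWS = {
    "array/reverse/2d_inplace": (13399.0, 11077.0),
    "array/iteration/findmin/1d": (77358.0, 50981.0),
    "array/reductions/reduce/Int64/dims=1": (44568.0, 52642.5),
}


def ratio(current_ns: float, previous_ns: float) -> float:
    """Current / Previous, rounded to two decimals as in the table."""
    return round(current_ns / previous_ns, 2)


def flag_regressions(rows, threshold=1.2):
    """Return the names whose ratio exceeds `threshold`.

    The threshold is a hypothetical cut-off chosen for illustration only.
    """
    return [
        name
        for name, (current_ns, previous_ns) in rows.items()
        if ratio(current_ns, previous_ns) > threshold
    ]


if __name__ == "__main__":
    for name, (current_ns, previous_ns) in ROWS.items():
        print(f"{name}: {ratio(current_ns, previous_ns):.2f}")
    print("slower than threshold:", flag_regressions(ROWS))
```

With these three rows, the sketch flags the two entries whose ratio is above 1.2 and leaves the 0.85 entry (a speed-up) alone. Single-run differences of this size can also be measurement noise, so a flagged row is a prompt to re-run, not proof of a regression.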

@codecov bot commented Jan 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.32%. Comparing base (77191af) to head (040bb83).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #3008       +/-   ##
===========================================
+ Coverage   75.90%   89.32%   +13.42%     
===========================================
  Files         148      148               
  Lines       12902    12991       +89     
===========================================
+ Hits         9793    11604     +1811     
+ Misses       3109     1387     -1722     
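
The figures in the coverage diff are consistent with coverage being hits divided by lines, and misses being lines minus hits. A small Python sketch that reproduces the reported percentages from the counts in the diff; the function name `coverage_pct` is a hypothetical helper, not part of Codecov:

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals as in the report."""
    return round(100.0 * hits / lines, 2)


# Counts copied from the diff: (hits, lines) for base (master) and head (#3008).
BASE = (9793, 12902)
HEAD = (11604, 12991)

base_pct = coverage_pct(*BASE)
head_pct = coverage_pct(*HEAD)
delta = round(head_pct - base_pct, 2)

if __name__ == "__main__":
    print(f"base {base_pct:.2f}%  head {head_pct:.2f}%  delta {delta:+.2f}%")
    # Misses are the lines that were not hit.
    print("misses:", BASE[1] - BASE[0], "->", HEAD[1] - HEAD[0])
```

The diff shows only 89 added lines but 1811 additional hits, so most of the coverage change comes from lines that already existed on the base commit, not from the new lines.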

@maleadt maleadt merged commit 554031a into master Jan 3, 2026
2 checks passed
@maleadt maleadt deleted the tb/cuda_compiler_jll branch January 3, 2026 20:42