Test GPUArrays reverse #648
base: main
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:

diff --git a/lib/mtl/capture.jl b/lib/mtl/capture.jl
index c2c1a77a..c101c5b7 100644
--- a/lib/mtl/capture.jl
+++ b/lib/mtl/capture.jl
@@ -59,7 +59,8 @@ function MTLCaptureDescriptor()
 end
 
 # TODO: Add capture state
-function MTLCaptureDescriptor(obj::Union{MTLDevice,MTLCommandQueue,MTLCaptureScope},
+function MTLCaptureDescriptor(
+        obj::Union{MTLDevice, MTLCommandQueue, MTLCaptureScope},
                               destination::MTLCaptureDestination;
                               folder::String=nothing)
     desc = MTLCaptureDescriptor()
@@ -110,7 +111,8 @@ end
 
 Start GPU frame capture using the default capture object and specifying capture descriptor parameters directly.
 """
-function startCapture(obj::Union{MTLDevice,MTLCommandQueue,MTLCaptureScope},
+function startCapture(
+        obj::Union{MTLDevice, MTLCommandQueue, MTLCaptureScope},
                       destination::MTLCaptureDestination=MTLCaptureDestinationGPUTraceDocument;
                       folder::String=nothing)
     if destination == MTLCaptureDestinationGPUTraceDocument && folder === nothing
diff --git a/perf/array.jl b/perf/array.jl
index 008ab4d6..b86a675e 100644
--- a/perf/array.jl
+++ b/perf/array.jl
@@ -63,12 +63,12 @@ gpu_vec_ints = reshape(gpu_mat_ints, length(gpu_mat_ints))
 let group = addgroup!(group, "reverse")
     group["1d"] = @benchmarkable Metal.@sync reverse($gpu_vec)
     group["1dL"] = @benchmarkable Metal.@sync reverse($gpu_vec_long)
-    group["2d"] = @benchmarkable Metal.@sync reverse($gpu_mat; dims=1)
-    group["2dL"] = @benchmarkable Metal.@sync reverse($gpu_mat_long; dims=1)
+    group["2d"] = @benchmarkable Metal.@sync reverse($gpu_mat; dims = 1)
+    group["2dL"] = @benchmarkable Metal.@sync reverse($gpu_mat_long; dims = 1)
     group["1d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_vec)
     group["1dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_vec_long)
-    group["2d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat; dims=1)
-    group["2dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat_long; dims=2)
+    group["2d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat; dims = 1)
+    group["2dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat_long; dims = 2)
 end
 
 # 'evals=1' added to prevent hang when running benchmarks of CI
diff --git a/perf/runbenchmarks.jl b/perf/runbenchmarks.jl
index 17bf4ea0..98aa3153 100644
--- a/perf/runbenchmarks.jl
+++ b/perf/runbenchmarks.jl
@@ -1,7 +1,7 @@
 # benchmark suite execution and codespeed submission
 
 using Pkg
-Pkg.add(url="https://github.com/christiangnrd/GPUArrays.jl", rev="reverse")
+Pkg.add(url = "https://github.com/christiangnrd/GPUArrays.jl", rev = "reverse")
 
 using Metal
 
diff --git a/test/runtests.jl b/test/runtests.jl
index 081fc280..42f00908 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -1,5 +1,5 @@
 using Pkg
-Pkg.add(url="https://github.com/christiangnrd/GPUArrays.jl", rev="reverse")
+Pkg.add(url = "https://github.com/christiangnrd/GPUArrays.jl", rev = "reverse")
 
 using Distributed
 using Dates
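
For readers unfamiliar with the operations benchmarked in `perf/array.jl`, here is a minimal CPU-only sketch of what `reverse` and `reverse!` compute for the `dims` values used there. It runs on plain Base `Array`s rather than Metal arrays, so it illustrates the semantics only, not the GPU code path under test. The variable names are invented for the example.

```julia
# CPU-only illustration using Base arrays; Metal and GPUArrays are not needed.
# All variable names here are hypothetical and do not come from the PR.
vec = [1, 2, 3, 4]
mat = [1 2 3; 4 5 6]                # 2×3 matrix

rev_vec  = reverse(vec)             # [4, 3, 2, 1]
rev_rows = reverse(mat; dims = 1)   # flips the row order:    [4 5 6; 1 2 3]
rev_cols = reverse(mat; dims = 2)   # flips the column order: [3 2 1; 6 5 4]

# The in-place variants mutate their argument instead of allocating a copy.
inplace_vec = copy(vec)
reverse!(inplace_vec)

inplace_mat = copy(mat)
reverse!(inplace_mat; dims = 2)     # multidimensional reverse! needs Julia 1.6+
```

The benchmarks wrap the same calls in `Metal.@sync` so that the timing includes completion of the GPU work.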
4c15cc1 to 108f6d1
Metal Benchmarks
| Benchmark suite | Current: e3ce3ae | Previous: 18d5d95 | Ratio | 
|---|---|---|---|
| latency/precompile | 10708835208.5ns | 10738355812.5ns | 1.00 | 
| latency/ttfp | 5088232041.5ns | 5093095000ns | 1.00 | 
| latency/import | 1309493875ns | 1307420042ns | 1.00 | 
| integration/metaldevrt | 961416ns | 916521ns | 1.05 | 
| integration/byval/slices=1 | 1650541ns | 1655958ns | 1.00 | 
| integration/byval/slices=3 | 9014521ns | 8745792ns | 1.03 | 
| integration/byval/reference | 1635625ns | 1624791ns | 1.01 | 
| integration/byval/slices=2 | 2705666ns | 2721500ns | 0.99 | 
| kernel/indexing | 697500ns | 696291ns | 1.00 | 
| kernel/indexing_checked | 701625ns | 696584ns | 1.01 | 
| kernel/launch | 14208ns | 12416ns | 1.14 | 
| array/reverse/1d | 671833.5ns | | |
| array/reverse/2dL_inplace | 3033708.5ns | | |
| array/reverse/1dL | 2367708ns | | |
| array/reverse/2d | 1462187.5ns | | |
| array/reverse/1d_inplace | 728917ns | | |
| array/reverse/2d_inplace | 931084ns | | |
| array/reverse/2dL | 6684125ns | | |
| array/reverse/1dL_inplace | 1072542ns | | |
| array/construct | 6167ns | 5792ns | 1.06 | 
| array/broadcast | 681958ns | 665584ns | 1.02 | 
| array/accumulate/Int64/1d | 1379520.5ns | 1360750ns | 1.01 | 
| array/accumulate/Int64/dims=1 | 1926125ns | 1916333ns | 1.01 | 
| array/accumulate/Int64/dims=2 | 2291958ns | 2278146ns | 1.01 | 
| array/accumulate/Int64/dims=1L | 11902291ns | 12001125ns | 0.99 | 
| array/accumulate/Int64/dims=2L | 9814458ns | 9901666ns | 0.99 | 
| array/accumulate/Float32/1d | 1268500ns | 1245417ns | 1.02 | 
| array/accumulate/Float32/dims=1 | 1676771ns | 1669792ns | 1.00 | 
| array/accumulate/Float32/dims=2 | 2014833.5ns | 2007458ns | 1.00 | 
| array/accumulate/Float32/dims=1L | 10007083ns | 9976625ns | 1.00 | 
| array/accumulate/Float32/dims=2L | 7379750.5ns | 7388583.5ns | 1.00 | 
| array/random/randn/Float32 | 816333ns | 864875ns | 0.94 | 
| array/random/randn!/Float32 | 652459ns | 625250ns | 1.04 | 
| array/random/rand!/Int64 | 578437.5ns | 565916.5ns | 1.02 | 
| array/random/rand!/Float32 | 607375ns | 583083ns | 1.04 | 
| array/random/rand/Int64 | 780084ns | 729917ns | 1.07 | 
| array/random/rand/Float32 | 613833ns | 597917ns | 1.03 | 
| array/reductions/reduce/Int64/1d | 1383125ns | 1339354ns | 1.03 | 
| array/reductions/reduce/Int64/dims=1 | 1167479.5ns | 1166145.5ns | 1.00 | 
| array/reductions/reduce/Int64/dims=2 | 1331229.5ns | 1307792ns | 1.02 | 
| array/reductions/reduce/Int64/dims=1L | 2071458ns | 2095291ns | 0.99 | 
| array/reductions/reduce/Int64/dims=2L | 3649895.5ns | 3597937.5ns | 1.01 | 
| array/reductions/reduce/Float32/1d | 1093458.5ns | 986792ns | 1.11 | 
| array/reductions/reduce/Float32/dims=1 | 907417ns | 909041.5ns | 1.00 | 
| array/reductions/reduce/Float32/dims=2 | 788458ns | 770916ns | 1.02 | 
| array/reductions/reduce/Float32/dims=1L | 1415375ns | 1410916.5ns | 1.00 | 
| array/reductions/reduce/Float32/dims=2L | 1933916ns | 1933084ns | 1.00 | 
| array/reductions/mapreduce/Int64/1d | 1433562.5ns | 1350250ns | 1.06 | 
| array/reductions/mapreduce/Int64/dims=1 | 1173875ns | 1211709ns | 0.97 | 
| array/reductions/mapreduce/Int64/dims=2 | 1341708ns | 1314292ns | 1.02 | 
| array/reductions/mapreduce/Int64/dims=1L | 2114542ns | 2102020.5ns | 1.01 | 
| array/reductions/mapreduce/Int64/dims=2L | 3624312.5ns | 3605750ns | 1.01 | 
| array/reductions/mapreduce/Float32/1d | 1063146ns | 1059291ns | 1.00 | 
| array/reductions/mapreduce/Float32/dims=1 | 907792ns | 901500ns | 1.01 | 
| array/reductions/mapreduce/Float32/dims=2 | 787604.5ns | 777833ns | 1.01 | 
| array/reductions/mapreduce/Float32/dims=1L | 1410687.5ns | 1401750ns | 1.01 | 
| array/reductions/mapreduce/Float32/dims=2L | 1955167ns | 1936959ns | 1.01 | 
| array/private/copyto!/gpu_to_gpu | 681625ns | 660167ns | 1.03 | 
| array/private/copyto!/cpu_to_gpu | 827916ns | 805666ns | 1.03 | 
| array/private/copyto!/gpu_to_cpu | 833833ns | 827000ns | 1.01 | 
| array/private/iteration/findall/int | 1699916ns | 1676666ns | 1.01 | 
| array/private/iteration/findall/bool | 1494563ns | 1481375ns | 1.01 | 
| array/private/iteration/findfirst/int | 2047687.5ns | 2050625ns | 1.00 | 
| array/private/iteration/findfirst/bool | 1953937.5ns | 1853416.5ns | 1.05 | 
| array/private/iteration/scalar | 5613812.5ns | 3817291ns | 1.47 | 
| array/private/iteration/logical | 2852041ns | 2757458ns | 1.03 | 
| array/private/iteration/findmin/1d | 2056708ns | 2049500ns | 1.00 | 
| array/private/iteration/findmin/2d | 1638417ns | 1633375ns | 1.00 | 
| array/private/copy | 560834ns | 561500ns | 1.00 | 
| array/shared/copyto!/gpu_to_gpu | 83875ns | 84416ns | 0.99 | 
| array/shared/copyto!/cpu_to_gpu | 83250ns | 82541ns | 1.01 | 
| array/shared/copyto!/gpu_to_cpu | 89958ns | 83917ns | 1.07 | 
| array/shared/iteration/findall/int | 1677542ns | 1689292ns | 0.99 | 
| array/shared/iteration/findall/bool | 1406750ns | 1503229ns | 0.94 | 
| array/shared/iteration/findfirst/int | 1457917ns | 1457104.5ns | 1.00 | 
| array/shared/iteration/findfirst/bool | 1439916ns | 1442750ns | 1.00 | 
| array/shared/iteration/scalar | 161041ns | 155375ns | 1.04 | 
| array/shared/iteration/logical | 2408104ns | 2449333.5ns | 0.98 | 
| array/shared/iteration/findmin/1d | 1581708ns | 1513312.5ns | 1.05 | 
| array/shared/iteration/findmin/2d | 1646458ns | 1636250ns | 1.01 | 
| array/shared/copy | 243917ns | 257896ns | 0.95 | 
| array/permutedims/4d | 2558750ns | 2525958ns | 1.01 | 
| array/permutedims/2d | 1293708ns | 1278583ns | 1.01 | 
| array/permutedims/3d | 1853417ns | 1824583.5ns | 1.02 | 
| metal/synchronization/stream | 15084ns | 14541ns | 1.04 | 
| metal/synchronization/context | 15417ns | 15042ns | 1.02 | 
This comment was automatically generated by workflow using github-action-benchmark.
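
The Ratio column appears to be the current timing divided by the previous one, rounded to two decimals, so values above 1 indicate a slowdown. A minimal sketch that reproduces three of the rows under that assumption follows. The `ratio` helper and the `rows` list are illustrative and are not part of the benchmark workflow.

```julia
# Assumption: Ratio = current / previous, rounded to two decimals.
# `ratio` and `rows` are hypothetical names used only for this illustration.
ratio(current_ns, previous_ns) = round(current_ns / previous_ns; digits = 2)

# Timings in nanoseconds, copied from three rows of the table above.
rows = [
    ("kernel/launch", 14208.0, 12416.0),
    ("array/private/iteration/scalar", 5613812.5, 3817291.0),
    ("array/random/randn/Float32", 816333.0, 864875.0),
]

for (name, current_ns, previous_ns) in rows
    println(rpad(name, 34), ratio(current_ns, previous_ns))
end
```

The `array/reverse/*` rows have no previous value, and therefore no ratio, because those benchmarks are new in this PR.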
8cde64c to 53a0c88
c1d78e5 to 543c8ee
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #648   +/-   ##
=======================================
  Coverage   80.53%   80.53%           
=======================================
  Files          61       61           
  Lines        2779     2779           
=======================================
  Hits         2238     2238           
  Misses        541      541           
Let's mark this as draft until it pulls from a dev branch on GPUArrays.
No description provided.