@christiangnrd
Member

No description provided.

@github-actions
Contributor

github-actions bot commented Aug 4, 2025

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic main) to apply these changes.

Suggested changes:
diff --git a/lib/mtl/capture.jl b/lib/mtl/capture.jl
index c2c1a77a..c101c5b7 100644
--- a/lib/mtl/capture.jl
+++ b/lib/mtl/capture.jl
@@ -59,7 +59,8 @@ function MTLCaptureDescriptor()
 end
 
 # TODO: Add capture state
-function MTLCaptureDescriptor(obj::Union{MTLDevice,MTLCommandQueue,MTLCaptureScope},
+function MTLCaptureDescriptor(
+        obj::Union{MTLDevice, MTLCommandQueue, MTLCaptureScope},
                               destination::MTLCaptureDestination;
                               folder::String=nothing)
     desc = MTLCaptureDescriptor()
@@ -110,7 +111,8 @@ end
 
 Start GPU frame capture using the default capture object and specifying capture descriptor parameters directly.
 """
-function startCapture(obj::Union{MTLDevice,MTLCommandQueue,MTLCaptureScope},
+function startCapture(
+        obj::Union{MTLDevice, MTLCommandQueue, MTLCaptureScope},
                       destination::MTLCaptureDestination=MTLCaptureDestinationGPUTraceDocument;
                       folder::String=nothing)
     if destination == MTLCaptureDestinationGPUTraceDocument && folder === nothing
diff --git a/perf/array.jl b/perf/array.jl
index 008ab4d6..b86a675e 100644
--- a/perf/array.jl
+++ b/perf/array.jl
@@ -63,12 +63,12 @@ gpu_vec_ints = reshape(gpu_mat_ints, length(gpu_mat_ints))
 let group = addgroup!(group, "reverse")
     group["1d"] = @benchmarkable Metal.@sync reverse($gpu_vec)
     group["1dL"] = @benchmarkable Metal.@sync reverse($gpu_vec_long)
-    group["2d"] = @benchmarkable Metal.@sync reverse($gpu_mat; dims=1)
-    group["2dL"] = @benchmarkable Metal.@sync reverse($gpu_mat_long; dims=1)
+    group["2d"] = @benchmarkable Metal.@sync reverse($gpu_mat; dims = 1)
+    group["2dL"] = @benchmarkable Metal.@sync reverse($gpu_mat_long; dims = 1)
     group["1d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_vec)
     group["1dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_vec_long)
-    group["2d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat; dims=1)
-    group["2dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat_long; dims=2)
+    group["2d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat; dims = 1)
+    group["2dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat_long; dims = 2)
 end
 
 # 'evals=1' added to prevent hang when running benchmarks of CI
diff --git a/perf/runbenchmarks.jl b/perf/runbenchmarks.jl
index 17bf4ea0..98aa3153 100644
--- a/perf/runbenchmarks.jl
+++ b/perf/runbenchmarks.jl
@@ -1,7 +1,7 @@
 # benchmark suite execution and codespeed submission
 
 using Pkg
-Pkg.add(url="https://github.com/christiangnrd/GPUArrays.jl", rev="reverse")
+Pkg.add(url = "https://github.com/christiangnrd/GPUArrays.jl", rev = "reverse")
 
 using Metal
 
diff --git a/test/runtests.jl b/test/runtests.jl
index 081fc280..42f00908 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -1,5 +1,5 @@
 using Pkg
-Pkg.add(url="https://github.com/christiangnrd/GPUArrays.jl", rev="reverse")
+Pkg.add(url = "https://github.com/christiangnrd/GPUArrays.jl", rev = "reverse")
 
 using Distributed
 using Dates

@christiangnrd force-pushed the reverse branch 2 times, most recently from 4c15cc1 to 108f6d1 on August 5, 2025 01:24
Contributor

@github-actions github-actions bot left a comment

Metal Benchmarks

Benchmark suite Current: e3ce3ae Previous: 18d5d95 Ratio
latency/precompile 10708835208.5 ns 10738355812.5 ns 1.00
latency/ttfp 5088232041.5 ns 5093095000 ns 1.00
latency/import 1309493875 ns 1307420042 ns 1.00
integration/metaldevrt 961416 ns 916521 ns 1.05
integration/byval/slices=1 1650541 ns 1655958 ns 1.00
integration/byval/slices=3 9014521 ns 8745792 ns 1.03
integration/byval/reference 1635625 ns 1624791 ns 1.01
integration/byval/slices=2 2705666 ns 2721500 ns 0.99
kernel/indexing 697500 ns 696291 ns 1.00
kernel/indexing_checked 701625 ns 696584 ns 1.01
kernel/launch 14208 ns 12416 ns 1.14
array/reverse/1d 671833.5 ns
array/reverse/2dL_inplace 3033708.5 ns
array/reverse/1dL 2367708 ns
array/reverse/2d 1462187.5 ns
array/reverse/1d_inplace 728917 ns
array/reverse/2d_inplace 931084 ns
array/reverse/2dL 6684125 ns
array/reverse/1dL_inplace 1072542 ns
array/construct 6167 ns 5792 ns 1.06
array/broadcast 681958 ns 665584 ns 1.02
array/accumulate/Int64/1d 1379520.5 ns 1360750 ns 1.01
array/accumulate/Int64/dims=1 1926125 ns 1916333 ns 1.01
array/accumulate/Int64/dims=2 2291958 ns 2278146 ns 1.01
array/accumulate/Int64/dims=1L 11902291 ns 12001125 ns 0.99
array/accumulate/Int64/dims=2L 9814458 ns 9901666 ns 0.99
array/accumulate/Float32/1d 1268500 ns 1245417 ns 1.02
array/accumulate/Float32/dims=1 1676771 ns 1669792 ns 1.00
array/accumulate/Float32/dims=2 2014833.5 ns 2007458 ns 1.00
array/accumulate/Float32/dims=1L 10007083 ns 9976625 ns 1.00
array/accumulate/Float32/dims=2L 7379750.5 ns 7388583.5 ns 1.00
array/random/randn/Float32 816333 ns 864875 ns 0.94
array/random/randn!/Float32 652459 ns 625250 ns 1.04
array/random/rand!/Int64 578437.5 ns 565916.5 ns 1.02
array/random/rand!/Float32 607375 ns 583083 ns 1.04
array/random/rand/Int64 780084 ns 729917 ns 1.07
array/random/rand/Float32 613833 ns 597917 ns 1.03
array/reductions/reduce/Int64/1d 1383125 ns 1339354 ns 1.03
array/reductions/reduce/Int64/dims=1 1167479.5 ns 1166145.5 ns 1.00
array/reductions/reduce/Int64/dims=2 1331229.5 ns 1307792 ns 1.02
array/reductions/reduce/Int64/dims=1L 2071458 ns 2095291 ns 0.99
array/reductions/reduce/Int64/dims=2L 3649895.5 ns 3597937.5 ns 1.01
array/reductions/reduce/Float32/1d 1093458.5 ns 986792 ns 1.11
array/reductions/reduce/Float32/dims=1 907417 ns 909041.5 ns 1.00
array/reductions/reduce/Float32/dims=2 788458 ns 770916 ns 1.02
array/reductions/reduce/Float32/dims=1L 1415375 ns 1410916.5 ns 1.00
array/reductions/reduce/Float32/dims=2L 1933916 ns 1933084 ns 1.00
array/reductions/mapreduce/Int64/1d 1433562.5 ns 1350250 ns 1.06
array/reductions/mapreduce/Int64/dims=1 1173875 ns 1211709 ns 0.97
array/reductions/mapreduce/Int64/dims=2 1341708 ns 1314292 ns 1.02
array/reductions/mapreduce/Int64/dims=1L 2114542 ns 2102020.5 ns 1.01
array/reductions/mapreduce/Int64/dims=2L 3624312.5 ns 3605750 ns 1.01
array/reductions/mapreduce/Float32/1d 1063146 ns 1059291 ns 1.00
array/reductions/mapreduce/Float32/dims=1 907792 ns 901500 ns 1.01
array/reductions/mapreduce/Float32/dims=2 787604.5 ns 777833 ns 1.01
array/reductions/mapreduce/Float32/dims=1L 1410687.5 ns 1401750 ns 1.01
array/reductions/mapreduce/Float32/dims=2L 1955167 ns 1936959 ns 1.01
array/private/copyto!/gpu_to_gpu 681625 ns 660167 ns 1.03
array/private/copyto!/cpu_to_gpu 827916 ns 805666 ns 1.03
array/private/copyto!/gpu_to_cpu 833833 ns 827000 ns 1.01
array/private/iteration/findall/int 1699916 ns 1676666 ns 1.01
array/private/iteration/findall/bool 1494563 ns 1481375 ns 1.01
array/private/iteration/findfirst/int 2047687.5 ns 2050625 ns 1.00
array/private/iteration/findfirst/bool 1953937.5 ns 1853416.5 ns 1.05
array/private/iteration/scalar 5613812.5 ns 3817291 ns 1.47
array/private/iteration/logical 2852041 ns 2757458 ns 1.03
array/private/iteration/findmin/1d 2056708 ns 2049500 ns 1.00
array/private/iteration/findmin/2d 1638417 ns 1633375 ns 1.00
array/private/copy 560834 ns 561500 ns 1.00
array/shared/copyto!/gpu_to_gpu 83875 ns 84416 ns 0.99
array/shared/copyto!/cpu_to_gpu 83250 ns 82541 ns 1.01
array/shared/copyto!/gpu_to_cpu 89958 ns 83917 ns 1.07
array/shared/iteration/findall/int 1677542 ns 1689292 ns 0.99
array/shared/iteration/findall/bool 1406750 ns 1503229 ns 0.94
array/shared/iteration/findfirst/int 1457917 ns 1457104.5 ns 1.00
array/shared/iteration/findfirst/bool 1439916 ns 1442750 ns 1.00
array/shared/iteration/scalar 161041 ns 155375 ns 1.04
array/shared/iteration/logical 2408104 ns 2449333.5 ns 0.98
array/shared/iteration/findmin/1d 1581708 ns 1513312.5 ns 1.05
array/shared/iteration/findmin/2d 1646458 ns 1636250 ns 1.01
array/shared/copy 243917 ns 257896 ns 0.95
array/permutedims/4d 2558750 ns 2525958 ns 1.01
array/permutedims/2d 1293708 ns 1278583 ns 1.01
array/permutedims/3d 1853417 ns 1824583.5 ns 1.02
metal/synchronization/stream 15084 ns 14541 ns 1.04
metal/synchronization/context 15417 ns 15042 ns 1.02

This comment was automatically generated by workflow using github-action-benchmark.

@codecov

codecov bot commented Oct 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.53%. Comparing base (18d5d95) to head (e3ce3ae).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #648   +/-   ##
=======================================
  Coverage   80.53%   80.53%           
=======================================
  Files          61       61           
  Lines        2779     2779           
=======================================
  Hits         2238     2238           
  Misses        541      541           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@maleadt marked this pull request as draft on October 14, 2025 08:00
@maleadt
Member

maleadt commented Oct 14, 2025

Let's mark this as draft until it pulls from a dev branch on GPUArrays.
