- 
                Notifications
    You must be signed in to change notification settings 
- Fork 46
Add error checking to command buffer completion handler #521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Your PR no longer requires formatting changes. Thank you for your contribution! | 
| Codecov ReportAttention: Patch coverage is  
 
 Additional details and impacted files@@            Coverage Diff             @@
##             main     #521      +/-   ##
==========================================
+ Coverage   71.04%   75.32%   +4.28%     
==========================================
  Files          36       57      +21     
  Lines        1143     2703    +1560     
==========================================
+ Hits          812     2036    +1224     
- Misses        331      667     +336     ☔ View full report in Codecov by Sentry. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Metal Benchmarks
| Benchmark suite | Current: 3c9c87a | Previous: 1b811cb | Ratio | 
|---|---|---|---|
| private array/construct | 25461.833333333332ns | 26482.666666666668ns | 0.96 | 
| private array/broadcast | 462917ns | 457584ns | 1.01 | 
| private array/random/randn/Float32 | 799458.5ns | 857895.5ns | 0.93 | 
| private array/random/randn!/Float32 | 634083ns | 657709ns | 0.96 | 
| private array/random/rand!/Int64 | 567062.5ns | 564458ns | 1.00 | 
| private array/random/rand!/Float32 | 591791ns | 610416ns | 0.97 | 
| private array/random/rand/Int64 | 769520.5ns | 805354ns | 0.96 | 
| private array/random/rand/Float32 | 607271ns | 602791.5ns | 1.01 | 
| private array/copyto!/gpu_to_gpu | 666833ns | 672437.5ns | 0.99 | 
| private array/copyto!/cpu_to_gpu | 814833ns | 635271ns | 1.28 | 
| private array/copyto!/gpu_to_cpu | 613375ns | 815104.5ns | 0.75 | 
| private array/accumulate/1d | 1327375ns | 1324625ns | 1.00 | 
| private array/accumulate/2d | 1380978.5ns | 1393666.5ns | 0.99 | 
| private array/iteration/findall/int | 2109104.5ns | 2070667ns | 1.02 | 
| private array/iteration/findall/bool | 1803854ns | 1827208ns | 0.99 | 
| private array/iteration/findfirst/int | 1682875ns | 1688333.5ns | 1.00 | 
| private array/iteration/findfirst/bool | 1665584ns | 1671750ns | 1.00 | 
| private array/iteration/scalar | 3796333ns | 3643750ns | 1.04 | 
| private array/iteration/logical | 3186292ns | 3166208ns | 1.01 | 
| private array/iteration/findmin/1d | 1755604ns | 1758542ns | 1.00 | 
| private array/iteration/findmin/2d | 1344792ns | 1357792ns | 0.99 | 
| private array/reductions/reduce/1d | 1026604.5ns | 1043270.5ns | 0.98 | 
| private array/reductions/reduce/2d | 659417ns | 663333ns | 0.99 | 
| private array/reductions/mapreduce/1d | 1026395.5ns | 1043084ns | 0.98 | 
| private array/reductions/mapreduce/2d | 662709ns | 665896ns | 1.00 | 
| private array/permutedims/4d | 2528334ns | 2529437.5ns | 1.00 | 
| private array/permutedims/2d | 1016208ns | 1024875ns | 0.99 | 
| private array/permutedims/3d | 1575021ns | 1585208ns | 0.99 | 
| private array/copy | 590749.5ns | 592958ns | 1.00 | 
| latency/precompile | 5911672333ns | 8799224667ns | 0.67 | 
| latency/ttfp | 3460967895.5ns | 3600655125ns | 0.96 | 
| latency/import | 1125913500ns | 1231127083ns | 0.91 | 
| integration/metaldevrt | 710167ns | 701416ns | 1.01 | 
| integration/byval/slices=1 | 1564562.5ns | 1566354ns | 1.00 | 
| integration/byval/slices=3 | 8508104ns | 10376042ns | 0.82 | 
| integration/byval/reference | 1557000ns | 1610875ns | 0.97 | 
| integration/byval/slices=2 | 2533167ns | 2715041ns | 0.93 | 
| kernel/indexing | 482750ns | 474291.5ns | 1.02 | 
| kernel/indexing_checked | 481667ns | 475895.5ns | 1.01 | 
| kernel/launch | 37073ns | 8208ns | 4.52 | 
| metal/synchronization/stream | 14500ns | 15041ns | 0.96 | 
| metal/synchronization/context | 14667ns | 15000ns | 0.98 | 
| shared array/construct | 24409.666666666668ns | 24145.833333333332ns | 1.01 | 
| shared array/broadcast | 460375ns | 461145.5ns | 1.00 | 
| shared array/random/randn/Float32 | 812875.5ns | 813750ns | 1.00 | 
| shared array/random/randn!/Float32 | 635958ns | 636542ns | 1.00 | 
| shared array/random/rand!/Int64 | 562771ns | 568417ns | 0.99 | 
| shared array/random/rand!/Float32 | 595292ns | 603334ns | 0.99 | 
| shared array/random/rand/Int64 | 795542ns | 778417ns | 1.02 | 
| shared array/random/rand/Float32 | 605042ns | 616250ns | 0.98 | 
| shared array/copyto!/gpu_to_gpu | 81125ns | 84667ns | 0.96 | 
| shared array/copyto!/cpu_to_gpu | 82958ns | 83375ns | 0.99 | 
| shared array/copyto!/gpu_to_cpu | 81958ns | 84625ns | 0.97 | 
| shared array/accumulate/1d | 1346708ns | 1346167ns | 1.00 | 
| shared array/accumulate/2d | 1390833ns | 1397291.5ns | 1.00 | 
| shared array/iteration/findall/int | 1807667ns | 1800250ns | 1.00 | 
| shared array/iteration/findall/bool | 1592292ns | 1590437.5ns | 1.00 | 
| shared array/iteration/findfirst/int | 1381042ns | 1408125ns | 0.98 | 
| shared array/iteration/findfirst/bool | 1366334ns | 1364584ns | 1.00 | 
| shared array/iteration/scalar | 156375ns | 153583ns | 1.02 | 
| shared array/iteration/logical | 2954791.5ns | 2981833.5ns | 0.99 | 
| shared array/iteration/findmin/1d | 1466375ns | 1467625ns | 1.00 | 
| shared array/iteration/findmin/2d | 1365375ns | 1369416ns | 1.00 | 
| shared array/reductions/reduce/1d | 727958ns | 738166.5ns | 0.99 | 
| shared array/reductions/reduce/2d | 667750ns | 663208.5ns | 1.01 | 
| shared array/reductions/mapreduce/1d | 738083ns | 746583ns | 0.99 | 
| shared array/reductions/mapreduce/2d | 658333ns | 670917ns | 0.98 | 
| shared array/permutedims/4d | 2551125ns | 2524958.5ns | 1.01 | 
| shared array/permutedims/2d | 1017458ns | 1028125ns | 0.99 | 
| shared array/permutedims/3d | 1583584ns | 1588833ns | 1.00 | 
| shared array/copy | 246625ns | 239042ns | 1.03 | 
This comment was automatically generated by workflow using github-action-benchmark.
| Thank you! Once you add tests and fix formatting this should be good to merge. | 
| i cant figure out how to get tests working, any ideas how do i solve the err? | 
| I believe your illegal load gets optimized away and thus never triggers the error. | 
| hmm what can i try to do then? | 
| Have you looked at #416 and the links to the resources in #510? They might point you in the right direction. As for the kernel, you could write one that tries to set the value in clearly out of bounds memory (like  | 
| This will need some thinking; apparently we can't switch tasks on the callback threads or risk a deadlock. Main thread, waits until everything is done: Callback thread, triggers a task switch by doing  JuliaInterop/ObjectiveC.jl#43 made it possible to allocate, but I guess task switches need additional work. | 
| Also, even switching this to a proper null pointer load doesn't trigger a command buffer error: I would have expected this to trigger a  | 
| i experimented with aggressive OOB accesses, pointer dereferences via  I'm happy to implement to test any other specific approaches you recommend ? | 
| I'm not sure of any test that triggers an error here, but regardless it's probably good to have the reporting in place. Because of the task switching issue mentioned above, we'll have to stick to  | 
| Agreed, it's better to have the reporting in place as a safeguard. And yes, sticking to Core.println makes sense given the task switching issue. | 
| what could be the next move ? | 
| Let's simplify for now and create an issue to track proper error reporting. | 
solves #510