- 
                Notifications
    You must be signed in to change notification settings 
- Fork 79
KernelIntrinsics #562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KernelIntrinsics #562
Conversation
| Codecov Report❌ Patch coverage is  
 Additional details and impacted files@@          Coverage Diff          @@
##            main    #562   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         21      21           
  Lines       1575    1581    +6     
=====================================
- Misses      1575    1581    +6     ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
 | 
| Returns the unique local work-item ID. | ||
| """ | ||
| function get_local_id end | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So IIUC, backends should implement these like below, right?
function get_local_id()
    return (threadIdx().x, threadIdx().y, threadIdx().z)
endThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah basically, and my goal is to replace the old internal functions the people had to override with definitions based on these functions.
| 
 This stack of pull requests is managed by Graphite. Learn more about stacking. | 
2d12b29    to
    1298a56      
    Compare
  
    1298a56    to
    1831242      
    Compare
  
    1831242    to
    63244eb      
    Compare
  
    63244eb    to
    af5c7db      
    Compare
  
    af5c7db    to
    ad4c968      
    Compare
  
    | Benchmark Results
 Benchmark PlotsA plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR. | 
c51072f    to
    a0e822e      
    Compare
  
    ad4c968    to
    1c8b7e9      
    Compare
  
    1c8b7e9    to
    55dde0c      
    Compare
  
    55dde0c    to
    f1e98cc      
    Compare
  
    | function get_global_size end | ||
|  | ||
| """ | ||
| get_global_id()::@NamedTuple{x::Int32, y::Int32, z::Int32} | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be Int32 or Int64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenCL defines these as Csize_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to allow the platform to define the specific <:Integer index type? Metal uses uint3 by default, which are three UInt32 values. I liked that CUDA Thrust allowed the indices to be templated, so I could use Int64 only when dealing with billions of datapoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GPU dev call: We should make these Int/size_t
| So the idea is to decouple the back-ends from KA.jl, instead implementing KernelIntrinsics.jl? What's the advantage; do you envision packages other than KA.jl to build their kernel DSL on top of KernelIntrinsics.jl? | 
f1e98cc    to
    6841207      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions could not be made:
- src/pocl/nanoOpenCL.jl
- lines 670-674
 
e91b33b    to
    7a5e159      
    Compare
  
    7a5e159    to
    0a8301c      
    Compare
  
    777c099    to
    3bb80ac      
    Compare
  
    0d72b34    to
    1f3b249      
    Compare
  
    1f3b249    to
    a0fbc48      
    Compare
  
    a0fbc48    to
    7f02b2e      
    Compare
  
    7f02b2e    to
    bb0c213      
    Compare
  
    bb0c213    to
    16524fd      
    Compare
  
    ed2ee63    to
    9741962      
    Compare
  
    16524fd    to
    8ea6ad2      
    Compare
  
    | !!! note | ||
| 1-based. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure sure I understand what a 1-based size means, is this supposed to be in the _id() docstrings instead?
| Replaced by #635 | 

The goal is to allow for kernels to be written without relying on KernelAbstractions macros
cc: @maleadt @pxl-th