Most non-transform operators working with JIT #1094
/build
Greptile Summary
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User
participant CUDAJITExecutor
participant CapabilitySystem
participant KernelProvider
participant NVRTC
participant GPU
User->>CUDAJITExecutor: Exec(op)
CUDAJITExecutor->>CapabilitySystem: Check SUPPORTS_JIT capability
CapabilitySystem-->>CUDAJITExecutor: JIT supported
CUDAJITExecutor->>CapabilitySystem: Query GLOBAL_KERNEL capability
CapabilitySystem-->>CUDAJITExecutor: global_kernel flag
CUDAJITExecutor->>KernelProvider: create_kernel_provider(sizes, jit=true, global_kernel)
CUDAJITExecutor->>KernelProvider: find_best_launch_params(op, kernel_provider)
KernelProvider-->>CUDAJITExecutor: ept, shm_size, block_size, groups_per_block
CUDAJITExecutor->>CUDAJITExecutor: get_grid_dims(blocks, threads, sizes, ept)
CUDAJITExecutor->>NVRTC: nvrtc_compile_and_run(op, sizes, blocks, threads, ept, stride, shm_size)
NVRTC->>GPU: Launch JIT-compiled kernel
GPU-->>User: Results
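The control flow in the diagram can be sketched on the host side as follows. This is only an illustrative sketch, not the PR's actual code: `ToyOp`, `LaunchPlan`, `plan_jit_launch`, and `get_grid_blocks` are hypothetical stand-ins, the launch parameters are hard-coded placeholders for whatever `find_best_launch_params` would return, and the grid-size rule (one thread covers `ept` elements, rounded up) is an assumption. NVRTC compilation and the kernel launch are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-ins for names that appear in the diagram.
enum class Capability { SUPPORTS_JIT, GLOBAL_KERNEL };

struct LaunchParams {
  int ept;               // elements per thread
  std::size_t shm_size;  // dynamic shared memory, in bytes
  int block_size;        // threads per block
  int groups_per_block;
};

struct LaunchPlan {
  LaunchParams params;
  bool global_kernel;    // result of the GLOBAL_KERNEL query
  std::size_t blocks;    // 1D grid size
};

// Toy operator that answers capability queries with fixed flags.
struct ToyOp {
  bool jit;
  bool global_kernel;
  std::size_t size;      // total number of output elements

  bool query(Capability c) const {
    return c == Capability::SUPPORTS_JIT ? jit : global_kernel;
  }
};

// Assumed grid rule: each block covers threads * ept elements, rounded up.
inline std::size_t get_grid_blocks(std::size_t total, int threads, int ept) {
  assert(threads > 0 && ept > 0);
  const std::size_t per_block =
      static_cast<std::size_t>(threads) * static_cast<std::size_t>(ept);
  return (total + per_block - 1) / per_block;
}

// Mirrors the diagram's order: check JIT support, query GLOBAL_KERNEL,
// pick launch parameters, then derive the grid dimensions.
inline std::optional<LaunchPlan> plan_jit_launch(const ToyOp& op) {
  if (!op.query(Capability::SUPPORTS_JIT)) {
    return std::nullopt;  // caller would fall back to a non-JIT path
  }
  const bool global_kernel = op.query(Capability::GLOBAL_KERNEL);
  const LaunchParams p{4, 0, 256, 1};  // placeholder values, not tuned
  return LaunchPlan{p, global_kernel,
                    get_grid_blocks(op.size, p.block_size, p.ept)};
}
```

With these placeholder parameters, an operator of 1,000,000 elements would be planned as 977 blocks of 256 threads, and an operator that reports no JIT support yields an empty result.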
91 files reviewed, 1 comment
ac0f1a2 to 7fad6f1
92 files reviewed, no comments
Critical loop bug fixed. Please take another look.
7fad6f1 to ce9d5cb
Skipped: This PR changes more files than the configured file change limit: (
/build
ce9d5cb to b7239d9
/build
62bdbb4 to dc08be7
dc08be7 to 8b9ae4b
/build
Converted most of the element-wise operators to support JIT by adding the required capabilities and strings. ND operators are not working yet and need further investigation.
Other changes
- `GLOBAL_KERNEL` capability that says whether it can operate as separate CTAs or not
- `MATX_EN_JIT` is enabled along with the parameter sweeps. This can take a very long time and we'll have to figure out when to enable this.
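One way a capability like `GLOBAL_KERNEL` could propagate through an expression tree is sketched below. This is a guess at the general pattern, not the PR's implementation: `TensorLeaf`, `BlockCoopLeaf`, `AddOp`, and `can_use_global_kernel` are hypothetical names, and the rule that a composite operator is CTA-independent only if all of its operands are is an assumption.

```cpp
// Hypothetical leaf: plain element-wise access, so any CTA can process
// any slice of the output independently.
struct TensorLeaf {
  static constexpr bool global_kernel = true;
};

// Hypothetical leaf standing in for an operator whose threads must
// cooperate within one block, so it cannot be split across CTAs.
struct BlockCoopLeaf {
  static constexpr bool global_kernel = false;
};

// Assumed combination rule: a binary operator is global-kernel capable
// only when both of its operands are.
template <typename L, typename R>
struct AddOp {
  static constexpr bool global_kernel = L::global_kernel && R::global_kernel;
};

// Query helper, evaluated entirely at compile time.
template <typename Op>
constexpr bool can_use_global_kernel() {
  return Op::global_kernel;
}

static_assert(can_use_global_kernel<AddOp<TensorLeaf, TensorLeaf>>(),
              "purely element-wise expressions can run as separate CTAs");
```

Resolving the flag at compile time means an executor could select the launch path without any runtime cost, though the actual PR may query it differently.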