
Commit fec9330

smazouz42, EmilyBourne, and bauom committed
Add support for kernels (#42)
This pull request addresses issue #28 by implementing a new feature in Pyccel that allows users to define custom GPU kernels. The syntax for creating these kernels is inspired by Numba. It also fixes issue #45, which was needed for testing purposes.

**Commit Summary**

- Introduced the `KernelCall` class
- Added CUDA printer methods `_print_KernelCall` and `_print_FunctionDef` to generate the corresponding CUDA representation for both kernel calls and definitions
- Added `IndexedFunctionCall`, which represents an indexed function call
- Added the CUDA module and `cuda.synchronize()`
- Fixed a bug found in the header printer: it did not import the necessary header for the used function

---------

Co-authored-by: EmilyBourne <[email protected]>
Co-authored-by: bauom <[email protected]>
Co-authored-by: Emily Bourne <[email protected]>
1 parent 1c8e216 commit fec9330

File tree

19 files changed, +599 -9 lines changed


.dict_custom.txt

Lines changed: 1 addition & 0 deletions
@@ -120,3 +120,4 @@ indexable
 traceback
 STC
 gFTL
+GPUs

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ All notable changes to this project will be documented in this file.
 
 - #32 : Add support for `nvcc` Compiler and `cuda` language as a possible option.
 - #48 : Fix incorrect handling of imports in `cuda`.
+- #42 : Add support for custom kernels in `cuda`.
+- #42 : Add Cuda module to Pyccel. Add support for `cuda.synchronize` function.
 
 ## \[UNRELEASED\]
 

docs/cuda.md

Lines changed: 23 additions & 0 deletions
# Getting started: GPU

Pyccel now supports NVIDIA CUDA, empowering users to accelerate numerical computations on GPUs seamlessly. With Pyccel's high-level syntax and automatic code generation, harnessing the power of CUDA becomes effortless. This documentation provides a quick guide to enabling CUDA in Pyccel.

## CUDA Decorator

### kernel

The `kernel` decorator allows the user to declare a CUDA kernel. The kernel can be defined in Python, and the syntax is similar to that of Numba.

```python
from pyccel.decorators import kernel

@kernel
def my_kernel():
    pass

blockspergrid = 1
threadsperblock = 1
# Call your kernel function
my_kernel[blockspergrid, threadsperblock]()
```
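For orientation, the `kernel` decorator maps to CUDA's `__global__` qualifier and the indexed call maps to CUDA's launch syntax, both via the printer changes in `cucode.py` below. Here is a sketch of the CUDA code one could expect for the example above; illustrative output, not verbatim Pyccel generation:

```cuda
// Sketch of the expected translation of the docs example above.
// Illustrative only: names and layout are assumptions, not Pyccel output.

// @kernel  =>  __global__ (see function_signature in cucode.py)
__global__ void my_kernel(void)
{
}

int main(void)
{
    // my_kernel[blockspergrid, threadsperblock]()  =>  launch syntax
    // (see _print_KernelCall in cucode.py)
    my_kernel<<<1, 1>>>();
    return 0;
}
```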

pyccel/ast/core.py

Lines changed: 37 additions & 0 deletions
@@ -73,6 +73,7 @@
     'If',
     'IfSection',
     'Import',
+    'IndexedFunctionCall',
     'InProgram',
     'InlineFunctionDef',
     'Interface',

@@ -2075,6 +2076,42 @@ def _ignore(cls, c):
         """
         return c is None or isinstance(c, (FunctionDef, *cls._ignored_types))
 
+class IndexedFunctionCall(FunctionCall):
+    """
+    Represents an indexed function call in the code.
+
+    Class representing indexed function calls, encapsulating all
+    relevant information for such calls within the code base.
+
+    Parameters
+    ----------
+    func : FunctionDef
+        The function being called.
+
+    args : iterable of FunctionCallArgument
+        The arguments passed to the function.
+
+    indexes : iterable of TypedAstNode
+        The indexes of the function call.
+
+    current_function : FunctionDef, optional
+        The function where the call takes place.
+    """
+    __slots__ = ('_indexes',)
+    _attribute_nodes = FunctionCall._attribute_nodes + ('_indexes',)
+    def __init__(self, func, args, indexes, current_function = None):
+        self._indexes = indexes
+        super().__init__(func, args, current_function)
+
+    @property
+    def indexes(self):
+        """
+        Indexes of the function call.
+
+        Represents the indexes of the function call.
+        """
+        return self._indexes
+
 class ConstructorCall(FunctionCall):
 
     """

pyccel/ast/cuda.py

Lines changed: 65 additions & 0 deletions
# -*- coding: utf-8 -*-
#------------------------------------------------------------------------------------------#
# This file is part of Pyccel which is released under MIT License. See the LICENSE file or #
# go to https://github.com/pyccel/pyccel/blob/master/LICENSE for full license details.     #
#------------------------------------------------------------------------------------------#
"""
CUDA Module
This module provides a collection of classes and utilities for CUDA programming.
"""
from pyccel.ast.core import FunctionCall

__all__ = (
    'KernelCall',
)

class KernelCall(FunctionCall):
    """
    Represents a kernel function call in the code.

    The class serves as a representation of a kernel
    function call within the codebase.

    Parameters
    ----------
    func : FunctionDef
        The definition of the function being called.

    args : iterable of FunctionCallArgument
        The arguments passed to the function.

    num_blocks : TypedAstNode
        The number of blocks. These objects must have a primitive type of `PrimitiveIntegerType`.

    tp_block : TypedAstNode
        The number of threads per block. These objects must have a primitive type of `PrimitiveIntegerType`.

    current_function : FunctionDef, optional
        The function where the call takes place.
    """
    __slots__ = ('_num_blocks', '_tp_block')
    _attribute_nodes = (*FunctionCall._attribute_nodes, '_num_blocks', '_tp_block')

    def __init__(self, func, args, num_blocks, tp_block, current_function = None):
        self._num_blocks = num_blocks
        self._tp_block = tp_block
        super().__init__(func, args, current_function)

    @property
    def num_blocks(self):
        """
        The number of blocks in the kernel being called.

        The number of blocks in the kernel being called.
        """
        return self._num_blocks

    @property
    def tp_block(self):
        """
        The number of threads per block.

        The number of threads per block.
        """
        return self._tp_block
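This diff does not show the lowering step, but presumably an `IndexedFunctionCall` whose target carries the `kernel` decorator is converted into a `KernelCall`, with the two indexes supplying the launch configuration. A hypothetical sketch of that conversion; `to_kernel_call` is not a function in this PR:

```python
# Hypothetical sketch, not code from this PR: lower an
# IndexedFunctionCall on a @kernel-decorated function to a KernelCall.
from pyccel.ast.cuda import KernelCall

def to_kernel_call(indexed_call):
    # The two indexes of my_kernel[blockspergrid, threadsperblock]()
    # become the CUDA launch configuration.
    num_blocks, tp_block = indexed_call.indexes
    return KernelCall(indexed_call.funcdef, indexed_call.args,
                      num_blocks, tp_block)
```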

pyccel/ast/cudaext.py

Lines changed: 42 additions & 0 deletions
#!/usr/bin/python
# -*- coding: utf-8 -*-
#------------------------------------------------------------------------------------------#
# This file is part of Pyccel which is released under MIT License. See the LICENSE file or #
# go to https://github.com/pyccel/pyccel/blob/master/LICENSE for full license details.     #
#------------------------------------------------------------------------------------------#
"""
CUDA Extension Module
Provides CUDA functionality for code generation.
"""
from .internals import PyccelFunction

from .datatypes import VoidType
from .core import Module, PyccelFunctionDef

__all__ = (
    'CudaSynchronize',
)

class CudaSynchronize(PyccelFunction):
    """
    Represents a call to Cuda.synchronize for code generation.

    This class serves as a representation of the Cuda.synchronize method.
    """
    __slots__ = ()
    _attribute_nodes = ()
    _shape = None
    _class_type = VoidType()

    def __init__(self):
        super().__init__()

cuda_funcs = {
    'synchronize' : PyccelFunctionDef('synchronize', CudaSynchronize),
}

cuda_mod = Module('cuda',
    variables=[],
    funcs=cuda_funcs.values(),
    imports=[]
)
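Together with the `kernel` decorator, the new module gives user code a host-side API. A minimal sketch of a program as it would be written for pyccelisation with the `cuda` language (the indexed launch is translated by Pyccel, not executed by plain CPython):

```python
# Sketch of user code combining the pieces added in this PR.
from pyccel import cuda               # exposes synchronize() (pyccel/cuda/__init__.py)
from pyccel.decorators import kernel  # kernel decorator from the docs above

@kernel
def my_kernel():
    pass                              # placeholder body for the sketch

blockspergrid = 1
threadsperblock = 1
my_kernel[blockspergrid, threadsperblock]()
# Wait until the device has finished; the printer maps this
# to cudaDeviceSynchronize() (_print_CudaSynchronize).
cuda.synchronize()
```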

pyccel/ast/utilities.py

Lines changed: 3 additions & 1 deletion
@@ -25,6 +25,7 @@
 from .literals import LiteralInteger, LiteralEllipsis, Nil
 from .mathext import math_mod
 from .sysext import sys_mod
+from .cudaext import cuda_mod
 
 from .numpyext import (NumpyEmpty, NumpyArray, numpy_mod,
                        NumpyTranspose, NumpyLinspace)

@@ -50,7 +51,8 @@
 decorators_mod = Module('decorators',(),
         funcs = [PyccelFunctionDef(d, PyccelFunction) for d in pyccel_decorators.__all__])
 pyccel_mod = Module('pyccel',(),(),
-        imports = [Import('decorators', decorators_mod)])
+        imports = [Import('decorators', decorators_mod),
+                   Import('cuda', cuda_mod)])
 
 # TODO add documentation
 builtin_import_registry = Module('__main__',

pyccel/codegen/printing/cucode.py

Lines changed: 43 additions & 3 deletions
@@ -9,11 +9,12 @@
 enabling the direct translation of high-level Pyccel expressions into CUDA code.
 """
 
-from pyccel.codegen.printing.ccode import CCodePrinter, c_library_headers
+from pyccel.codegen.printing.ccode import CCodePrinter
 
-from pyccel.ast.core import Import, Module
+from pyccel.ast.core     import Import, Module
+from pyccel.ast.literals import Nil
 
-from pyccel.errors.errors import Errors
+from pyccel.errors.errors import Errors
 
 
 errors = Errors()

@@ -61,6 +62,44 @@ def _print_Module(self, expr):
         self.exit_scope()
         return code
 
+    def function_signature(self, expr, print_arg_names = True):
+        """
+        Get the CUDA representation of the function signature.
+
+        Extract from the function definition `expr` all the
+        information (name, input, output) needed to create the
+        function signature and return a string describing the
+        function.
+        This is not a declaration as the signature does not end
+        with a semi-colon.
+
+        Parameters
+        ----------
+        expr : FunctionDef
+            The function definition for which a signature is needed.
+
+        print_arg_names : bool, default : True
+            Indicates whether argument names should be printed.
+
+        Returns
+        -------
+        str
+            Signature of the function.
+        """
+        cuda_decorator = '__global__' if 'kernel' in expr.decorators else ''
+        c_function_signature = super().function_signature(expr, print_arg_names)
+        return f'{cuda_decorator} {c_function_signature}'
+
+    def _print_KernelCall(self, expr):
+        func = expr.funcdef
+        args = [a.value or Nil() for a in expr.args]
+
+        args = ', '.join(self._print(a) for a in args)
+        return f"{func.name}<<<{expr.num_blocks}, {expr.tp_block}>>>({args});\n"
+
+    def _print_CudaSynchronize(self, expr):
+        return 'cudaDeviceSynchronize();\n'
+
     def _print_ModuleHeader(self, expr):
         self.set_scope(expr.module.scope)
         self._in_header = True

@@ -87,6 +126,7 @@ def _print_ModuleHeader(self, expr):
         }}\n'
         return '\n'.join((f"#ifndef {name.upper()}_H",
                           f"#define {name.upper()}_H",
+                          imports,
                           global_variables,
                           function_declaration,
                           f"#endif // {name.upper()}_H\n"))

pyccel/cuda/__init__.py

Lines changed: 10 additions & 0 deletions
#------------------------------------------------------------------------------------------#
# This file is part of Pyccel which is released under MIT License. See the LICENSE file or #
# go to https://github.com/pyccel/pyccel/blob/master/LICENSE for full license details.     #
#------------------------------------------------------------------------------------------#
"""
This module exposes the CUDA submodule functions.
"""
from .cuda_sync_primitives import synchronize

__all__ = ['synchronize']
pyccel/cuda/cuda_sync_primitives.py

Lines changed: 16 additions & 0 deletions

#------------------------------------------------------------------------------------------#
# This file is part of Pyccel which is released under MIT License. See the LICENSE file or #
# go to https://github.com/pyccel/pyccel/blob/master/LICENSE for full license details.     #
#------------------------------------------------------------------------------------------#
"""
This submodule contains CUDA methods for Pyccel.
"""


def synchronize():
    """
    Synchronize CUDA device execution.

    Synchronize CUDA device execution.
    """