Commit 4719acd

Update documentation
1 parent e13d97d commit 4719acd

2 files changed: +28 -9 lines changed

doc/source/hostcode.rst

Lines changed: 3 additions & 1 deletion
@@ -15,6 +15,8 @@ There are few differences with tuning just a single CUDA or OpenCL kernel, to li
 * You have to specify the lang="C" option
 * The C function should return a ``float``
 * You have to do your own timing and error handling in C
+* Data is not automatically copied to and from device memory. To use an array in host memory, pass in a :mod:`numpy` array. To use an array
+  in device memory, pass in a :mod:`cupy` array.
 
 You have to specify the language as "C" because the Kernel Tuner will be calling a host function. This means that the Kernel
 Tuner will have to interface with C and in fact uses a different backend. This also means you can use this way of tuning
@@ -94,7 +96,7 @@ compiled C code. This way, you don't have to compute the grid size in C, you can
 
 The filter is not passed separately as a constant memory argument, because the CudaMemcpyToSymbol operation is now performed by the C host function. Also,
 because the code is compiled differently, we have no direct reference to the compiled module that is uploaded to the device and therefore we can not perform this
-operation directly from Python. If you are tuning host code, you have to perform all memory allocations, frees, and memcpy operations inside the C host code,
+operation directly from Python. If you are tuning host code, you have the option to perform all memory allocations, frees, and memcpy operations inside the C host code,
 that's the purpose of host code after all. That is also why you have to do the timing yourself in C, as you may not want to include the time spent on memory
 allocations and other setup into your time measurements.
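
The bullet added in this file distinguishes host arrays (NumPy) from device arrays (CuPy). The following is a minimal sketch of how an argument list might be assembled under that rule. `make_arguments`, its parameters, and the `[output, input, n]` layout are hypothetical names chosen for illustration and are not part of Kernel Tuner; CuPy is treated as an optional dependency.

```python
import numpy as np

try:  # CuPy is optional; without it only host arrays are available
    import cupy as cp
except ImportError:
    cp = None


def make_arguments(size, on_device=False):
    """Build [output, input, n] for a hypothetical C host function.

    With on_device=True (and CuPy installed) the arrays live in device
    memory; otherwise they are plain NumPy arrays in host memory.
    """
    host_in = np.arange(size, dtype=np.float32)
    host_out = np.zeros_like(host_in)
    n = np.int32(size)
    if on_device:
        if cp is None:
            raise RuntimeError("CuPy is required for device-memory arguments")
        # cp.asarray copies the host data to the current device
        return [cp.asarray(host_out), cp.asarray(host_in), n]
    return [host_out, host_in, n]


# Host-memory variant: works without a GPU or CuPy
args = make_arguments(8)
print([type(a).__name__ for a in args])
```

The list would then be handed to the tuner in the same order as the parameters of the C function.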

kernel_tuner/backends/compiler.py

Lines changed: 25 additions & 8 deletions
@@ -28,12 +28,29 @@
 
 
 def is_cupy_array(array):
-    """Check if something is a cupy array."""
+    """Check if something is a cupy array.
+
+    :param array: A Python object.
+    :type array: typing.Any
+
+    :returns: True if cupy can be imported and the object is a cupy.ndarray.
+    :rtype: bool
+    """
     return cp is not None and isinstance(array, cp.ndarray)
 
 
 def get_array_module(*args):
-    """Return the array module for arguments."""
+    """Return the array module for arguments.
+
+    This function is used to implement CPU/GPU generic code. If the cupy module can be imported
+    and at least one of the arguments is a cupy.ndarray object, the cupy module is returned.
+
+    :param args: Values to determine whether NumPy or CuPy should be used.
+    :type args: numpy.ndarray or cupy.ndarray
+
+    :returns: cupy or numpy is returned based on the types of the arguments.
+    :rtype: types.ModuleType
+    """
     return np if cp is None else cp.get_array_module(*args)
 
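
Both helpers in this hunk rely on a module-level `cp` that is `None` when CuPy cannot be imported. That import lies outside the hunk, so the `try`/`except` below is an assumption about how it is set up; the two function bodies are copied from the diff. The sketch shows how the helpers allow the same code to run on NumPy or CuPy arrays.

```python
import numpy as np

try:  # assumed import guard; not shown in the diff
    import cupy as cp
except ImportError:
    cp = None


def is_cupy_array(array):
    """Check if something is a cupy array."""
    return cp is not None and isinstance(array, cp.ndarray)


def get_array_module(*args):
    """Return numpy or cupy depending on the argument types."""
    return np if cp is None else cp.get_array_module(*args)


host = np.ones(4, dtype=np.float32)
xp = get_array_module(host)  # numpy here, because host is a NumPy array
total = xp.sum(host)         # same call would work with a cupy array and xp = cupy
```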

@@ -119,8 +136,8 @@ def ready_argument_list(self, arguments):
 
         :param arguments: List of arguments to be passed to the C function.
             The order should match the argument list on the C function.
-            Allowed values are np.ndarray, and/or np.int32, np.float32, and so on.
-        :type arguments: list(numpy objects)
+            Allowed values are np.ndarray, cupy.ndarray, and/or np.int32, np.float32, and so on.
+        :type arguments: list(numpy or cupy objects)
 
         :returns: A list of arguments that can be passed to the C function.
         :rtype: list(Argument)
@@ -352,8 +369,8 @@ def memset(self, allocation, value, size):
     def memcpy_dtoh(self, dest, src):
         """a simple memcpy copying from an Argument to a numpy array
 
-        :param dest: A numpy array to store the data
-        :type dest: np.ndarray
+        :param dest: A numpy or cupy array to store the data
+        :type dest: np.ndarray or cupy.ndarray
 
         :param src: An Argument for some memory allocation
         :type src: Argument
@@ -372,8 +389,8 @@ def memcpy_htod(self, dest, src):
         :param dest: An Argument for some memory allocation
         :type dest: Argument
 
-        :param src: A numpy array containing the source data
-        :type src: np.ndarray
+        :param src: A numpy or cupy array containing the source data
+        :type src: np.ndarray or cupy.ndarray
         """
         if isinstance(dest.numpy, np.ndarray) and is_cupy_array(src):
             # Implicit conversion to a NumPy array is not allowed.
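
The comment in this hunk refers to CuPy refusing implicit conversion to NumPy (for example via `np.asarray` on a CuPy array), which raises a `TypeError`. The rest of the branch is not shown in this diff, so the following is only a sketch of one way to make such a copy explicit. `to_host` is a hypothetical helper and is not the code from `memcpy_htod`.

```python
import numpy as np

try:  # CuPy is optional in this sketch
    import cupy as cp
except ImportError:
    cp = None


def to_host(array):
    """Return a NumPy array, copying from device memory if needed."""
    if cp is not None and isinstance(array, cp.ndarray):
        # cupy.asnumpy performs an explicit device-to-host copy
        return cp.asnumpy(array)
    return np.asarray(array)


# A NumPy input passes through unchanged in content
host = to_host(np.arange(3, dtype=np.int32))
```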
