Commit 8792726

update design documentation
1 parent cc3dc65 commit 8792726

3 files changed: 17 additions, 10 deletions

doc/source/architecture_0.4.3.png

46.5 KB

doc/source/design.png

-41.8 KB
Binary file not shown.

doc/source/design.rst

Lines changed: 17 additions & 10 deletions
@@ -12,7 +12,7 @@ The Kernel Tuner is designed to be extensible and support
 different search and execution strategies. The current architecture of
 the Kernel Tuner can be seen as:
 
-.. image:: design.png
+.. image:: architecture_0.4.3.png
     :width: 500pt
 
 At the top we have the kernel code and the Python script that tunes it,
@@ -33,32 +33,33 @@ the only supported runner, which does exactly what its name says. It compiles
 and benchmarks configurations using a single sequential Python process.
 Other runners are foreseen in future releases.
 
-The runners are implemented on top of a high-level *Device Interface*,
+The runners are implemented on top of the core, which implements a
+high-level *Device Interface*,
 which wraps all the functionality for compiling and benchmarking
 kernel configurations based on the low-level *Device Function Interface*.
 Currently, we have
-four different implementations of the device function interface, which
+five different implementations of the device function interface, which
 basically abstracts the different backends into a set of simple
 functions such as ``ready_argument_list`` which allocates GPU memory and
 moves data to the GPU, and functions like ``compile``, ``benchmark``, or
 ``run_kernel``. The functions in the core are basically the main
 building blocks for implementing runners.
 
-At the bottom, three of the backends are shown.
-PyCUDA and PyOpenCL are for tuning either CUDA or OpenCL kernels.
-A relatively new addition is the Cupy backend based on Cupy for tuning
-CUDA kernels using the NVRTC compiler.
+The observers are explained in :ref:`observers`.
+
+At the bottom, the backends are shown.
+PyCUDA, CuPy, cuda-python and PyOpenCL are for tuning either CUDA or OpenCL kernels.
 The C
 Functions implementation can actually call any compiler, typically NVCC
-or GCC is used. This backend was created not just to be able to tune C
-functions, but mostly to tune C functions that in turn launch GPU kernels.
+or GCC is used. There is limited support for tuning Fortran kernels.
+This backend was created not just to be able to tune C
+functions, but in particular to tune C functions that in turn launch GPU kernels.
 
 The rest of this section contains the API documentation of the modules
 discussed above. For the documentation of the user API see the
 :doc:`user-api`.
 
 
-
 Strategies
 ----------
 
@@ -109,6 +110,12 @@ kernel_tuner.cupy.CupyFunctions
     :special-members: __init__
     :members:
 
+kernel_tuner.nvcuda.CudaFunctions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. autoclass:: kernel_tuner.nvcuda.CudaFunctions
+    :special-members: __init__
+    :members:
+
 kernel_tuner.opencl.OpenCLFunctions
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 .. autoclass:: kernel_tuner.opencl.OpenCLFunctions
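
The changed text in design.rst describes a layering: a sequential runner sits on top of the core, and the core relies on backend functions such as ``ready_argument_list``, ``compile``, ``benchmark`` and ``run_kernel``. The sketch below illustrates that layering only. It is not Kernel Tuner's code: the class names ``DeviceFunctions`` and ``PythonFunctions``, the helper ``sequential_runner``, and every method signature are assumptions made for this example, and the "backend" merely compiles Python source with ``exec`` in place of a GPU compiler.

```python
import time
from abc import ABC, abstractmethod


class DeviceFunctions(ABC):
    """Hypothetical low-level interface; method names follow the text,
    but the signatures are assumed for illustration."""

    @abstractmethod
    def ready_argument_list(self, arguments):
        """Prepare the arguments for the device."""

    @abstractmethod
    def compile(self, kernel_source):
        """Turn kernel source into a callable."""

    @abstractmethod
    def run_kernel(self, func, args):
        """Run the compiled kernel once and return its result."""

    def benchmark(self, func, args, iterations=3):
        """Return the mean wall-clock time of `iterations` runs, in seconds."""
        times = []
        for _ in range(iterations):
            start = time.perf_counter()
            self.run_kernel(func, args)
            times.append(time.perf_counter() - start)
        return sum(times) / len(times)


class PythonFunctions(DeviceFunctions):
    """Toy backend: 'compiles' Python source with exec instead of a GPU compiler."""

    def ready_argument_list(self, arguments):
        # Copy list-like arguments, standing in for a host-to-device transfer.
        return [list(a) if isinstance(a, (list, tuple)) else a for a in arguments]

    def compile(self, kernel_source):
        namespace = {}
        exec(kernel_source, namespace)
        return namespace["kernel"]

    def run_kernel(self, func, args):
        return func(*args)


def sequential_runner(backend, kernel_template, configs, arguments):
    """Compile and benchmark each configuration one after another."""
    results = []
    for params in configs:
        source = kernel_template.format(**params)
        func = backend.compile(source)
        args = backend.ready_argument_list(arguments)
        results.append({**params, "time": backend.benchmark(func, args)})
    return results


if __name__ == "__main__":
    template = "def kernel(x):\n    return [v * {scale} for v in x]\n"
    configs = [{"scale": 1}, {"scale": 2}]
    for row in sequential_runner(PythonFunctions(), template, configs, [[1.0, 2.0, 3.0]]):
        print(row)
```

Because the runner only calls the four interface methods, a different backend could be swapped in without touching ``sequential_runner``, which is the point of the abstraction the documentation describes.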
