Commit 9e17c46

add new_op_kernel_en doc (#7681)
* add new_op_kernel_en.md
1 parent fccab36 commit 9e17c46

1 file changed

+121
-0
lines changed

doc/howto/dev/new_op_kernel_en.md

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
## Add Kernels for a New Device

### Background

PaddlePaddle Fluid has hundreds of operators. Each operator can have one or more kernels. A kernel is an implementation of the operator for a certain device, which can be a hardware device, e.g., the CUDA GPU, or a library that utilizes a device, e.g., Intel MKL, which makes full use of the Xeon CPU.

[This document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/new_op_en.md) explains how to add an operator and its kernels. The kernels of an operator are indexed by a C++ type, [`OpKernelType`](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/operator_kernel_type.md). An operator chooses the right kernel at runtime. The selection mechanism is described [here](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/switch_kernel.md).

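To make the idea of "kernels indexed by a type, chosen at runtime" concrete, here is a minimal sketch. It is not the framework's implementation: `KernelKey`, `KernelTable`, `Device`, and `DataType` are hypothetical names invented for this example, and the real `OpKernelType` carries more fields than shown here.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical, simplified stand-ins for the fields of a kernel type.
enum class Device { kCPU, kCUDA };
enum class DataType { kFP32, kFP64 };

// Hypothetical key that identifies one kernel of an operator.
struct KernelKey {
  Device device;
  DataType data_type;
  // Ordering so the key can be used in std::map.
  bool operator<(const KernelKey& o) const {
    return std::tie(device, data_type) < std::tie(o.device, o.data_type);
  }
};

// A kernel is modeled as a callable that returns its own name.
using KernelFn = std::function<std::string()>;

class KernelTable {
 public:
  void Add(const KernelKey& key, KernelFn fn) {
    kernels_[key] = std::move(fn);
  }

  // Runtime selection: return the kernel registered for the given key.
  const KernelFn& Choose(const KernelKey& key) const {
    auto it = kernels_.find(key);
    if (it == kernels_.end()) {
      throw std::out_of_range("no kernel registered for this key");
    }
    return it->second;
  }

 private:
  std::map<KernelKey, KernelFn> kernels_;
};
```

An operator would own one such table and call `Choose` with a key built from the place and data type it is running with.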
### Write Kernels for a New Device

#### Add a New Device

For historical reasons, we misuse the word *library* for *device*. For example, we refer to the device type as the *library type*, as in the header file [`library_type.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/library_type.h#L24). We will correct this ASAP.

To register a new device, we need to add an enum value to `LibraryType`:

```cpp
enum class LibraryType {
  kPlain = 0,
  kMKLDNN = 1,
  kCUDNN = 2,
};
```

26+
#### Add A New [Place](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/place.h#L53)
27+
28+
If you have a new kind of Device, firstly you need to add a new kind of [`Place`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/place.h#L53). For example `CUDAPlace`:
29+
30+
```cpp
31+
struct CUDAPlace {
32+
CUDAPlace() : CUDAPlace(0) {}
33+
explicit CUDAPlace(int d) : device(d) {}
34+
35+
inline int GetDeviceId() const { return device; }
36+
// needed for variant equality comparison
37+
inline bool operator==(const CUDAPlace &o) const {
38+
return device == o.device;
39+
}
40+
inline bool operator!=(const CUDAPlace &o) const { return !(*this == o); }
41+
42+
int device;
43+
};
44+
45+
typedef boost::variant<CUDAPlace, CPUPlace> Place;
46+
```
47+
48+
#### Add [device context]((https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/device_context.h#L37))
49+
After a new kind of Device is added, you should add a corresponding [DeviceContext](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/device_context.h#L37) for it.
50+
51+
```cpp
52+
class DeviceContext {
53+
public:
54+
virtual ~DeviceContext() {}
55+
virtual Place GetPlace() const = 0;
56+
57+
virtual void Wait() const {}
58+
};
59+
```
60+
61+
#### Implement new [OpKernel](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/operator.h#L351) for your Device.
62+
63+
A detailed documentation can be found in [`new_op_and_kernel`](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/new_op_en.md)
64+
65+
```cpp
66+
class OpKernelBase {
67+
public:
68+
/**
69+
* ExecutionContext is the only parameter of Kernel Run function.
70+
* Run will get input/output variables, state such as momentum and
71+
* device resource such as CUDA stream, cublas handle, etc. from
72+
* ExecutionContext. User should construct it before run the Operator.
73+
*/
74+
75+
virtual void Compute(const ExecutionContext& context) const = 0;
76+
77+
virtual ~OpKernelBase() = default;
78+
};
79+
80+
template <typename T>
81+
class OpKernel : public OpKernelBase {
82+
public:
83+
using ELEMENT_TYPE = T;
84+
};
85+
```
86+
87+
88+
#### Register the OpKernel to framework
89+
90+
After writing the components described above, we should register the kernel to the framework.
91+
92+
We use `REGISTER_OP_KERNEL` to do the registration.
93+
94+
```cpp
95+
REGISTER_OP_KERNEL(
96+
op_type,
97+
library_type,
98+
place_type,
99+
kernel0, kernel1, ...)
100+
```
101+
102+
kernel0, kernel1 are kernels that have the same `op_type`, `library_type`, `place_type` but different `data_types`.
103+
104+
take [`conv2d`]((https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/conv_cudnn_op.cu.cc#L318)) as an example:
105+
106+
```cpp
107+
REGISTER_OP_KERNEL(conv2d, CPU, paddle::platform::CPUPlace,
108+
paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>,
109+
paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, double>);
110+
111+
REGISTER_OP_KERNEL(conv2d, CUDNN, ::paddle::platform::CUDAPlace,
112+
paddle::operators::CUDNNConvOpKernel<float>,
113+
paddle::operators::CUDNNConvOpKernel<double>);
114+
```
115+
116+
In the code above:
117+
118+
- `conv2d` is the type/name of the operator
119+
- `CUDNN/CPU` is `library`
120+
- `paddle::platform::CUDAPlace/CPUPlace` is `place`
121+
- template parameter `float/double` on `CUDNNConvOpKernel<T>` is `data_type`.
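
A registration macro of this shape is commonly built on a static registrar object whose constructor runs before `main`. The sketch below illustrates that general technique only; it is not the definition of `REGISTER_OP_KERNEL`, and every name in it (`KernelRegistry`, `SketchKernelRegistrar`, `SKETCH_REGISTER_OP_KERNEL`, the demo types) is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical global registry: maps "op/library" to the number of kernels
// registered under it. A function-local static avoids initialization-order
// problems between translation units.
inline std::map<std::string, std::size_t>& KernelRegistry() {
  static std::map<std::string, std::size_t> registry;
  return registry;
}

// Hypothetical registrar: its constructor records one entry per kernel type.
template <typename PlaceT, typename... KernelTs>
struct SketchKernelRegistrar {
  SketchKernelRegistrar(const char* op, const char* library) {
    KernelRegistry()[std::string(op) + "/" + library] += sizeof...(KernelTs);
  }
};

// Hypothetical macro: defines a static registrar whose constructor performs
// the registration during static initialization.
#define SKETCH_REGISTER_OP_KERNEL(op, library, place, ...) \
  static SketchKernelRegistrar<place, __VA_ARGS__>         \
      sketch_registrar_##op##_##library(#op, #library)

// Demo types standing in for a place and two kernels of different data types.
struct DemoCPUPlace {};
struct DemoKernelFloat {};
struct DemoKernelDouble {};

SKETCH_REGISTER_OP_KERNEL(demo_op, CPU, DemoCPUPlace, DemoKernelFloat,
                          DemoKernelDouble);
```

Because the registrar is a static object, linking the translation unit into the program is enough to make its kernels visible to the registry.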
