Commit 795bd9b

Merge tag 'drm-accel-2022-11-22' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel into drm-next
This tag contains the patches that add the new compute acceleration
subsystem, which is part of the DRM subsystem.

The patches:
- Add a new directory at drivers/accel.
- Add a new major (261) for compute accelerators.
- Add a new DRM minor type for compute accelerators.
- Integrate the accel core code with DRM core code.
- Add documentation for the accel subsystem.

Signed-off-by: Dave Airlie <[email protected]>

some acks from the list (some are in the patch series):
Acked-by: Daniel Stone <[email protected]>
Acked-by: Sonal Santan <[email protected]>
Acked-by: Maxime Ripard <[email protected]>
Acked-by: Jacek Lawrynowicz <[email protected]>
Tested-by: Jacek Lawrynowicz <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Acked-by: Thomas Zimmermann <[email protected]>

From: Oded Gabbay <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2 parents 2847b66 + 8c5577a commit 795bd9b

16 files changed

+711
-37
lines changed

Documentation/accel/index.rst

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
====================
4+
Compute Accelerators
5+
====================
6+
7+
.. toctree::
8+
:maxdepth: 1
9+
10+
introduction
11+
12+
.. only:: subproject and html
13+
14+
Indices
15+
=======
16+
17+
* :ref:`genindex`

Documentation/accel/introduction.rst

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
============
4+
Introduction
5+
============
6+
7+
The Linux compute accelerators subsystem is designed to expose compute
8+
accelerators in a common way to user-space and provide a common set of
9+
functionality.
10+
11+
These devices can be either stand-alone ASICs or IP blocks inside an SoC/GPU.
12+
Although these devices are typically designed to accelerate
13+
Machine-Learning (ML) and/or Deep-Learning (DL) computations, the accel layer
14+
is not limited to handling these types of accelerators.
15+
16+
Typically, a compute accelerator will belong to one of the following
17+
categories:
18+
19+
- Edge AI - doing inference at an edge device. It can be an embedded ASIC/FPGA,
20+
or an IP inside a SoC (e.g. laptop web camera). These devices
21+
are typically configured using registers and can work with or without DMA.
22+
23+
- Inference data-center - single/multi user devices in a large server. This
24+
type of device can be stand-alone or an IP inside a SoC or a GPU. It will
25+
have on-board DRAM (to hold the DL topology), DMA engines and
26+
command submission queues (either kernel or user-space queues).
27+
It might also have an MMU to manage multiple users and might also enable
28+
virtualization (SR-IOV) to support multiple VMs on the same device. In
29+
addition, these devices will usually have some tools, such as profiler and
30+
debugger.
31+
32+
- Training data-center - Similar to Inference data-center cards, but typically
33+
have more computational power and memory bandwidth (e.g. HBM) and will likely have
34+
a method of scaling-up/out, i.e. connecting to other training cards inside
35+
the server or in other servers, respectively.
36+
37+
All these devices typically have different runtime user-space software stacks,
38+
that are tailor-made to their hardware. In addition, they will also probably
39+
include a compiler to generate programs for their custom-made computational
40+
engines. Typically, the common layer in user-space will be the DL frameworks,
41+
such as PyTorch and TensorFlow.
42+
43+
Sharing code with DRM
44+
=====================
45+
46+
Because this type of device can be an IP inside GPUs or have similar
47+
characteristics as those of GPUs, the accel subsystem will use the
48+
DRM subsystem's code and functionality, i.e. the accel core code will
49+
be part of the DRM subsystem and an accel device will be a new type of DRM
50+
device.
51+
52+
This will allow us to leverage the extensive DRM code-base and
53+
collaborate with DRM developers that have experience with these types of
54+
devices. In addition, new features that will be added for the accelerator
55+
drivers can be of use to GPU drivers as well.
56+
57+
Differentiation from GPUs
58+
=========================
59+
60+
Because we want to prevent the extensive user-space graphic software stack
61+
from trying to use an accelerator as a GPU, the compute accelerators will be
62+
differentiated from GPUs by using a new major number and new device char files.
63+
64+
Furthermore, the drivers will be located in a separate place in the kernel
65+
tree - drivers/accel/.
66+
67+
The accelerator devices will be exposed to the user space with the dedicated
68+
261 major number and will have the following convention:
69+
70+
- device char files - /dev/accel/accel*
71+
- sysfs - /sys/class/accel/accel*/
72+
- debugfs - /sys/kernel/debug/accel/accel*/
73+
74+
Getting Started
75+
===============
76+
77+
First, read the DRM documentation at Documentation/gpu/index.rst.
78+
Not only will it explain how to write a new DRM driver, but it will also
79+
contain all the information on how to contribute, the Code Of Conduct and
80+
the coding style and documentation conventions. All of that is the same for the
81+
accel subsystem.
82+
83+
Second, make sure the kernel is configured with CONFIG_DRM_ACCEL.
84+
85+
To expose your device as an accelerator, two changes need to
86+
be made in your driver (as opposed to a standard DRM driver):
87+
88+
- Add the DRIVER_COMPUTE_ACCEL feature flag in your drm_driver's
89+
driver_features field. It is important to note that this driver feature is
90+
mutually exclusive with DRIVER_RENDER and DRIVER_MODESET. Devices that want
91+
to expose both graphics and compute device char files should be handled by
92+
two drivers that are connected using the auxiliary bus framework.
93+
94+
- Change the open callback in your driver fops structure to accel_open().
95+
Alternatively, your driver can use the DEFINE_DRM_ACCEL_FOPS macro to easily
96+
set up the correct file operations structure.
97+
98+
External References
99+
===================
100+
101+
email threads
102+
-------------
103+
104+
* `Initial discussion on the New subsystem for acceleration devices <https://lkml.org/lkml/2022/7/31/83>`_ - Oded Gabbay (2022)
105+
* `patch-set to add the new subsystem <https://lkml.org/lkml/2022/10/22/544>`_ - Oded Gabbay (2022)
106+
107+
Conference talks
108+
----------------
109+
110+
* `LPC 2022 Accelerators BOF outcomes summary <https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html>`_ - Dave Airlie (2022)
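
The feature-flag rule stated in Documentation/accel/introduction.rst above (DRIVER_COMPUTE_ACCEL is mutually exclusive with DRIVER_RENDER and DRIVER_MODESET) can be modeled in plain user-space C. This is only a sketch of the rule, not the DRM core's code: the `sketch_` names and the bit positions are assumptions made up for illustration and do not match the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative stand-ins for the drm_driver feature flags named in the
 * documentation. The bit positions are assumptions for this sketch and
 * are NOT the kernel's values.
 */
enum sketch_driver_feature {
	SKETCH_DRIVER_MODESET       = 1u << 0,
	SKETCH_DRIVER_RENDER        = 1u << 1,
	SKETCH_DRIVER_COMPUTE_ACCEL = 1u << 2,
};

/*
 * Returns true when the flag combination respects the documented rule:
 * a compute accelerator driver may not also be a render or modeset driver.
 */
static bool sketch_features_valid(uint32_t features)
{
	bool accel = features & SKETCH_DRIVER_COMPUTE_ACCEL;
	bool graphics = features &
			(SKETCH_DRIVER_RENDER | SKETCH_DRIVER_MODESET);

	return !(accel && graphics);
}
```

With this model, `sketch_features_valid(SKETCH_DRIVER_COMPUTE_ACCEL)` is accepted, while combining the accel flag with either graphics flag is rejected. That rejected case is the one where the documentation recommends two drivers connected through the auxiliary bus.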

Documentation/admin-guide/devices.txt

Lines changed: 5 additions & 0 deletions
@@ -3080,6 +3080,11 @@
30803080
...
30813081
255 = /dev/osd255 256th OSD Device
30823082

3083+
261 char Compute Acceleration Devices
3084+
0 = /dev/accel/accel0 First acceleration device
3085+
1 = /dev/accel/accel1 Second acceleration device
3086+
...
3087+
30833088
384-511 char RESERVED FOR DYNAMIC ASSIGNMENT
30843089
Character devices that request a dynamic allocation of major
30853090
number will take numbers starting from 511 and downward,
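
The devices.txt entry above reserves character major 261 and the `/dev/accel/accelN` naming. A user-space tool could use both to recognize an accelerator node. The following is a minimal sketch assuming a Linux/glibc environment; the helper names (`accel_node_path`, `is_accel_rdev`, `is_accel_node`) are hypothetical and not part of any kernel or library API.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Major number reserved for compute accelerators in devices.txt. */
#define ACCEL_MAJOR 261

/* Build "/dev/accel/accel<idx>" into buf; returns snprintf()'s result. */
static int accel_node_path(char *buf, size_t len, unsigned int idx)
{
	return snprintf(buf, len, "/dev/accel/accel%u", idx);
}

/* True if the device number carries the accel major. */
static bool is_accel_rdev(dev_t rdev)
{
	return major(rdev) == ACCEL_MAJOR;
}

/* True if path exists, is a character device, and has the accel major. */
static bool is_accel_node(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;
	return S_ISCHR(st.st_mode) && is_accel_rdev(st.st_rdev);
}
```

A caller would typically build the path for index 0, 1, and so on with `accel_node_path()` and pass it to `is_accel_node()` before opening the device. Checking the major number rather than only the file name guards against unrelated files that happen to sit under `/dev/accel/`.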

Documentation/subsystem-apis.rst

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ needed).
4343
input/index
4444
hwmon/index
4545
gpu/index
46+
accel/index
4647
security/index
4748
sound/index
4849
crypto/index

MAINTAINERS

Lines changed: 9 additions & 0 deletions
@@ -6836,6 +6836,15 @@ F: include/drm/drm*
68366836
F: include/linux/vga*
68376837
F: include/uapi/drm/drm*
68386838

6839+
DRM COMPUTE ACCELERATORS DRIVERS AND FRAMEWORK
6840+
M: Oded Gabbay <[email protected]>
6841+
6842+
S: Maintained
6843+
C: irc://irc.oftc.net/dri-devel
6844+
T: git https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git
6845+
F: Documentation/accel/
6846+
F: drivers/accel/
6847+
68396848
DRM DRIVERS FOR ALLWINNER A10
68406849
M: Maxime Ripard <[email protected]>
68416850
M: Chen-Yu Tsai <[email protected]>

drivers/Kconfig

Lines changed: 2 additions & 0 deletions
@@ -99,6 +99,8 @@ source "drivers/media/Kconfig"
9999

100100
source "drivers/video/Kconfig"
101101

102+
source "drivers/accel/Kconfig"
103+
102104
source "sound/Kconfig"
103105

104106
source "drivers/hid/Kconfig"

drivers/accel/Kconfig

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
# SPDX-License-Identifier: GPL-2.0-only
2+
#
3+
# Compute Acceleration device configuration
4+
#
5+
# This framework provides support for compute acceleration devices, such
6+
# as, but not limited to, Machine-Learning and Deep-Learning acceleration
7+
# devices
8+
#
9+
menuconfig DRM_ACCEL
10+
bool "Compute Acceleration Framework"
11+
depends on DRM
12+
help
13+
Framework for device drivers of compute acceleration devices, such
14+
as, but not limited to, Machine-Learning and Deep-Learning
15+
acceleration devices.
16+
If you say Y here, you need to select the module that's right for
17+
your acceleration device from the list below.
18+
This framework is integrated with the DRM subsystem as compute
19+
accelerators and GPUs share a lot in common and can use almost the
20+
same infrastructure code.
21+
Having said that, acceleration devices will have a different
22+
major number than GPUs, and will be exposed to user-space using
23+
different device files, called accel/accel* (in /dev, sysfs
24+
and debugfs).
