==========================
Xe – Merge Acceptance Plan
==========================
Xe is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

This document aims to establish a merge plan for Xe by writing down clear
pre-merge goals, in order to avoid unnecessary delays.

Xe – Overview
=============
The main motivation of Xe is to have a fresh base to work from that is
unencumbered by older platforms, whilst also taking the opportunity to
rearchitect our driver to increase sharing across the drm subsystem, both
leveraging and allowing us to contribute more towards other shared components
like TTM and drm/scheduler.

This is also an opportunity to start from the beginning with a clean uAPI that
is extensible by design and already aligned with modern userspace needs. For
this reason, the memory model is solely based on GPU Virtual Address space
bind/unbind (‘VM_BIND’) of GEM buffer objects (BOs), and execution only supports
explicit synchronization. With mappings persisting across executions, userspace
does not need to provide a list of all required mappings during each submission.

The new driver leverages a lot from i915. As for display, the intent is to share
the display code with the i915 driver so that there is maximum reuse there.

As for the power management area, the goal is to have much-simplified support
for the system suspend states (S-states), PCI device suspend states (D-states),
GPU/Render suspend states (R-states) and frequency management. It should
leverage as much as possible the existing PCI-subsystem infrastructure (pm and
runtime_pm) and underlying firmware components such as PCODE and GuC for the
power state and frequency decisions.

Repository:

https://gitlab.freedesktop.org/drm/xe/kernel (branch drm-xe-next)

Xe – Platforms
==============
Xe is already functional and has experimental support for multiple platforms
starting from Tiger Lake, with initial userspace support implemented in Mesa
(for Iris and Anv, our OpenGL and Vulkan drivers), as well as in NEO (for
OpenCL and Level0).

During a transition period, platforms will be supported by both Xe and i915.
However, the force_probe mechanism existing in both drivers will allow only one
official and by-default probe at a given time.

For instance, in order to probe a DG2 whose PCI ID is 0x5690 with Xe instead of
i915, the following set of parameters needs to be used:

```
i915.force_probe=!5690 xe.force_probe=5690
```
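
For a persistent override, the same module parameters can be expressed in
modprobe option syntax. The sketch below writes to a temporary file purely for
illustration; a real setup would place the two `options` lines in a file under
/etc/modprobe.d/ (the name xe.conf is only an example, not mandated by either
driver).

```shell
# Illustration only: persist the probe override via module options rather
# than the kernel command line. A temp file stands in for /etc/modprobe.d/xe.conf.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
# Hand the DG2 at PCI ID 0x5690 to xe and keep i915 away from it.
options i915 force_probe=!5690
options xe force_probe=5690
EOF
cat "$conf"   # show the resulting configuration
```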

In both drivers, the ‘.require_force_probe’ protection forces the user to use
the force_probe parameter while the driver is under development. This protection
is only removed once support for the platform and its uAPI are stable, and that
stability needs to be demonstrated by CI results.

In order to avoid userspace regressions, i915 will continue to support all the
current platforms that are already out of this protection. Xe support will
remain forever experimental and dependent on the usage of force_probe for these
platforms.

When the time comes for Xe, the protection will be lifted on Xe and kept in i915.

The Xe driver will be protected with both the STAGING Kconfig and force_probe.
Changes in the uAPI are expected while the driver is behind these protections.
STAGING will be removed when the driver uAPI reaches a mature state where we can
guarantee the ‘no regression’ rule. Then force_probe will be lifted, but only
for future platforms that will be productized with the Xe driver and not with
i915.

Xe – Pre-Merge Goals
====================

Drm_scheduler
-------------
Xe primarily uses firmware-based scheduling (GuC FW). It will still use
drm_scheduler as the scheduler ‘frontend’ for userspace submission, in order to
resolve syncobj and dma-buf implicit-sync dependencies. However, drm_scheduler
is not yet prepared to handle the 1-to-1 relationship between drm_gpu_scheduler
and drm_sched_entity.

Deeper changes to drm_scheduler should *not* be required to get Xe accepted, but
some consensus needs to be reached between Xe and the other community drivers
that could also benefit from this work on coupling FW-based/assisted submission,
such as Arm’s new Mali GPU driver, and others.

As a key measurable result, the patch series introducing Xe itself shall not
depend on any patch touching drm_scheduler that has not yet been merged through
drm-misc. This by itself already implies reaching an agreement on a uniform
1-to-1 relationship implementation/usage across drivers.

GPU VA
------
Two of Xe’s main goals meet here:

1) Have a uAPI that aligns with modern UMD needs.

2) Early upstream engagement.

Red Hat engineers working on Nouveau proposed a new DRM feature to handle
keeping track of GPU virtual address mappings. This is still not merged
upstream, but it aligns very well with our goals and with our VM_BIND. The
engagement with upstream and the port of Xe towards GPUVA are already ongoing.

As a key measurable result, Xe needs to be aligned with GPU VA and working in
our tree. Missing Nouveau patches should *not* block Xe, and any needed
GPUVA-related patch should be independent and present on dri-devel or acked by
maintainers to go along with the first Xe pull request towards drm-next.

DRM_VM_BIND
-----------
Nouveau and Xe are both implementing ‘VM_BIND’ and new ‘Exec’ uAPIs in order to
fulfill the needs of the modern uAPI. The Xe merge should *not* be blocked on
the development of a common new DRM infrastructure. However, the Xe team needs
to engage with the community to explore the options for a common API.

As a key measurable result, DRM_VM_BIND needs to be documented in this file
below, or this entire block deleted if the consensus is for independent
per-driver vm_bind ioctls.

Although having a common DRM-level IOCTL for VM_BIND is not a requirement to get
Xe merged, it is mandatory to enforce the overall locking scheme for all major
structs and lists (so vm and vma). So a consensus is needed, and possibly some
common helpers. If helpers are needed, they should also be documented in this
document.

ASYNC VM_BIND
-------------
Although having a common DRM-level IOCTL for VM_BIND is not a requirement to get
Xe merged, it is mandatory to have a consensus with other drivers and Mesa. It
needs to be clear how to handle async VM_BIND and its interactions with
userspace memory fences, ideally with helper support so people don’t get it
wrong in all the possible ways.

As a key measurable result, the benefits of ASYNC VM_BIND and a discussion of
various flavors, error handling and a sample API should be documented here or in
a separate document pointed to by this document.

Userptr integration and vm_bind
-------------------------------
Different drivers implement different ways of dealing with the execution of
userptr. With multiple drivers currently introducing support for VM_BIND, the
goal is to aim for a DRM consensus on the best way to provide that support. To
some extent this is already being addressed by GPUVA itself, where the userptr
will likely be a GPUVA with a NULL GEM and VM_BIND called directly on the
userptr. However, there are more aspects to settle around the rules for that,
the usage of mmu_notifiers, locking, and other details.

The goal of this task is to introduce documentation of the basic rules.

The documentation *needs* to first live in this document (API section below) and
can then be moved to another, more specific document, either at the Xe level or
at the DRM level.

Documentation should include:

 * The userptr part of the VM_BIND api.

 * Locking, including the page-faulting case.

 * O(1) complexity under VM_BIND.

Some parts of userptr, like mmu_notifiers, should become GPUVA or DRM helpers
when a second driver supporting VM_BIND+userptr appears. Details to be defined
when the time comes.

Long running compute: minimal data structure/scaffolding
--------------------------------------------------------
The generic scheduler code needs to include the handling of endless compute
contexts, with the minimal scaffolding for preempt-ctx fences (probably on the
drm_sched_entity) and making sure drm_scheduler can cope with the lack of a job
completion fence.

The goal is to achieve a consensus ahead of the initial Xe pull request, ideally
with this minimal drm/scheduler work, if needed, merged to drm-misc in a way
that any drm driver, including Xe, can reuse it and add its own individual needs
on top in a later stage. However, this should not block the initial merge.

This is a non-blocking item, since shipping the driver without long-running
compute support enabled is not a showstopper.

Display integration with i915
-----------------------------
In order to share the display code with the i915 driver so that there is maximum
reuse, the i915/display/ code is built twice, once for i915.ko and once for
xe.ko. Currently, the i915/display code in the Xe tree is polluted with many
'ifdefs' depending on the build target. The goal is to refactor both the Xe and
i915/display code simultaneously in order to get a clean result before they land
upstream, so that display can already be part of the initial pull request
towards drm-next.

However, display code should not gate the acceptance of Xe upstream. Xe patches
will be refactored in a way that display code can be removed, if needed, from
the first Xe pull request towards drm-next. The expectation is that once both
drivers are part of drm-tip, introducing cleaner patches will become easier and
faster.

Drm_exec
--------
The helper that makes dma_resv locking of a large number of buffers possible is
getting removed in the drm_exec series proposed in
https://patchwork.freedesktop.org/patch/524376/. If that happens, Xe needs to
adapt and incorporate the changes in the driver. The goal is to engage with the
community to understand whether the best approach is to move that functionality
into the drivers that are using it, or to keep the helpers in place until Xe
gets merged.

This item ties into the GPUVA, VM_BIND, and even long-running compute support.

As a key measurable result, we need to have a community consensus documented in
this document and the Xe driver prepared for the changes, if necessary.

Dev_coredump
------------

Xe needs to align with other drivers on the way that error states are dumped,
avoiding a Xe-only error_state solution. The goal is to use the devcoredump
infrastructure to report error states, since it provides a standardized way of
doing so by exposing a virtual and temporary /sys/class/devcoredump device.

As a key measurable result, the Xe driver needs to provide GPU snapshots
captured at hang time through devcoredump, without depending on any core
modification of the devcoredump infrastructure itself.
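
From userspace, retrieving such a snapshot would look roughly like the sketch
below (the devcd* node naming and the write-to-dismiss behavior follow the
existing devcoredump convention; on a healthy system the loop simply finds
nothing).

```shell
# Sketch of retrieving a devcoredump-captured GPU snapshot from userspace.
# Device nodes appear under /sys/class/devcoredump only after a dump was
# captured, so this finds nothing on a healthy system.
found=0
for d in /sys/class/devcoredump/devcd*; do
    [ -e "$d/data" ] || continue                  # glob did not match: skip
    found=$((found + 1))
    cp "$d/data" "snapshot-$(basename "$d").bin"  # save the snapshot
    echo 1 > "$d/data"                            # writing dismisses the dump
done
echo "dumps found: $found"
```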

Later, when we are in-tree, the goal is to contribute to the devcoredump
infrastructure with possible overall improvements, like multiple-file support
for better organization of the dumps, snapshot support, extra dmesg printing,
and whatever else may make sense and help the overall infrastructure.

Xe – uAPI high level overview
=============================

...Warning: To be done in follow-up patches after/when/where the main consensus
on the various items is individually reached.