Commit ccd1950

Merge tag 'drm-intel-gt-next-2021-05-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:
- Add reworked uAPI for DG1 behind CONFIG_BROKEN (Matt A, Abdiel)

Driver Changes:
- Fix for Gitlab issues #3293 and #3450: Avoid kernel crash on older L-shape memory machines
- Add Wa_14010733141 (VDBox SFC reset) for Gen11+ (Aditya)
- Fix crash in auto_retire active retire callback due to misalignment (Stephane)
- Fix overlay active retire callback alignment (Tvrtko)
- Eliminate need to align active retire callbacks (Matt A, Ville, Daniel)
- Program FF_MODE2 tuning value for all Gen12 platforms (Caz)
- Add Wa_14011060649 for TGL,RKL,DG1 and ADLS (Swathi)
- Create stolen memory region from local memory on DG1 (CQ)
- Place PD in LMEM on dGFX (Matt A)
- Use WC when default state object is allocated in LMEM (Venkata)
- Determine the coherent map type based on object location (Venkata)
- Use lmem physical addresses for fb_mmap() on discrete (Mohammed)
- Bypass aperture on fbdev when LMEM is available (Anusha)
- Return error value when displayable BO not in LMEM for dGFX (Mohammed)
- Do release kernel context if breadcrumb measure fails (Janusz)
- Hide modparams for compiled-out features (Tvrtko)
- Apply Wa_22010271021 for all Gen11 platforms (Caz)
- Fix unlikely ref count race in arming the watchdog timer (Tvrtko)
- Check actual RC6 enable status in PMU (Tvrtko)
- Fix a double free in gen8_preallocate_top_level_pdp (Lv)
- Use trylock in shrinker for GGTT on BSW VT-d and BXT (Maarten)
- Remove erroneous i915_is_ggtt check for I915_GEM_OBJECT_UNBIND_VM_TRYLOCK (Maarten)
- Convert uAPI headers to real kerneldoc (Matt A)
- Clean up kerneldoc warnings headers (Matt A, Maarten)
- Fail driver if LMEM training failed (Matt R)
- Avoid div-by-zero on Gen2 (Ville)
- Read C0DRB3/C1DRB3 as 16 bits again and add _BW suffix (Ville)
- Remove reference to struct drm_device.pdev (Thomas)
- Increase separation between GuC and execlists code (Chris, Matt B)
- Use might_alloc() (Bernard)
- Split DGFX_FEATURES from GEN12_FEATURES (Lucas)
- Deduplicate Wa_22010271021 programming on (Jose)
- Drop duplicate WaDisable4x2SubspanOptimization:hsw (Tvrtko)
- Selftest improvements (Chris, Hsin-Yi, Tvrtko)
- Shuffle around init_memory_region for stolen (Matt)
- Typo fixes (wengjianfeng)

[airlied: fix conflict with fixes in i915_active.c]
Signed-off-by: Dave Airlie <[email protected]>
From: Joonas Lahtinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/YLCbBR22BsQ/[email protected]

2 parents 43ed3c6 + 5b26d57 commit ccd1950

90 files changed

+2007
-568
lines changed


Documentation/gpu/driver-uapi.rst

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+===============
+DRM Driver uAPI
+===============
+
+drm/i915 uAPI
+=============
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h

Documentation/gpu/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ Linux GPU Driver Developer's Guide
    drm-kms
    drm-kms-helpers
    drm-uapi
+   driver-uapi
    drm-client
    drivers
    backlight

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
+=========================
+I915 DG1/LMEM RFC Section
+=========================
+
+Upstream plan
+=============
+For upstream the overall plan for landing all the DG1 stuff and turning it for
+real, with all the uAPI bits is:
+
+* Merge basic HW enabling of DG1(still without pciid)
+* Merge the uAPI bits behind special CONFIG_BROKEN(or so) flag
+        * At this point we can still make changes, but importantly this lets us
+          start running IGTs which can utilize local-memory in CI
+* Convert over to TTM, make sure it all keeps working. Some of the work items:
+        * TTM shrinker for discrete
+        * dma_resv_lockitem for full dma_resv_lock, i.e not just trylock
+        * Use TTM CPU pagefault handler
+        * Route shmem backend over to TTM SYSTEM for discrete
+        * TTM purgeable object support
+        * Move i915 buddy allocator over to TTM
+        * MMAP ioctl mode(see `I915 MMAP`_)
+        * SET/GET ioctl caching(see `I915 SET/GET CACHING`_)
+* Send RFC(with mesa-dev on cc) for final sign off on the uAPI
+* Add pciid for DG1 and turn on uAPI for real
+
+New object placement and region query uAPI
+==========================================
+Starting from DG1 we need to give userspace the ability to allocate buffers from
+device local-memory. Currently the driver supports gem_create, which can place
+buffers in system memory via shmem, and the usual assortment of other
+interfaces, like dumb buffers and userptr.
+
+To support this new capability, while also providing a uAPI which will work
+beyond just DG1, we propose to offer three new bits of uAPI:
+
+DRM_I915_QUERY_MEMORY_REGIONS
+-----------------------------
+New query ID which allows userspace to discover the list of supported memory
+regions(like system-memory and local-memory) for a given device. We identify
+each region with a class and instance pair, which should be unique. The class
+here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
+like DG1.
+
+Side note: The class/instance design is borrowed from our existing engine uAPI,
+where we describe every physical engine in terms of its class, and the
+particular instance, since we can have more than one per class.
+
+In the future we also want to expose more information which can further
+describe the capabilities of a region.
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
+
+GEM_CREATE_EXT
+--------------
+New ioctl which is basically just gem_create but now allows userspace to provide
+a chain of possible extensions. Note that if we don't provide any extensions and
+set flags=0 then we get the exact same behaviour as gem_create.
+
+Side note: We also need to support PXP[1] in the near future, which is also
+applicable to integrated platforms, and adds its own gem_create_ext extension,
+which basically lets userspace mark a buffer as "protected".
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+        :functions: drm_i915_gem_create_ext
+
+I915_GEM_CREATE_EXT_MEMORY_REGIONS
+----------------------------------
+Implemented as an extension for gem_create_ext, we would now allow userspace to
+optionally provide an immutable list of preferred placements at creation time,
+in priority order, for a given buffer object. For the placements we expect
+them each to use the class/instance encoding, as per the output of the regions
+query. Having the list in priority order will be useful in the future when
+placing an object, say during eviction.
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+        :functions: drm_i915_gem_create_ext_memory_regions
+
+One fair criticism here is that this seems a little over-engineered[2]. If we
+just consider DG1 then yes, a simple gem_create.flags or something is totally
+all that's needed to tell the kernel to allocate the buffer in local-memory or
+whatever. However looking to the future we need uAPI which can also support
+upcoming Xe HP multi-tile architecture in a sane way, where there can be
+multiple local-memory instances for a given device, and so using both class and
+instance in our uAPI to describe regions is desirable, although specifically
+for DG1 it's uninteresting, since we only have a single local-memory instance.
+
+Existing uAPI issues
+====================
+Some potential issues we still need to resolve.
+
+I915 MMAP
+---------
+In i915 there are multiple ways to MMAP GEM object, including mapping the same
+object using different mapping types(WC vs WB), i.e multiple active mmaps per
+object. TTM expects one MMAP at most for the lifetime of the object. If it
+turns out that we have to backpedal here, there might be some potential
+userspace fallout.
+
+I915 SET/GET CACHING
+--------------------
+In i915 we have set/get_caching ioctl. TTM doesn't let us to change this, but
+DG1 doesn't support non-snooped pcie transactions, so we can just always
+allocate as WB for smem-only buffers.  If/when our hw gains support for
+non-snooped pcie transactions then we must fix this mode at allocation time as
+a new GEM extension.
+
+This is related to the mmap problem, because in general (meaning, when we're
+not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
+allocation mode.
+
+Possible idea is to let the kernel picks the mmap mode for userspace from the
+following table:
+
+smem-only: WB. Userspace does not need to call clflush.
+
+smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
+memory, and always give userspace a WC mapping. GPU still does snooped access
+here(assuming we can't turn it off like on DG1), which is a bit inefficient.
+
+lmem only: always WC
+
+This means on discrete you only get a single mmap mode, all others must be
+rejected. That's probably going to be a new default mode or something like
+that.
+
+Links
+=====
+[1] https://patchwork.freedesktop.org/series/86798/
+
+[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
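
Note on the RFC text above: the mmap-mode table under "I915 SET/GET CACHING" can be read as a small decision function. The following is a minimal stand-alone sketch of that reading, not driver code. Every name in it (`sketch_pick_map_mode`, `sketch_check_map_request`, `SKETCH_MAP_*`) is hypothetical and exists neither in i915 nor in the uAPI header. Returning `-EINVAL` for a rejected mode is also an assumption; the RFC only says that other modes "must be rejected".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mapping modes, for illustration only. */
enum sketch_map_mode { SKETCH_MAP_WB = 0, SKETCH_MAP_WC = 1 };

/*
 * Pick the single CPU mapping mode from the allowed placements,
 * following the table in the RFC text:
 *   smem-only -> WB, smem+lmem -> WC, lmem-only -> WC.
 * Returns a SKETCH_MAP_* value, or -EINVAL if no placement is allowed.
 */
static int sketch_pick_map_mode(bool can_smem, bool can_lmem)
{
	if (!can_smem && !can_lmem)
		return -EINVAL;

	return can_lmem ? SKETCH_MAP_WC : SKETCH_MAP_WB;
}

/*
 * Accept a requested mode only if it matches the mode picked above.
 * The -EINVAL error code is an assumption of this sketch.
 */
static int sketch_check_map_request(bool can_smem, bool can_lmem,
				    int requested)
{
	int picked = sketch_pick_map_mode(can_smem, can_lmem);

	if (picked < 0)
		return picked;

	return requested == picked ? 0 : -EINVAL;
}
```

Under this reading, an object that may live in local memory only ever gets a WC mapping, and a WB request for it is refused rather than silently downgraded.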

Documentation/gpu/rfc/index.rst

Lines changed: 4 additions & 0 deletions
@@ -15,3 +15,7 @@ host such documentation:
 
 * Once the code has landed move all the documentation to the right places in
   the main core, helper or driver sections.
+
+.. toctree::
+
+    i915_gem_lmem.rst

drivers/gpu/drm/i915/display/intel_display.c

Lines changed: 9 additions & 0 deletions
@@ -11660,11 +11660,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
 	struct drm_framebuffer *fb;
 	struct drm_i915_gem_object *obj;
 	struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
+	struct drm_i915_private *i915;
 
 	obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
 	if (!obj)
 		return ERR_PTR(-ENOENT);
 
+	/* object is backed with LMEM for discrete */
+	i915 = to_i915(obj->base.dev);
+	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
+		/* object is "remote", not in local memory */
+		i915_gem_object_put(obj);
+		return ERR_PTR(-EREMOTE);
+	}
+
 	fb = intel_framebuffer_create(obj, &mode_cmd);
 	i915_gem_object_put(obj);
 
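
Note on the hunk above: the new check rejects a framebuffer object only when the device has local memory and the object is not placed there. A minimal stand-alone model of that condition is sketched below. The function name is hypothetical, and the two booleans merely stand in for `HAS_LMEM(i915)` and `i915_gem_object_is_lmem(obj)`; the real code also drops the object reference before returning.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the placement check added to intel_user_framebuffer_create().
 * Hypothetical helper: the booleans replace the driver's HAS_LMEM() and
 * i915_gem_object_is_lmem() queries.
 */
static int sketch_fb_placement_check(bool device_has_lmem, bool obj_in_lmem)
{
	/* Discrete device, object outside local memory: treated as "remote". */
	if (device_has_lmem && !obj_in_lmem)
		return -EREMOTE;

	/* No local memory on the device, or the object already lives in it. */
	return 0;
}
```

On devices without local memory the check never fires, so existing integrated platforms keep their current behaviour.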

drivers/gpu/drm/i915/display/intel_fbdev.c

Lines changed: 38 additions & 13 deletions
@@ -41,6 +41,8 @@
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fourcc.h>
 
+#include "gem/i915_gem_lmem.h"
+
 #include "i915_drv.h"
 #include "intel_display_types.h"
 #include "intel_fbdev.h"
@@ -137,14 +139,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	size = mode_cmd.pitches[0] * mode_cmd.height;
 	size = PAGE_ALIGN(size);
 
-	/* If the FB is too big, just don't use it since fbdev is not very
-	 * important and we should probably use that space with FBC or other
-	 * features. */
 	obj = ERR_PTR(-ENODEV);
-	if (size * 2 < dev_priv->stolen_usable_size)
-		obj = i915_gem_object_create_stolen(dev_priv, size);
-	if (IS_ERR(obj))
-		obj = i915_gem_object_create_shmem(dev_priv, size);
+	if (HAS_LMEM(dev_priv)) {
+		obj = i915_gem_object_create_lmem(dev_priv, size,
+						  I915_BO_ALLOC_CONTIGUOUS);
+	} else {
+		/*
+		 * If the FB is too big, just don't use it since fbdev is not very
+		 * important and we should probably use that space with FBC or other
+		 * features.
+		 */
+		if (size * 2 < dev_priv->stolen_usable_size)
+			obj = i915_gem_object_create_stolen(dev_priv, size);
+		if (IS_ERR(obj))
+			obj = i915_gem_object_create_shmem(dev_priv, size);
+	}
+
 	if (IS_ERR(obj)) {
 		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
 		return PTR_ERR(obj);
@@ -178,6 +188,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	unsigned long flags = 0;
 	bool prealloc = false;
 	void __iomem *vaddr;
+	struct drm_i915_gem_object *obj;
 	int ret;
 
 	if (intel_fb &&
@@ -232,13 +243,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->fbops = &intelfb_ops;
 
 	/* setup aperture base/size for vesafb takeover */
-	info->apertures->ranges[0].base = ggtt->gmadr.start;
-	info->apertures->ranges[0].size = ggtt->mappable_end;
+	obj = intel_fb_obj(&intel_fb->base);
+	if (i915_gem_object_is_lmem(obj)) {
+		struct intel_memory_region *mem = obj->mm.region;
+
+		info->apertures->ranges[0].base = mem->io_start;
+		info->apertures->ranges[0].size = mem->total;
+
+		/* Use fbdev's framebuffer from lmem for discrete */
+		info->fix.smem_start =
+			(unsigned long)(mem->io_start +
+					i915_gem_object_get_dma_address(obj, 0));
+		info->fix.smem_len = obj->base.size;
+	} else {
+		info->apertures->ranges[0].base = ggtt->gmadr.start;
+		info->apertures->ranges[0].size = ggtt->mappable_end;
 
-	/* Our framebuffer is the entirety of fbdev's system memory */
-	info->fix.smem_start =
-		(unsigned long)(ggtt->gmadr.start + vma->node.start);
-	info->fix.smem_len = vma->node.size;
+		/* Our framebuffer is the entirety of fbdev's system memory */
+		info->fix.smem_start =
+			(unsigned long)(ggtt->gmadr.start + vma->node.start);
+		info->fix.smem_len = vma->node.size;
+	}
 
 	vaddr = i915_vma_pin_iomap(vma);
 	if (IS_ERR(vaddr)) {
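
Note on the `intelfb_alloc()` hunk above: the backing store for the fbdev framebuffer is now chosen in a fixed order. The sketch below models only that selection order as a pure function. All names are hypothetical, and `stolen_alloc_ok` is an assumed stand-in for whether `i915_gem_object_create_stolen()` succeeded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three backing stores seen in the hunk. */
enum sketch_backing { SKETCH_LMEM, SKETCH_STOLEN, SKETCH_SHMEM };

/*
 * Model of the selection order in the new intelfb_alloc():
 *  - with local memory, allocate from LMEM and nothing else;
 *  - otherwise use stolen memory only if the buffer needs less than half
 *    of the usable stolen size and the stolen allocation succeeds;
 *  - otherwise fall back to shmem.
 */
static enum sketch_backing
sketch_pick_fbdev_backing(bool has_lmem, size_t size,
			  size_t stolen_usable_size, bool stolen_alloc_ok)
{
	if (has_lmem)
		return SKETCH_LMEM;

	if (size * 2 < stolen_usable_size && stolen_alloc_ok)
		return SKETCH_STOLEN;

	return SKETCH_SHMEM;
}
```

One consequence visible in the diff is that the LMEM branch has no fallback: if `i915_gem_object_create_lmem()` fails, the function reports the error instead of trying stolen or shmem.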

drivers/gpu/drm/i915/display/intel_frontbuffer.c

Lines changed: 2 additions & 2 deletions
@@ -211,7 +211,6 @@ static int frontbuffer_active(struct i915_active *ref)
 	return 0;
 }
 
-__i915_active_call
 static void frontbuffer_retire(struct i915_active *ref)
 {
 	struct intel_frontbuffer *front =
@@ -266,7 +265,8 @@ intel_frontbuffer_get(struct drm_i915_gem_object *obj)
 	atomic_set(&front->bits, 0);
 	i915_active_init(&front->write,
 			 frontbuffer_active,
-			 i915_active_may_sleep(frontbuffer_retire));
+			 frontbuffer_retire,
+			 I915_ACTIVE_RETIRE_SLEEPS);
 
 	spin_lock(&i915->fb_tracking.lock);
 	if (rcu_access_pointer(obj->frontbuffer)) {
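
Note on the hunk above: `i915_active_init()` now takes the retire callback and its flags as separate arguments, and the `__i915_active_call` annotation is dropped from the callback. The commit summary describes this as eliminating the need to align active retire callbacks, which suggests the "may sleep" property previously travelled together with the callback pointer. The sketch below shows only the general shape of the new calling convention with simplified, hypothetical types; it is not the layout of `struct i915_active`.

```c
#include <stddef.h>

/* Hypothetical flag and struct, standing in for the driver's own. */
#define SKETCH_RETIRE_SLEEPS (1UL << 0)

struct sketch_active {
	int (*active)(struct sketch_active *ref);
	void (*retire)(struct sketch_active *ref);
	unsigned long flags;
	int retired;
};

/*
 * Flags are passed and stored on their own, so the retire callback
 * pointer is kept exactly as given and needs no special alignment.
 */
static void sketch_active_init(struct sketch_active *ref,
			       int (*active)(struct sketch_active *ref),
			       void (*retire)(struct sketch_active *ref),
			       unsigned long flags)
{
	ref->active = active;
	ref->retire = retire;
	ref->flags = flags;
	ref->retired = 0;
}

/* Query the stored flag instead of decoding it from the pointer. */
static int sketch_retire_may_sleep(const struct sketch_active *ref)
{
	return (ref->flags & SKETCH_RETIRE_SLEEPS) != 0;
}

/* Example retire callback: just records that it ran. */
static void sketch_mark_retired(struct sketch_active *ref)
{
	ref->retired = 1;
}
```

Callers that do not sleep in their retire callback pass `0` for the flags, which matches the `intel_overlay.c` and `i915_gem_context.c` hunks in this commit.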

drivers/gpu/drm/i915/display/intel_overlay.c

Lines changed: 2 additions & 3 deletions
@@ -384,8 +384,7 @@ static void intel_overlay_off_tail(struct intel_overlay *overlay)
 		i830_overlay_clock_gating(dev_priv, true);
 }
 
-__i915_active_call static void
-intel_overlay_last_flip_retire(struct i915_active *active)
+static void intel_overlay_last_flip_retire(struct i915_active *active)
 {
 	struct intel_overlay *overlay =
 		container_of(active, typeof(*overlay), last_flip);
@@ -1402,7 +1401,7 @@ void intel_overlay_setup(struct drm_i915_private *dev_priv)
 	overlay->saturation = 146;
 
 	i915_active_init(&overlay->last_flip,
-			 NULL, intel_overlay_last_flip_retire);
+			 NULL, intel_overlay_last_flip_retire, 0);
 
 	ret = get_registers(overlay, OVERLAY_NEEDS_PHYSICAL(dev_priv));
 	if (ret)

drivers/gpu/drm/i915/gem/i915_gem_context.c

Lines changed: 1 addition & 2 deletions
@@ -1046,7 +1046,6 @@ struct context_barrier_task {
 	void *data;
 };
 
-__i915_active_call
 static void cb_retire(struct i915_active *base)
 {
 	struct context_barrier_task *cb = container_of(base, typeof(*cb), base);
@@ -1080,7 +1079,7 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	if (!cb)
 		return -ENOMEM;
 
-	i915_active_init(&cb->base, NULL, cb_retire);
+	i915_active_init(&cb->base, NULL, cb_retire, 0);
 	err = i915_active_acquire(&cb->base);
 	if (err) {
 		kfree(cb);
