Commit 61b8c39

Merge tag 'pm-runtime-6.17-rc1'
Runtime PM updates related to autosuspend for 6.17

Make several autosuspend functions mark last busy stamp and update the documentation accordingly (Sakari Ailus).

Signed-off-by: Sebastian Reichel <[email protected]>
2 parents d375b70 + cd4da71 commit 61b8c39

325 files changed

+2569
-1438
lines changed

.mailmap

Lines changed: 4 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -426,6 +426,9 @@ Krzysztof Wilczyński <[email protected]> <[email protected]>
426426
Krzysztof Wilczyński <[email protected]> <[email protected]>
427427
428428
Kuninori Morimoto <[email protected]>
429+
430+
431+
429432
430433
431434
@@ -719,6 +722,7 @@ Srinivas Ramana <[email protected]> <[email protected]>
719722
720723
721724
Stanislav Fomichev <[email protected]> <[email protected]>
725+
Stanislav Fomichev <[email protected]> <[email protected]>
722726
723727
Stéphane Witzmann <[email protected]>
724728

Documentation/admin-guide/cifs/usage.rst

Lines changed: 2 additions & 0 deletions
@@ -270,6 +270,8 @@ configured for Unix Extensions (and the client has not disabled
270270
illegal Windows/NTFS/SMB characters to a remap range (this mount parameter
271271
is the default for SMB3). This remap (``mapposix``) range is also
272272
compatible with Mac (and "Services for Mac" on some older Windows).
273+
When POSIX Extensions for SMB 3.1.1 are negotiated, remapping is automatically
274+
disabled.
273275

274276
CIFS VFS Mount Options
275277
======================

Documentation/block/ublk.rst

Lines changed: 77 additions & 0 deletions
@@ -352,6 +352,83 @@ For reaching best IO performance, ublk server should align its segment
352352
parameter of `struct ublk_param_segment` with backend for avoiding
353353
unnecessary IO split, which usually hurts io_uring performance.
354354

355+
Auto Buffer Registration
356+
------------------------
357+
358+
The ``UBLK_F_AUTO_BUF_REG`` feature automatically handles buffer registration
359+
and unregistration for I/O requests, which simplifies the buffer management
360+
process and reduces overhead in the ublk server implementation.
361+
362+
This is another feature flag for using zero copy, and it is compatible with
363+
``UBLK_F_SUPPORT_ZERO_COPY``.
364+
365+
Feature Overview
366+
~~~~~~~~~~~~~~~~
367+
368+
This feature automatically registers request buffers to the io_uring context
369+
before delivering I/O commands to the ublk server and unregisters them when
370+
completing I/O commands. This eliminates the need for manual buffer
371+
registration/unregistration via ``UBLK_IO_REGISTER_IO_BUF`` and
372+
``UBLK_IO_UNREGISTER_IO_BUF`` commands, so IO handling in the ublk server
373+
can avoid depending on those two uring_cmd operations.
374+
375+
IOs can't be issued concurrently to io_uring if there is any dependency
376+
among these IOs. So this approach not only simplifies the ublk server implementation,
377+
but also makes concurrent IO handling possible by removing the
378+
dependency on buffer registration & unregistration commands.
379+
380+
Usage Requirements
381+
~~~~~~~~~~~~~~~~~~
382+
383+
1. The ublk server must create a sparse buffer table on the same ``io_ring_ctx``
384+
used for ``UBLK_IO_FETCH_REQ`` and ``UBLK_IO_COMMIT_AND_FETCH_REQ``. If
385+
uring_cmd is issued on a different ``io_ring_ctx``, manual buffer
386+
unregistration is required.
387+
388+
2. Buffer registration data must be passed via uring_cmd's ``sqe->addr`` with the
389+
following structure::
390+
391+
struct ublk_auto_buf_reg {
392+
__u16 index; /* Buffer index for registration */
393+
__u8 flags; /* Registration flags */
394+
__u8 reserved0; /* Reserved for future use */
395+
__u32 reserved1; /* Reserved for future use */
396+
};
397+
398+
ublk_auto_buf_reg_to_sqe_addr() is for converting the above structure into
399+
``sqe->addr``.
400+
401+
3. All reserved fields in ``ublk_auto_buf_reg`` must be zeroed.
402+
403+
4. Optional flags can be passed via ``ublk_auto_buf_reg.flags``.
404+
405+
Fallback Behavior
406+
~~~~~~~~~~~~~~~~~
407+
408+
If auto buffer registration fails:
409+
410+
1. When ``UBLK_AUTO_BUF_REG_FALLBACK`` is enabled:
411+
412+
- The uring_cmd is completed
413+
- ``UBLK_IO_F_NEED_REG_BUF`` is set in ``ublksrv_io_desc.op_flags``
414+
- The ublk server must handle the failure manually, for example by registering
415+
the buffer manually or by using the user copy feature to retrieve the data
416+
for handling ublk IO
417+
418+
2. If fallback is not enabled:
419+
420+
- The ublk I/O request fails silently
421+
- The uring_cmd won't be completed
422+
423+
Limitations
424+
~~~~~~~~~~~
425+
426+
- Requires same ``io_ring_ctx`` for all operations
427+
- May require manual buffer management in fallback cases
428+
- The io_ring_ctx buffer table has a max size of 16K, which may not be enough
429+
when many ublk devices are handled by a single io_ring_ctx
430+
and each one has a very large queue depth
431+
355432
References
356433
==========
357434
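
The second usage requirement in the ublk.rst hunk says the registration data travels in the 64-bit ``sqe->addr`` field, and the third says the reserved fields must be zero. The following is a minimal userspace sketch of that packing, not the uapi helper itself. It assumes the fields are laid out in declaration order starting at bit 0; the names `sketch_auto_buf_reg_to_addr` and `sketch_auto_buf_reg_valid` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace mirror of the structure shown in the documentation hunk. */
struct ublk_auto_buf_reg {
	uint16_t index;     /* Buffer index for registration */
	uint8_t  flags;     /* Registration flags */
	uint8_t  reserved0; /* Reserved, must be zero */
	uint32_t reserved1; /* Reserved, must be zero */
};

static_assert(sizeof(struct ublk_auto_buf_reg) == sizeof(uint64_t),
	      "structure must fit in the 64-bit sqe->addr field");

/*
 * ASSUMPTION: fields are packed in declaration order starting at bit 0
 * (index in bits 0-15, flags in bits 16-23, reserved0 in bits 24-31,
 * reserved1 in bits 32-63). Hypothetical helper, not the kernel's
 * ublk_auto_buf_reg_to_sqe_addr().
 */
static uint64_t sketch_auto_buf_reg_to_addr(const struct ublk_auto_buf_reg *buf)
{
	return (uint64_t)buf->index |
	       ((uint64_t)buf->flags << 16) |
	       ((uint64_t)buf->reserved0 << 24) |
	       ((uint64_t)buf->reserved1 << 32);
}

/* Returns nonzero when both reserved fields are zeroed (requirement 3). */
static int sketch_auto_buf_reg_valid(const struct ublk_auto_buf_reg *buf)
{
	return buf->reserved0 == 0 && buf->reserved1 == 0;
}
```

A server would fill in `index` and `flags`, check the reserved fields, and store the packed value in the uring_cmd's ``sqe->addr``. Real code should use the helper named in the documentation rather than this sketch.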

Documentation/devicetree/bindings/pinctrl/starfive,jh7110-aon-pinctrl.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ description: |
1515
Some peripherals such as PWM have their I/O go through the 4 "GPIOs".
1616
1717
maintainers:
18-
- Jianlong Huang <jianlong.huang@starfivetech.com>
18+
- Hal Feng <hal.feng@starfivetech.com>
1919

2020
properties:
2121
compatible:

Documentation/devicetree/bindings/pinctrl/starfive,jh7110-sys-pinctrl.yaml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ description: |
1818
any GPIO can be set up to be controlled by any of the peripherals.
1919
2020
maintainers:
21-
- Jianlong Huang <jianlong.huang@starfivetech.com>
21+
- Hal Feng <hal.feng@starfivetech.com>
2222

2323
properties:
2424
compatible:

Documentation/filesystems/proc.rst

Lines changed: 3 additions & 1 deletion
@@ -584,7 +584,6 @@ encoded manner. The codes are the following:
584584
ms may share
585585
gd stack segment grows down
586586
pf pure PFN range
587-
dw disabled write to the mapped file
588587
lo pages are locked in memory
589588
io memory mapped I/O area
590589
sr sequential read advise provided
@@ -607,8 +606,11 @@ encoded manner. The codes are the following:
607606
mt arm64 MTE allocation tags are enabled
608607
um userfaultfd missing tracking
609608
uw userfaultfd wr-protect tracking
609+
ui userfaultfd minor fault
610610
ss shadow/guarded control stack page
611611
sl sealed
612+
lf lock on fault pages
613+
dp always lazily freeable mapping
612614
== =======================================
613615

614616
Note that there is no guarantee that every flag and associated mnemonic will

Documentation/power/runtime_pm.rst

Lines changed: 23 additions & 27 deletions
@@ -154,11 +154,9 @@ suspending the device are satisfied) and to queue up a suspend request for the
154154
device in that case. If there is no idle callback, or if the callback returns
155155
0, then the PM core will attempt to carry out a runtime suspend of the device,
156156
also respecting devices configured for autosuspend. In essence this means a
157-
call to pm_runtime_autosuspend() (do note that drivers needs to update the
158-
device last busy mark, pm_runtime_mark_last_busy(), to control the delay under
159-
this circumstance). To prevent this (for example, if the callback routine has
160-
started a delayed suspend), the routine must return a non-zero value. Negative
161-
error return codes are ignored by the PM core.
157+
call to pm_runtime_autosuspend(). To prevent this (for example, if the callback
158+
routine has started a delayed suspend), the routine must return a non-zero
159+
value. Negative error return codes are ignored by the PM core.
162160

163161
The helper functions provided by the PM core, described in Section 4, guarantee
164162
that the following constraints are met with respect to runtime PM callbacks for
@@ -330,10 +328,9 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
330328
'power.disable_depth' is different from 0
331329

332330
`int pm_runtime_autosuspend(struct device *dev);`
333-
- same as pm_runtime_suspend() except that the autosuspend delay is taken
334-
`into account;` if pm_runtime_autosuspend_expiration() says the delay has
335-
not yet expired then an autosuspend is scheduled for the appropriate time
336-
and 0 is returned
331+
- same as pm_runtime_suspend() except that a call to
332+
pm_runtime_mark_last_busy() is made and an autosuspend is scheduled for
333+
the appropriate time and 0 is returned
337334

338335
`int pm_runtime_resume(struct device *dev);`
339336
- execute the subsystem-level resume callback for the device; returns 0 on
@@ -357,9 +354,9 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
357354
success or error code if the request has not been queued up
358355

359356
`int pm_request_autosuspend(struct device *dev);`
360-
- schedule the execution of the subsystem-level suspend callback for the
361-
device when the autosuspend delay has expired; if the delay has already
362-
expired then the work item is queued up immediately
357+
- Call pm_runtime_mark_last_busy() and schedule the execution of the
358+
subsystem-level suspend callback for the device when the autosuspend delay
359+
expires
363360

364361
`int pm_schedule_suspend(struct device *dev, unsigned int delay);`
365362
- schedule the execution of the subsystem-level suspend callback for the
@@ -411,8 +408,9 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
411408
pm_request_idle(dev) and return its result
412409

413410
`int pm_runtime_put_autosuspend(struct device *dev);`
414-
- does the same as __pm_runtime_put_autosuspend() for now, but in the
415-
future, will also call pm_runtime_mark_last_busy() as well, DO NOT USE!
411+
- set the power.last_busy field to the current time and decrement the
412+
device's usage counter; if the result is 0 then run
413+
pm_request_autosuspend(dev) and return its result
416414

417415
`int __pm_runtime_put_autosuspend(struct device *dev);`
418416
- decrement the device's usage counter; if the result is 0 then run
@@ -427,7 +425,8 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
427425
pm_runtime_suspend(dev) and return its result
428426

429427
`int pm_runtime_put_sync_autosuspend(struct device *dev);`
430-
- decrement the device's usage counter; if the result is 0 then run
428+
- set the power.last_busy field to the current time and decrement the
429+
device's usage counter; if the result is 0 then run
431430
pm_runtime_autosuspend(dev) and return its result
432431

433432
`void pm_runtime_enable(struct device *dev);`
@@ -870,11 +869,9 @@ device is automatically suspended (the subsystem or driver still has to call
870869
the appropriate PM routines); rather it means that runtime suspends will
871870
automatically be delayed until the desired period of inactivity has elapsed.
872871

873-
Inactivity is determined based on the power.last_busy field. Drivers should
874-
call pm_runtime_mark_last_busy() to update this field after carrying out I/O,
875-
typically just before calling __pm_runtime_put_autosuspend(). The desired
876-
length of the inactivity period is a matter of policy. Subsystems can set this
877-
length initially by calling pm_runtime_set_autosuspend_delay(), but after device
872+
Inactivity is determined based on the power.last_busy field. The desired length
873+
of the inactivity period is a matter of policy. Subsystems can set this length
874+
initially by calling pm_runtime_set_autosuspend_delay(), but after device
878875
registration the length should be controlled by user space, using the
879876
/sys/devices/.../power/autosuspend_delay_ms attribute.
880877

@@ -885,12 +882,13 @@ instead of the non-autosuspend counterparts::
885882

886883
Instead of: pm_runtime_suspend use: pm_runtime_autosuspend;
887884
Instead of: pm_schedule_suspend use: pm_request_autosuspend;
888-
Instead of: pm_runtime_put use: __pm_runtime_put_autosuspend;
885+
Instead of: pm_runtime_put use: pm_runtime_put_autosuspend;
889886
Instead of: pm_runtime_put_sync use: pm_runtime_put_sync_autosuspend.
890887

891888
Drivers may also continue to use the non-autosuspend helper functions; they
892889
will behave normally, which means sometimes taking the autosuspend delay into
893-
account (see pm_runtime_idle).
890+
account (see pm_runtime_idle). The autosuspend variants of the functions also
891+
call pm_runtime_mark_last_busy().
894892

895893
Under some circumstances a driver or subsystem may want to prevent a device
896894
from autosuspending immediately, even though the usage counter is zero and the
@@ -922,12 +920,10 @@ Here is a schematic pseudo-code example::
922920
foo_io_completion(struct foo_priv *foo, void *req)
923921
{
924922
lock(&foo->private_lock);
925-
if (--foo->num_pending_requests == 0) {
926-
pm_runtime_mark_last_busy(&foo->dev);
927-
__pm_runtime_put_autosuspend(&foo->dev);
928-
} else {
923+
if (--foo->num_pending_requests == 0)
924+
pm_runtime_put_autosuspend(&foo->dev);
925+
else
929926
foo_process_next_request(foo);
930-
}
931927
unlock(&foo->private_lock);
932928
/* Send req result back to the user ... */
933929
}
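
The runtime_pm.rst changes above move the "mark last busy" step into the autosuspend helpers themselves. The following is a small userspace model of that behavior, not kernel code. Every name beginning with `model_` is hypothetical, and the model assumes (as the removed pseudo-code implies) that the double-underscore variant leaves the timestamp untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the fields involved; not kernel code. */
struct model_dev {
	int usage_count;        /* stands in for the device's usage counter */
	uint64_t last_busy;     /* stands in for power.last_busy */
	int autosuspend_queued; /* set when an autosuspend request would be queued */
};

/* Models pm_runtime_mark_last_busy(): record the time of last activity. */
static void model_mark_last_busy(struct model_dev *dev, uint64_t now)
{
	dev->last_busy = now;
}

/* Models __pm_runtime_put_autosuspend(): drop a reference, no timestamp update. */
static void model_put_autosuspend_raw(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0)
		dev->autosuspend_queued = 1;
}

/* Models the updated pm_runtime_put_autosuspend(): mark last busy, then put. */
static void model_put_autosuspend(struct model_dev *dev, uint64_t now)
{
	model_mark_last_busy(dev, now);
	model_put_autosuspend_raw(dev);
}

/* Earliest time the device may suspend for a given autosuspend delay. */
static uint64_t model_autosuspend_expiration(const struct model_dev *dev,
					     uint64_t delay)
{
	return dev->last_busy + delay;
}
```

In this model, a caller that uses `model_put_autosuspend()` always restarts the inactivity window at the time of the put, whereas a caller that uses the raw variant without marking last busy keeps whatever stale timestamp was recorded earlier.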

MAINTAINERS

Lines changed: 8 additions & 4 deletions
@@ -4555,6 +4555,7 @@ BPF [NETWORKING] (tcx & tc BPF, sock_addr)
45554555
M: Martin KaFai Lau <[email protected]>
45564556
M: Daniel Borkmann <[email protected]>
45574557
R: John Fastabend <[email protected]>
4558+
R: Stanislav Fomichev <[email protected]>
45584559
45594560
45604561
S: Maintained
@@ -6254,6 +6255,7 @@ F: include/linux/cpuhotplug.h
62546255
F: include/linux/smpboot.h
62556256
F: kernel/cpu.c
62566257
F: kernel/smpboot.*
6258+
F: rust/helper/cpu.c
62576259
F: rust/kernel/cpu.rs
62586260

62596261
CPU IDLE TIME MANAGEMENT FRAMEWORK
@@ -15919,6 +15921,7 @@ R: Liam R. Howlett <[email protected]>
1591915921
R: Nico Pache <[email protected]>
1592015922
R: Ryan Roberts <[email protected]>
1592115923
R: Dev Jain <[email protected]>
15924+
R: Barry Song <[email protected]>
1592215925
1592315926
S: Maintained
1592415927
W: http://www.linux-mm.org
@@ -17493,7 +17496,7 @@ F: tools/testing/selftests/net/srv6*
1749317496
NETWORKING [TCP]
1749417497
M: Eric Dumazet <[email protected]>
1749517498
M: Neal Cardwell <[email protected]>
17496-
R: Kuniyuki Iwashima <kuniyu@amazon.com>
17499+
R: Kuniyuki Iwashima <kuniyu@google.com>
1749717500
1749817501
S: Maintained
1749917502
F: Documentation/networking/net_cachelines/tcp_sock.rst
@@ -17523,7 +17526,7 @@ F: net/tls/*
1752317526

1752417527
NETWORKING [SOCKETS]
1752517528
M: Eric Dumazet <[email protected]>
17526-
M: Kuniyuki Iwashima <kuniyu@amazon.com>
17529+
M: Kuniyuki Iwashima <kuniyu@google.com>
1752717530
M: Paolo Abeni <[email protected]>
1752817531
M: Willem de Bruijn <[email protected]>
1752917532
S: Maintained
@@ -17538,7 +17541,7 @@ F: net/core/scm.c
1753817541
F: net/socket.c
1753917542

1754017543
NETWORKING [UNIX SOCKETS]
17541-
M: Kuniyuki Iwashima <kuniyu@amazon.com>
17544+
M: Kuniyuki Iwashima <kuniyu@google.com>
1754217545
S: Maintained
1754317546
F: include/net/af_unix.h
1754417547
F: include/net/netns/unix.h
@@ -23668,7 +23671,6 @@ F: include/dt-bindings/clock/starfive?jh71*.h
2366823671

2366923672
STARFIVE JH71X0 PINCTRL DRIVERS
2367023673
M: Emil Renner Berthing <[email protected]>
23671-
M: Jianlong Huang <[email protected]>
2367223674
M: Hal Feng <[email protected]>
2367323675
2367423676
S: Maintained
@@ -26974,6 +26976,7 @@ M: David S. Miller <[email protected]>
2697426976
M: Jakub Kicinski <[email protected]>
2697526977
M: Jesper Dangaard Brouer <[email protected]>
2697626978
M: John Fastabend <[email protected]>
26979+
R: Stanislav Fomichev <[email protected]>
2697726980
2697826981
2697926982
S: Supported
@@ -26995,6 +26998,7 @@ M: Björn Töpel <[email protected]>
2699526998
M: Magnus Karlsson <[email protected]>
2699626999
M: Maciej Fijalkowski <[email protected]>
2699727000
R: Jonathan Lemon <[email protected]>
27001+
R: Stanislav Fomichev <[email protected]>
2699827002
2699927003
2700027004
S: Maintained

Makefile

Lines changed: 1 addition & 4 deletions
@@ -2,7 +2,7 @@
22
VERSION = 6
33
PATCHLEVEL = 16
44
SUBLEVEL = 0
5-
EXTRAVERSION = -rc1
5+
EXTRAVERSION = -rc2
66
NAME = Baby Opossum Posse
77

88
# *DOCUMENTATION*
@@ -1832,12 +1832,9 @@ rustfmtcheck: rustfmt
18321832
# Misc
18331833
# ---------------------------------------------------------------------------
18341834

1835-
# Run misc checks when ${KBUILD_EXTRA_WARN} contains 1
18361835
PHONY += misc-check
1837-
ifneq ($(findstring 1,$(KBUILD_EXTRA_WARN)),)
18381836
misc-check:
18391837
$(Q)$(srctree)/scripts/misc-check
1840-
endif
18411838

18421839
all: misc-check
18431840

arch/alpha/include/asm/pgtable.h

Lines changed: 1 addition & 1 deletion
@@ -327,7 +327,7 @@ extern inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
327327
#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
328328
#define __swp_entry_to_pte(x) ((pte_t) { (x).val })
329329

330-
static inline int pte_swp_exclusive(pte_t pte)
330+
static inline bool pte_swp_exclusive(pte_t pte)
331331
{
332332
return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
333333
}
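
The diff does not state why the return type changes from `int` to `bool`. One general reason such a change can matter for a helper of this shape is that narrowing a masked 64-bit value to `int` discards any flag bit above bit 31, while conversion to `bool` yields true for any non-zero value. The sketch below illustrates this with a hypothetical flag at bit 39; the macro value and the `sketch_` function names are assumptions for illustration and are not taken from the alpha headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: hypothetical flag position above bit 31, for illustration only. */
#define SKETCH_SWP_EXCLUSIVE (UINT64_C(1) << 39)

static_assert(SKETCH_SWP_EXCLUSIVE > UINT32_MAX,
	      "flag is expected not to fit in 32 bits");

/*
 * Old shape: the masked 64-bit value is narrowed to int. The explicit
 * casts mirror the truncation common compilers perform for the implicit
 * conversion, so the high flag bit is lost.
 */
static int sketch_exclusive_as_int(uint64_t pte_val)
{
	return (int)(uint32_t)(pte_val & SKETCH_SWP_EXCLUSIVE);
}

/* New shape: conversion to bool is true for any non-zero masked value. */
static bool sketch_exclusive_as_bool(uint64_t pte_val)
{
	return pte_val & SKETCH_SWP_EXCLUSIVE;
}
```

With the flag set, the `int` variant reports 0 while the `bool` variant reports true, which is the kind of discrepancy a `bool` return type rules out.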
