Commit 829f3b9

Merge tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore updates from Kees Cook:
 "Fixes and new features for pstore.

  This is a pretty big set of changes (relative to past pstore pulls),
  but it has been in -next for a while. The biggest change here is the
  ability to support a block device as a pstore backend, which has been
  desired for a while. A lot of additional fixes and refactorings are
  also included, mostly in support of the new features.

   - refactor pstore locking for safer module unloading (Kees Cook)

   - remove orphaned records from pstorefs when backend unloaded (Kees
     Cook)

   - refactor dump_oops parameter into max_reason (Pavel Tatashin)

   - introduce pstore/zone for common code for contiguous storage
     (WeiXiong Liao)

   - introduce pstore/blk for block device backend (WeiXiong Liao)

   - introduce mtd backend (WeiXiong Liao)"

* tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (35 commits)
  mtd: Support kmsg dumper based on pstore/blk
  pstore/blk: Introduce "best_effort" mode
  pstore/blk: Support non-block storage devices
  pstore/blk: Provide way to query pstore configuration
  pstore/zone: Provide way to skip "broken" zone for MTD devices
  Documentation: Add details for pstore/blk
  pstore/zone,blk: Add ftrace frontend support
  pstore/zone,blk: Add console frontend support
  pstore/zone,blk: Add support for pmsg frontend
  pstore/blk: Introduce backend for block devices
  pstore/zone: Introduce common layer to manage storage zones
  ramoops: Add "max-reason" optional field to ramoops DT node
  pstore/ram: Introduce max_reason and convert dump_oops
  pstore/platform: Pass max_reason to kmesg dump
  printk: Introduce kmsg_dump_reason_str()
  printk: honor the max_reason field in kmsg_dumper
  printk: Collapse shutdown types into a single dump reason
  pstore/ftrace: Provide ftrace log merging routine
  pstore/ram: Refactor ftrace buffer merging
  pstore/ram: Refactor DT size parsing
  ...
2 parents 81e8c10 + 78c0824 commit 829f3b9

26 files changed

+3464
-206
lines changed

Documentation/admin-guide/pstore-blk.rst

Lines changed: 243 additions & 0 deletions
@@ -0,0 +1,243 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
pstore block oops/panic logger
4+
==============================
5+
6+
Introduction
7+
------------
8+
9+
pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
10+
block or non-block device before the system crashes. You can get
11+
these log files by mounting the pstore filesystem like::
12+
13+
mount -t pstore pstore /sys/fs/pstore
14+
15+
16+
pstore block concepts
17+
---------------------
18+
19+
pstore/blk provides an efficient configuration method, which
20+
divides all configurations into two parts: configurations for user and
21+
configurations for driver.
22+
23+
Configurations for user determine how pstore/blk works, such as pmsg_size,
24+
kmsg_size and so on. All of them support both Kconfig and module parameters,
25+
but module parameters have priority over Kconfig.
26+
27+
Configurations for driver are all about the block or non-block device,
28+
such as total_size of block device and read/write operations.
29+
30+
Configurations for user
31+
-----------------------
32+
33+
All of these configurations support both Kconfig and module parameters, but
34+
module parameters have priority over Kconfig.
35+
36+
Here is an example for module parameters::
37+
38+
pstore_blk.blkdev=179:7 pstore_blk.kmsg_size=64
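A minimal sketch of assembling such a parameter string in a script. The helper name ``build_pstore_blk_params`` is hypothetical and not part of pstore/blk; it only applies the rule from the parameter sections of this document that each chunk size is given in KB and must be a multiple of 4.

```python
def build_pstore_blk_params(blkdev, **sizes_kb):
    """Return a kernel command-line fragment for pstore_blk.

    `sizes_kb` maps a size parameter name (for example kmsg_size)
    to its value in KB.
    """
    parts = [f"pstore_blk.blkdev={blkdev}"]
    for name, size in sizes_kb.items():
        # The documentation requires every chunk size to be a multiple of 4.
        if size % 4 != 0:
            raise ValueError(f"{name}={size} is not a multiple of 4")
        parts.append(f"pstore_blk.{name}={size}")
    return " ".join(parts)


print(build_pstore_blk_params("179:7", kmsg_size=64))
```

Rejecting a bad size in the script is only a convenience; how the kernel itself reacts to such a value is not covered here.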
39+
40+
The details of each configuration may be of interest to you.
41+
42+
blkdev
43+
~~~~~~
44+
45+
The block device to use. Most of the time, it is a partition of a block device.
46+
It's required for pstore/blk. It is also used for MTD devices.
47+
48+
It accepts the following variants for block device:
49+
50+
1. <hex_major><hex_minor> device number in hexadecimal represents itself; no
51+
leading 0x, for example b302.
52+
#. /dev/<disk_name> represents the device number of disk
53+
#. /dev/<disk_name><decimal> represents the device number of partition - device
54+
number of disk plus the partition number
55+
#. /dev/<disk_name>p<decimal> - same as the above; this form is used when disk
56+
name of partitioned disk ends with a digit.
57+
#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of
58+
a partition if the partition table provides it. The UUID may be either an
59+
EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
60+
where SSSSSSSS is a zero-filled hex representation of the 32-bit
61+
"NT disk signature", and PP is a zero-filled hex representation of the
62+
1-based partition number.
63+
#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
64+
partition with a known unique id.
65+
#. <major>:<minor> major and minor number of the device separated by a colon.
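To illustrate how the first and last block device variants relate, here is a rough user-space sketch that turns either form into a (major, minor) pair. The function name ``parse_devnum`` is hypothetical, and the bit layout used for the hexadecimal form is an assumption modelled on the kernel's ``new_decode_dev()`` helper rather than something this document specifies.

```python
def parse_devnum(spec):
    """Return (major, minor) for 'MAJOR:MINOR' or a hex device number."""
    if ":" in spec:
        # <major>:<minor> form, both fields in decimal.
        major, minor = spec.split(":", 1)
        return int(major), int(minor)

    # <hex_major><hex_minor> form, no leading 0x.
    dev = int(spec, 16)
    # Assumed layout: minor bits 0-7 in bits 0-7, major in bits 8-19,
    # and the remaining minor bits from bit 20 upwards.
    major = (dev & 0xFFF00) >> 8
    minor = (dev & 0xFF) | ((dev >> 12) & 0xFFF00)
    return major, minor


print(parse_devnum("b302"))
print(parse_devnum("179:7"))
```

Under that assumption, ``b302`` and ``179:2`` name the same device.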
66+
67+
It accepts the following variants for MTD device:
68+
69+
1. <device name> MTD device name. "pstore" is recommended.
70+
#. <device number> MTD device number.
71+
72+
kmsg_size
73+
~~~~~~~~~
74+
75+
The chunk size in KB for oops/panic front-end. It **MUST** be a multiple of 4.
76+
It's optional if you do not care about the oops/panic log.
77+
78+
There are multiple chunks for oops/panic front-end depending on the remaining
79+
space not used by other pstore front-ends.
80+
81+
pstore/blk will log to oops/panic chunks one by one, and always overwrite the
82+
oldest chunk if there is no more free chunk.
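The overwrite policy described above can be pictured with a small model. This is only a sketch: the class ``ChunkRing`` is hypothetical, it assumes records are never deleted in between, and it does not reflect how pstore/zone actually tracks its zones.

```python
class ChunkRing:
    """Toy model of oops/panic chunks that are filled one by one."""

    def __init__(self, chunk_count):
        self.chunks = [None] * chunk_count
        self.next_index = 0

    def write(self, record):
        """Store a record and return the index of the chunk used."""
        index = self.next_index
        # Once every chunk has been used, this slot holds the oldest record.
        self.chunks[index] = record
        self.next_index = (index + 1) % len(self.chunks)
        return index


ring = ChunkRing(3)
for record in ["oops-a", "oops-b", "panic-c", "oops-d"]:
    ring.write(record)
print(ring.chunks)
```

With three chunks, the fourth write lands in chunk 0 and replaces the first record.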
83+
84+
pmsg_size
85+
~~~~~~~~~
86+
87+
The chunk size in KB for pmsg front-end. It **MUST** be a multiple of 4.
88+
It's optional if you do not care about the pmsg log.
89+
90+
Unlike oops/panic front-end, there is only one chunk for pmsg front-end.
91+
92+
Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
93+
appended to the chunk. On reboot the contents are available in
94+
*/sys/fs/pstore/pmsg-pstore-blk-0*.
95+
96+
console_size
97+
~~~~~~~~~~~~
98+
99+
The chunk size in KB for console front-end. It **MUST** be a multiple of 4.
100+
It's optional if you do not care about the console log.
101+
102+
Similar to pmsg front-end, there is only one chunk for console front-end.
103+
104+
All console logs will be appended to the chunk. On reboot the contents are
105+
available in */sys/fs/pstore/console-pstore-blk-0*.
106+
107+
ftrace_size
108+
~~~~~~~~~~~
109+
110+
The chunk size in KB for ftrace front-end. It **MUST** be a multiple of 4.
111+
It's optional if you do not care about the ftrace log.
112+
113+
Similar to oops front-end, there are multiple chunks for ftrace front-end
114+
depending on the number of CPUs. Each chunk size is equal to
115+
ftrace_size / processors_count.
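As a worked example of that division, the following sketch computes the per-CPU chunk size in bytes. The function name is hypothetical, and the use of integer division is an assumption; any alignment or rounding the kernel applies is not modelled.

```python
def ftrace_chunk_size_bytes(ftrace_size_kb, cpu_count):
    """Return the per-CPU ftrace chunk size in bytes."""
    if ftrace_size_kb % 4 != 0:
        raise ValueError("ftrace_size must be a multiple of 4 (KB)")
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # ftrace_size / processors_count, with ftrace_size converted to bytes.
    return (ftrace_size_kb * 1024) // cpu_count


print(ftrace_chunk_size_bytes(64, 4))
```

For ``ftrace_size=64`` on a four-CPU system this gives 16384 bytes, i.e. 16 KB per CPU.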
116+
117+
All ftrace logs will be appended to the chunk. On reboot the contents are
118+
combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*.
119+
120+
Persistent function tracing might be useful for debugging software or hardware
121+
related hangs. Here is an example of usage::
122+
123+
# mount -t pstore pstore /sys/fs/pstore
124+
# mount -t debugfs debugfs /sys/kernel/debug/
125+
# echo 1 > /sys/kernel/debug/pstore/record_ftrace
126+
# reboot -f
127+
[...]
128+
# mount -t pstore pstore /sys/fs/pstore
129+
# tail /sys/fs/pstore/ftrace-pstore-blk-0
130+
CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0
131+
CPU:0 ts:5914678 c039ecdc c006385c cpuidle_enter_state <- call_cpuidle+0x44/0x48
132+
CPU:0 ts:5914680 c039e9a0 c039ecf0 cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314
133+
CPU:0 ts:5914681 c0063870 c039ea30 sched_idle_set_state <- cpuidle_enter_state+0x44/0x314
134+
CPU:1 ts:5916720 c0160f59 c015ee04 kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204
135+
CPU:1 ts:5916721 c05ca625 c015ee0c __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204
136+
CPU:1 ts:5916723 c05c813d c05ca630 yield_to <- __mutex_lock_slowpath+0x314/0x358
137+
CPU:1 ts:5916724 c05ca2d1 c05ca638 __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358
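For post-processing such a log, a line can be split into fields with a regular expression. This is a sketch based only on the sample output above: the function name is hypothetical, and reading the two hexadecimal columns as the traced address and its caller's address is an assumption.

```python
import re

FTRACE_LINE = re.compile(
    r"^CPU:(?P<cpu>\d+)\s+ts:(?P<ts>\d+)\s+"
    r"(?P<ip>[0-9a-f]+)\s+(?P<parent_ip>[0-9a-f]+)\s+"
    r"(?P<function>\S+)\s+<-\s+(?P<caller>\S+)$"
)


def parse_ftrace_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = FTRACE_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # CPU number and timestamp are decimal; addresses stay as hex strings.
    record["cpu"] = int(record["cpu"])
    record["ts"] = int(record["ts"])
    return record


sample = "CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0"
print(parse_ftrace_line(sample))
```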
138+
139+
max_reason
140+
~~~~~~~~~~
141+
142+
Limiting which kinds of kmsg dumps are stored can be controlled via
143+
the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
144+
``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
145+
``max_reason`` should be set to 2 (KMSG_DUMP_OOPS); to store only Panics,
146+
``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
147+
(KMSG_DUMP_UNDEF) means the reason filtering will be controlled by the
148+
``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
149+
otherwise KMSG_DUMP_MAX.
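The filtering rule can be modelled as follows. This is a sketch, not the kernel code: only the values 0, 1 and 2 come from the text above, while the numeric values given here for EMERG, SHUTDOWN and MAX are assumptions about ``enum kmsg_dump_reason``, and the function name ``should_store`` is hypothetical.

```python
from enum import IntEnum


class KmsgDumpReason(IntEnum):
    UNDEF = 0
    PANIC = 1
    OOPS = 2
    EMERG = 3     # assumed value
    SHUTDOWN = 4  # assumed value
    MAX = 5       # assumed value


def should_store(reason, max_reason, always_kmsg_dump=False):
    """Return True if a dump with this reason passes the max_reason filter."""
    if max_reason == KmsgDumpReason.UNDEF:
        # Fall back to the printk.always_kmsg_dump boot parameter.
        if always_kmsg_dump:
            max_reason = KmsgDumpReason.MAX
        else:
            max_reason = KmsgDumpReason.OOPS
    return reason <= max_reason


print(should_store(KmsgDumpReason.OOPS, max_reason=1))
print(should_store(KmsgDumpReason.OOPS, max_reason=2))
```

With ``max_reason=1`` an oops is dropped and only panics are kept; with ``max_reason=2`` both are stored.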
150+
151+
Configurations for driver
152+
-------------------------
153+
154+
Only a block device driver cares about these configurations. A block device
155+
driver uses ``register_pstore_blk`` to register to pstore/blk.
156+
157+
.. kernel-doc:: fs/pstore/blk.c
158+
:identifiers: register_pstore_blk
159+
160+
A non-block device driver uses ``register_pstore_device`` with
161+
``struct pstore_device_info`` to register to pstore/blk.
162+
163+
.. kernel-doc:: fs/pstore/blk.c
164+
:identifiers: register_pstore_device
165+
166+
.. kernel-doc:: include/linux/pstore_blk.h
167+
:identifiers: pstore_device_info
168+
169+
Compression and header
170+
----------------------
171+
172+
A block device is large enough for uncompressed oops data. Actually we do not
173+
recommend data compression because pstore/blk will insert some information into
174+
the first line of oops/panic data. For example::
175+
176+
Panic: Total 16 times
177+
178+
It means that this is the 16th oops/panic since the first boot.
179+
Sometimes the number of occurrences of oops/panic since the first boot is
180+
important to judge whether the system is stable.
181+
182+
The following line is inserted by the pstore filesystem. For example::
183+
184+
Oops#2 Part1
185+
186+
It means that this is the 2nd oops of the last boot.
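A tool that reads dump files might extract both counters from these header lines. The sketch below matches only the two example formats shown above; the function name and dictionary keys are hypothetical, and it checks the first two lines without assuming which header comes first.

```python
import re

TOTAL_HEADER = re.compile(r"^(?P<kind>\w+): Total (?P<total>\d+) times$")
FS_HEADER = re.compile(r"^(?P<kind>\w+)#(?P<count>\d+) Part(?P<part>\d+)$")


def parse_headers(text):
    """Return the counters found in the first two lines of a dump."""
    info = {}
    for line in text.splitlines()[:2]:
        line = line.strip()
        match = TOTAL_HEADER.match(line)
        if match:
            # Inserted by pstore/blk: occurrences since the first boot.
            info["total_since_first_boot"] = int(match["total"])
            continue
        match = FS_HEADER.match(line)
        if match:
            # Inserted by the pstore filesystem: count within the last boot.
            info["kind"] = match["kind"]
            info["count_last_boot"] = int(match["count"])
            info["part"] = int(match["part"])
    return info


print(parse_headers("Panic: Total 16 times\nOops#2 Part1\n..."))
```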
187+
188+
Reading the data
189+
----------------
190+
191+
The dump data can be read from the pstore filesystem. The format for these
192+
files is ``dmesg-pstore-blk-[N]`` for oops/panic front-end,
193+
``pmsg-pstore-blk-0`` for pmsg front-end and so on. The timestamp of the
194+
dump file records the trigger time. To delete a stored record from block
195+
device, simply unlink the respective pstore file.
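As an illustration of that naming scheme, this sketch groups the files of a mounted pstore filesystem by front-end. The function name is hypothetical, and the pattern assumes every record follows the ``<frontend>-pstore-blk-<N>`` form described above.

```python
import os
import re
from collections import defaultdict

RECORD_NAME = re.compile(r"^(?P<frontend>[a-z]+)-pstore-blk-(?P<index>\d+)$")


def group_records(mount_point="/sys/fs/pstore"):
    """Map each front-end name to the paths of its pstore/blk records."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(mount_point)):
        match = RECORD_NAME.match(name)
        if match:
            groups[match["frontend"]].append(os.path.join(mount_point, name))
    return dict(groups)
```

Calling ``os.unlink()`` on one of the returned paths would delete that record, as described above.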
196+
197+
Attentions in panic read/write APIs
198+
-----------------------------------
199+
200+
On panic, the kernel is not going to run for much longer; tasks will not
201+
be scheduled and most kernel resources will be out of service. It
202+
looks like a single-threaded program running on a single-core computer.
203+
204+
The following points require special attention for panic read/write APIs:
205+
206+
1. Can **NOT** allocate any memory.
207+
If you need memory, just allocate while the block driver is initializing
208+
rather than waiting until the panic.
209+
#. Must be polled, **NOT** interrupt driven.
210+
Tasks are no longer scheduled. The block driver should delay to ensure the write
211+
succeeds, but NOT sleep.
212+
#. Can **NOT** take any lock.
213+
There is no other task, nor any shared resource; you are safe to break all
214+
locks.
215+
#. Just use CPU to transfer.
216+
Do not use DMA to transfer unless you are sure that DMA will not hold a lock.
217+
#. Control registers directly.
218+
Please control registers directly rather than use Linux kernel resources.
219+
Do I/O map while initializing rather than wait until a panic occurs.
220+
#. Reset your block device and controller if necessary.
221+
If you are not sure of the state of your block device and controller when
222+
a panic occurs, you are safe to stop and reset them.
223+
224+
pstore/blk supports psblk_blkdev_info(), which is defined in
225+
*linux/pstore_blk.h*, to get information about the block device in use, such as the
226+
device number, sector count and start sector of the whole disk.
227+
228+
pstore block internals
229+
----------------------
230+
231+
For developer reference, here are all the important structures and APIs:
232+
233+
.. kernel-doc:: fs/pstore/zone.c
234+
:internal:
235+
236+
.. kernel-doc:: include/linux/pstore_zone.h
237+
:internal:
238+
239+
.. kernel-doc:: fs/pstore/blk.c
240+
:export:
241+
242+
.. kernel-doc:: include/linux/pstore_blk.h
243+
:internal:

Documentation/admin-guide/ramoops.rst

Lines changed: 10 additions & 4 deletions
@@ -32,11 +32,17 @@ memory to be mapped strongly ordered, and atomic operations on strongly ordered
3232
memory are implementation defined, and won't work on many ARMs such as omaps.
3333

3434
The memory area is divided into ``record_size`` chunks (also rounded down to
35-
power of two) and each oops/panic writes a ``record_size`` chunk of
35+
power of two) and each kmsg dump writes a ``record_size`` chunk of
3636
information.
3737

38-
Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
39-
variable while setting 0 in that variable dumps only the panics.
38+
Limiting which kinds of kmsg dumps are stored can be controlled via
39+
the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
40+
``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
41+
``max_reason`` should be set to 2 (KMSG_DUMP_OOPS); to store only Panics,
42+
``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
43+
(KMSG_DUMP_UNDEF) means the reason filtering will be controlled by the
44+
``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
45+
otherwise KMSG_DUMP_MAX.
4046

4147
The module uses a counter to record multiple dumps but the counter gets reset
4248
on restart (i.e. new dumps after the restart will overwrite old ones).
@@ -90,7 +96,7 @@ Setting the ramoops parameters can be done in several different manners:
9096
.mem_address = <...>,
9197
.mem_type = <...>,
9298
.record_size = <...>,
93-
.dump_oops = <...>,
99+
.max_reason = <...>,
94100
.ecc = <...>,
95101
};
96102

Documentation/devicetree/bindings/reserved-memory/ramoops.txt

Lines changed: 11 additions & 2 deletions
@@ -30,7 +30,7 @@ Optional properties:
3030
- ecc-size: enables ECC support and specifies ECC buffer size in bytes
3131
(defaults to 0: no ECC)
3232

33-
- record-size: maximum size in bytes of each dump done on oops/panic
33+
- record-size: maximum size in bytes of each kmsg dump.
3434
(defaults to 0: disabled)
3535

3636
- console-size: size in bytes of log buffer reserved for kernel messages
@@ -45,7 +45,16 @@ Optional properties:
4545
- unbuffered: if present, use unbuffered mappings to map the reserved region
4646
(defaults to buffered mappings)
4747

48-
- no-dump-oops: if present, only dump panics (defaults to panics and oops)
48+
- max-reason: if present, sets maximum type of kmsg dump reasons to store
49+
(defaults to 2: log Oopses and Panics). This can be set to INT_MAX to
50+
store all kmsg dumps. See include/linux/kmsg_dump.h KMSG_DUMP_* for other
51+
kmsg dump reason values. Setting this to 0 (KMSG_DUMP_UNDEF), means the
52+
reason filtering will be controlled by the printk.always_kmsg_dump boot
53+
param: if unset, it will be KMSG_DUMP_OOPS, otherwise KMSG_DUMP_MAX.
54+
55+
- no-dump-oops: deprecated, use max_reason instead. If present, and
56+
max_reason is not specified, it is equivalent to max_reason = 1
57+
(KMSG_DUMP_PANIC).
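The precedence between the two properties can be summarised in a few lines. This is a sketch of the rule as stated in the binding text, not the ramoops driver code; the function name is hypothetical, and ``props`` stands in for the parsed properties of the DT node.

```python
KMSG_DUMP_PANIC = 1
KMSG_DUMP_OOPS = 2


def effective_max_reason(props):
    """Return the max_reason implied by a ramoops DT node.

    `props` maps property names to values; boolean properties such as
    no-dump-oops only need to be present as keys.
    """
    if "max-reason" in props:
        # An explicit max-reason always wins over no-dump-oops.
        return props["max-reason"]
    if "no-dump-oops" in props:
        return KMSG_DUMP_PANIC
    # Default: log Oopses and Panics.
    return KMSG_DUMP_OOPS


print(effective_max_reason({"no-dump-oops": True}))
```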
4958

5059
- flags: if present, pass ramoops behavioral flags (defaults to 0,
5160
see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).

MAINTAINERS

Lines changed: 1 addition & 0 deletions
@@ -13715,6 +13715,7 @@ M: Tony Luck <[email protected]>
1371513715
S: Maintained
1371613716
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/pstore
1371713717
F: Documentation/admin-guide/ramoops.rst
13718+
F: Documentation/admin-guide/pstore-blk.rst
1371813719
F: Documentation/devicetree/bindings/reserved-memory/ramoops.txt
1371913720
F: drivers/acpi/apei/erst.c
1372013721
F: drivers/firmware/efi/efi-pstore.c

arch/powerpc/kernel/nvram_64.c

Lines changed: 1 addition & 3 deletions
@@ -655,9 +655,7 @@ static void oops_to_nvram(struct kmsg_dumper *dumper,
655655
int rc = -1;
656656

657657
switch (reason) {
658-
case KMSG_DUMP_RESTART:
659-
case KMSG_DUMP_HALT:
660-
case KMSG_DUMP_POWEROFF:
658+
case KMSG_DUMP_SHUTDOWN:
661659
/* These are almost always orderly shutdowns. */
662660
return;
663661
case KMSG_DUMP_OOPS:

drivers/mtd/Kconfig

Lines changed: 10 additions & 0 deletions
@@ -170,6 +170,16 @@ config MTD_OOPS
170170
buffer in a flash partition where it can be read back at some
171171
later point.
172172

173+
config MTD_PSTORE
174+
tristate "Log panic/oops to an MTD buffer based on pstore"
175+
depends on PSTORE_BLK
176+
help
177+
This enables panic and oops messages to be logged to a circular
178+
buffer in a flash partition where it can be read back as files after
179+
mounting the pstore filesystem.
180+
181+
If unsure, say N.
182+
173183
config MTD_SWAP
174184
tristate "Swap on MTD device support"
175185
depends on MTD && SWAP

drivers/mtd/Makefile

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ obj-$(CONFIG_RFD_FTL) += rfd_ftl.o
2020
obj-$(CONFIG_SSFDC) += ssfdc.o
2121
obj-$(CONFIG_SM_FTL) += sm_ftl.o
2222
obj-$(CONFIG_MTD_OOPS) += mtdoops.o
23+
obj-$(CONFIG_MTD_PSTORE) += mtdpstore.o
2324
obj-$(CONFIG_MTD_SWAP) += mtdswap.o
2425

2526
nftl-objs := nftlcore.o nftlmount.o
