
Commit d34a8a3

move doc of userland physical address tests to README
1 parent 6045b9f commit d34a8a3

7 files changed

+263
-148
lines changed


README.adoc

Lines changed: 235 additions & 4 deletions
@@ -4456,6 +4456,233 @@ Bibliography:
44564456
* https://stackoverflow.com/questions/39134990/mmap-of-dev-mem-fails-with-invalid-argument-for-virt-to-phys-address-but-addre/45127582#45127582
44574457
* https://stackoverflow.com/questions/43325205/can-we-use-virt-to-phys-for-user-space-memory-in-kernel-module
44584458

4459+
===== Userland physical address experiments
4460+
4461+
Only tested in x86_64.
4462+
4463+
The Linux kernel exposes physical addresses to userland through:
4464+
4465+
* `/proc/<pid>/maps`
4466+
* `/proc/<pid>/pagemap`
4467+
* `/dev/mem`
4468+
4469+
In this section we will play with them.
4470+
4471+
First get a virtual address to play with:
4472+
4473+
....
4474+
/virt_to_phys_test.out &
4475+
....
4476+
4477+
Source: link:kernel_module/user/virt_to_phys_test.c[]
4478+
4479+
Sample output:
4480+
4481+
....
4482+
vaddr 0x600800
4483+
pid 110
4484+
....
4485+
4486+
The program:
4487+
4488+
* allocates a `volatile` variable and sets its value to `0x12345678`
4489+
* prints the virtual address of the variable, and the program PID
4490+
* runs a `while` loop until the value of the variable gets changed from outside the program, e.g. by nasty tinkerers like us
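The three steps above can be sketched in C as follows. This is a minimal sketch, not the actual `virt_to_phys_test.c`: the helper names `wait_for_change` and `print_target`, and the `max_polls` / `sleep_seconds` parameters, are assumptions made for illustration (the real program loops without a bound).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* volatile forces a real memory load on every read, so that a write
 * coming from outside the process (e.g. through /dev/mem) is observed. */
volatile uint32_t i = 0x12345678;

/* Hypothetical helper: poll *p until it differs from initial.
 * Returns 1 if the value changed, 0 if max_polls polls elapsed first. */
int wait_for_change(volatile uint32_t *p, uint32_t initial,
                    unsigned max_polls, unsigned sleep_seconds)
{
    assert(p != NULL);
    for (unsigned n = 0; n < max_polls; n++) {
        if (*p != initial)
            return 1;
        sleep(sleep_seconds);
    }
    return 0;
}

/* Print what the tinkerer needs: where the variable lives and who owns it. */
void print_target(void)
{
    printf("vaddr %p\n", (void *)&i);
    printf("pid %ju\n", (uintmax_t)getpid());
}
```

A `main` would call `print_target()` and then keep calling `wait_for_change(&i, 0x12345678, ...)` until it returns 1, at which point it prints the new value of `i` and exits.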
4491+
4492+
Then, translate the virtual address to physical using `/proc/<pid>/maps` and `/proc/<pid>/pagemap`:
4493+
4494+
....
4495+
/virt_to_phys_user.out 110 0x600800
4496+
....
4497+
4498+
Sample output physical address:
4499+
4500+
....
4501+
0x7c7b800
4502+
....
4503+
4504+
Source: link:kernel_module/user/virt_to_phys_user.c[]
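The translation itself boils down to one `pread` and some bit arithmetic. The following is a sketch that assumes the pagemap layout documented by the kernel (one 64-bit entry per virtual page, page frame number in bits 0 to 54, present flag in bit 63). The function names are hypothetical and are not taken from `virt_to_phys_user.c`. On recent kernels the page frame number reads as zero without `CAP_SYS_ADMIN`, so this has to run as root.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte offset of the entry for vaddr inside /proc/<pid>/pagemap:
 * one 8-byte entry per virtual page. */
uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    return (vaddr / page_size) * sizeof(uint64_t);
}

/* Combine a page frame number with the in-page offset of vaddr. */
uint64_t phys_from_pfn(uint64_t pfn, uint64_t vaddr, uint64_t page_size)
{
    return pfn * page_size + vaddr % page_size;
}

/* Translate vaddr through an already opened /proc/<pid>/pagemap.
 * Returns 0 on success, -1 on read failure or if the page is not present. */
int virt_to_phys(int pagemap_fd, uint64_t vaddr, uint64_t *paddr)
{
    uint64_t entry;
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGE_SIZE);

    assert(paddr != NULL);
    if (pread(pagemap_fd, &entry, sizeof entry,
              (off_t)pagemap_offset(vaddr, page_size)) != (ssize_t)sizeof entry)
        return -1;
    if (!(entry >> 63))                        /* bit 63: page present */
        return -1;
    /* bits 0-54: page frame number */
    *paddr = phys_from_pfn(entry & ((UINT64_C(1) << 55) - 1), vaddr, page_size);
    return 0;
}
```

With 4 KiB pages, `phys_from_pfn(0x7c7b, 0x600800, 4096)` gives `0x7c7b800`, which matches the sample output above.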
4505+
4506+
Now we can verify that `virt_to_phys_user.out` gave the correct physical address in the following ways:
4507+
4508+
* <<qemu-xp>>
4509+
* <<dev-mem>>
4510+
4511+
Bibliography:
4512+
4513+
* https://stackoverflow.com/questions/17021214/decode-proc-pid-pagemap-entry/45126141#45126141
4514+
* https://stackoverflow.com/questions/6284810/proc-pid-pagemaps-and-proc-pid-maps-linux/45500208#45500208
4515+
4516+
====== QEMU xp
4517+
4518+
The `xp` <<qemu-monitor>> command reads memory at a given physical address.
4519+
4520+
First run `virt_to_phys_test.out` and `virt_to_phys_user.out` as described at <<userland-physical-address-experiments>>.
4521+
4522+
On a second terminal, use QEMU to read the physical address:
4523+
4524+
....
4525+
./qemumonitor 'xp 0x7c7b800'
4526+
....
4527+
4528+
Output:
4529+
4530+
....
4531+
0000000007c7b800: 0x12345678
4532+
....
4533+
4534+
Yes!!! We read the correct value from the physical address.
4535+
4536+
We could not, however, find a way to write to memory from the QEMU monitor, boring.
4537+
4538+
====== /dev/mem
4539+
4540+
`/dev/mem` exposes access to physical addresses, and we use it through the convenient `devmem` BusyBox utility.
4541+
4542+
First run `virt_to_phys_test.out` and `virt_to_phys_user.out` as described at <<userland-physical-address-experiments>>.
4543+
4544+
Next, read from the physical address:
4545+
4546+
....
4547+
devmem 0x7c7b800
4548+
....
4549+
4550+
Possible output:
4551+
4552+
....
4553+
Memory mapped at address 0x7ff7dbe01000.
4554+
Value at address 0X7C7B800 (0x7ff7dbe01800): 0x12345678
4555+
....
4556+
4557+
which shows that the physical memory contains the expected value `0x12345678`.
4558+
4559+
`0x7ff7dbe01000` is a new virtual address that `devmem` maps to the physical address to be able to read from it.
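That mapping can be reproduced by hand. The sketch below is an assumption about how a `devmem`-like tool can work, not BusyBox's actual code: `mmap` only accepts page-aligned file offsets, so the physical address is split into a page base and an in-page offset, which is why the output above shows a mapping at `...1000` and the value at `...1800`. The names `page_base`, `page_offset` and `devmem_read32` are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Physical address rounded down to the start of its page. */
uint64_t page_base(uint64_t paddr, uint64_t page_size)
{
    return paddr & ~(page_size - 1);
}

/* Offset of paddr inside its page. */
uint64_t page_offset(uint64_t paddr, uint64_t page_size)
{
    return paddr & (page_size - 1);
}

/* Read a 32-bit value at the 4-byte aligned physical address paddr
 * through /dev/mem. Returns 0 on success, -1 on failure
 * (e.g. not root, or CONFIG_STRICT_DEVMEM=y). */
int devmem_read32(uint64_t paddr, uint32_t *value)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGE_SIZE);
    int fd;
    void *map;

    assert(value != NULL);
    fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0)
        return -1;
    map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd,
               (off_t)page_base(paddr, page_size));
    close(fd);                 /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;
    *value = *(volatile uint32_t *)((char *)map + page_offset(paddr, page_size));
    munmap(map, page_size);
    return 0;
}
```

For `0x7c7b800` and 4 KiB pages, the page base is `0x7c7b000` and the offset is `0x800`.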
4560+
4561+
Modify the physical memory:
4562+
4563+
....
4564+
devmem 0x7c7b800 w 0x9abcdef0
4565+
....
4566+
4567+
After one second, we see on the screen:
4568+
4569+
....
4570+
i 9abcdef0
4571+
[1]+ Done /virt_to_phys_test.out
4572+
....
4573+
4574+
so the value changed, and the `while` loop exited!
4575+
4576+
This example requires:
4577+
4578+
* `CONFIG_STRICT_DEVMEM=n`, otherwise `devmem` fails with:
4579+
+
4580+
....
4581+
devmem: mmap: Operation not permitted
4582+
....
4583+
* `nopat` kernel parameter
4584+
4585+
which we set by default.
4586+
4587+
Bibliography: https://stackoverflow.com/questions/11891979/how-to-access-mmaped-dev-mem-without-crashing-the-linux-kernel
4588+
4589+
====== pagemap_dump.out
4590+
4591+
Dump the physical address of all pages mapped to a given process using `/proc/<pid>/maps` and `/proc/<pid>/pagemap`.
4592+
4593+
First run `virt_to_phys_test.out` and `virt_to_phys_user.out` as described at <<userland-physical-address-experiments>>. Suppose that the output was:
4594+
4595+
....
4596+
# /virt_to_phys_test.out &
4597+
vaddr 0x601048
4598+
pid 63
4599+
# /virt_to_phys_user.out 63 0x601048
4600+
0x1a61048
4601+
....
4602+
4603+
Now obtain the page map for the process:
4604+
4605+
....
4606+
/pagemap_dump.out 63
4607+
....
4608+
4609+
Sample output excerpt:
4610+
4611+
....
4612+
vaddr pfn soft-dirty file/shared swapped present library
4613+
400000 1ede 0 1 0 1 /virt_to_phys_test.out
4614+
600000 1a6f 0 0 0 1 /virt_to_phys_test.out
4615+
601000 1a61 0 0 0 1 /virt_to_phys_test.out
4616+
602000 2208 0 0 0 1 [heap]
4617+
603000 220b 0 0 0 1 [heap]
4618+
7ffff78ec000 1fd4 0 1 0 1 /lib/libuClibc-1.0.30.so
4619+
....
4620+
4621+
Source: link:kernel_module/user/pagemap_dump.c[]
4622+
4623+
Adapted from: https://github.com/dwks/pagemap/blob/8a25747bc79d6080c8b94eac80807a4dceeda57a/pagemap2.c
4624+
4625+
Meaning of the flags:
4626+
4627+
* `vaddr`: first virtual address of a page that belongs to the process. Notably:
4628+
+
4629+
....
4630+
./runtc readelf -l out/x86_64/buildroot/build/kernel_module-1.0/user/virt_to_phys_test.out
4631+
....
4632+
+
4633+
contains:
4634+
+
4635+
....
4636+
Type Offset VirtAddr PhysAddr
4637+
FileSiz MemSiz Flags Align
4638+
...
4639+
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
4640+
0x000000000000075c 0x000000000000075c R E 0x200000
4641+
LOAD 0x0000000000000e98 0x0000000000600e98 0x0000000000600e98
4642+
0x00000000000001b4 0x0000000000000218 RW 0x200000
4643+
4644+
Section to Segment mapping:
4645+
Segment Sections...
4646+
...
4647+
02 .interp .hash .dynsym .dynstr .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
4648+
03 .ctors .dtors .jcr .dynamic .got.plt .data .bss
4649+
....
4650+
+
4651+
from which we deduce that:
4652+
+
4653+
** `400000` is the text segment
4654+
** `600000` is the data segment
4655+
* `pfn`: page frame number. Append three hexadecimal zeroes to it, and you have the physical address of the page.
4656+
+
4657+
Three hexadecimal zeroes are 12 bits, and 2^12 bytes is 4 KiB, which is the size of a page.
4658+
+
4659+
For example, the virtual address `0x601000` has a `pfn` of `0x1a61`, which means that its physical address is `0x1a61000`.
4660+
+
4661+
This is consistent with what `virt_to_phys_user.out` told us: the virtual address `0x601048` has physical address `0x1a61048`.
4662+
+
4663+
`048` takes the place of the three trailing zeroes, and is the offset within the page.
4664+
+
4665+
Also, this address falls inside the page at `0x601000`, which as previously analyzed belongs to the data segment, the normal location for global variables such as ours.
4666+
* `soft-dirty`: TODO
4667+
* `file/shared`: TODO. `1` seems to indicate that the page can be shared across processes, possibly for read-only pages? E.g. the text segment has `1`, but the data has `0`.
4668+
* `swapped`: TODO swapped to disk?
4669+
* `present`: TODO vs swapped?
4670+
* `library`: which executable owns that page
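For reference, each column except `vaddr` and `library` is a bit field of the 64-bit pagemap entry. The sketch below follows the bit positions given in the kernel's pagemap documentation. `PagemapEntry`, `pfn`, `soft_dirty` and `file_page` mirror names visible in `pagemap_dump.c`, while the `swapped` and `present` fields and the `pagemap_decode` function are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t pfn;         /* bits 0-54: page frame number, if present */
    unsigned soft_dirty;  /* bit 55: written to since last cleared */
    unsigned file_page;   /* bit 61: file-mapped or shared anonymous */
    unsigned swapped;     /* bit 62: page is in swap */
    unsigned present;     /* bit 63: page is in RAM */
} PagemapEntry;

/* Split a raw 64-bit pagemap entry into its fields. */
void pagemap_decode(uint64_t raw, PagemapEntry *entry)
{
    assert(entry != NULL);
    entry->pfn = raw & ((UINT64_C(1) << 55) - 1);
    entry->soft_dirty = (unsigned)((raw >> 55) & 1);
    entry->file_page = (unsigned)((raw >> 61) & 1);
    entry->swapped = (unsigned)((raw >> 62) & 1);
    entry->present = (unsigned)((raw >> 63) & 1);
}
```

Under this layout, the first row of the sample output (`1ede 0 1 0 1`) corresponds to a raw entry with bits 63 and 61 set and `0x1ede` in the low bits.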
4671+
4672+
This program works in two steps:
4673+
4674+
* parse the human readable lines from `/proc/<pid>/maps`. This file contains lines of the form:
4675+
+
4676+
....
4677+
7ffff7b6d000-7ffff7bdd000 r-xp 00000000 fe:00 658 /lib/libuClibc-1.0.22.so
4678+
....
4679+
+
4680+
which tells us that:
4681+
+
4682+
** `7ffff7b6d000-7ffff7bdd000` is a virtual address range that belongs to the process, possibly containing multiple pages.
4683+
** `/lib/libuClibc-1.0.22.so` is the name of the library that owns that memory
4684+
* loop over each page of each address range, and ask `/proc/<pid>/pagemap` for more information about that page, including the physical address
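The first step can be sketched with a single `sscanf` per line. This is an illustrative sketch, not the parser used by `pagemap_dump.c`: the function names are hypothetical, and paths that contain spaces are cut at the first space.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Parse one line of /proc/<pid>/maps.
 * Stores the address range in *low / *high and the optional path in path,
 * which is left empty for anonymous mappings.
 * Returns 0 on success, -1 if the line does not start with a range. */
int maps_parse_line(const char *line, uint64_t *low, uint64_t *high,
                    char path[256])
{
    int fields;

    assert(line != NULL && low != NULL && high != NULL && path != NULL);
    path[0] = '\0';
    /* Fields: range perms offset dev inode [path]; %*s skips a field. */
    fields = sscanf(line, "%" SCNx64 "-%" SCNx64 " %*s %*s %*s %*s %255s",
                    low, high, path);
    return fields >= 2 ? 0 : -1;
}

/* Number of pages in the half-open range [low, high). */
uint64_t maps_page_count(uint64_t low, uint64_t high, uint64_t page_size)
{
    return (high - low) / page_size;
}
```

For the sample line above, the range spans `0x70000` bytes, i.e. 112 pages of 4 KiB, and the second step would query `/proc/<pid>/pagemap` once for each of them.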
4685+
44594686
=== Linux kernel tracing
44604687

44614688
Good overviews:
@@ -4880,6 +5107,12 @@ This example should handle interrupts from userland and print a message to stdou
48805107

48815108
TODO: what is the expected behaviour? I should have documented this when I wrote this stuff, but I'm too lazy to do it right now as I'm in the middle of a refactor :-)
48825109

5110+
UIO interface in a nutshell:
5111+
5112+
* blocking `read` / `poll`: waits until an interrupt arrives
5113+
* `write`: calls the `irqcontrol` callback. Default: write `1` or `0` to enable / disable interrupts.
5114+
* `mmap`: access device memory
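From userland, the `read` and `write` parts of that interface look roughly as follows. This is a minimal sketch with hypothetical helper names, assuming the usual UIO convention that both calls transfer a single 32-bit integer. It is not the code of `uio_read.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Enable (1) or disable (0) interrupts: the kernel forwards the 32-bit
 * value to the driver's irqcontrol callback. Returns 0 on success. */
int uio_irq_set(int fd, uint32_t enable)
{
    return write(fd, &enable, sizeof enable) == (ssize_t)sizeof enable ? 0 : -1;
}

/* Block until the next interrupt. On success, stores the total interrupt
 * count reported by the kernel in *count and returns 0. */
int uio_irq_wait(int fd, uint32_t *count)
{
    assert(count != NULL);
    return read(fd, count, sizeof *count) == (ssize_t)sizeof *count ? 0 : -1;
}
```

Here `fd` would come from opening the UIO device node read-write (for example `/dev/uio0`, an assumed name), and device memory would be reached with `mmap` on the same descriptor.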
5115+
48835116
Sources:
48845117

48855118
* link:kernel_module/user/uio_read.c[]
@@ -5805,7 +6038,7 @@ as:
58056038
Memory at feb54000
58066039
....
58076040

5808-
Then you can try messing with that address with:
6041+
Then you can try messing with that address with <<dev-mem>>:
58096042

58106043
....
58116044
devmem 0xfeb54000 w 0x12345678
@@ -6029,14 +6262,12 @@ Expected outcome after insmod:
60296262
* QEMU reports MMIO with printfs
60306263
* IRQs are generated and handled by this module, which logs to dmesg
60316264

6032-
Also without insmoding this module, try:
6265+
Without insmoding this module, try writing to the register with <<dev-mem>>:
60336266

60346267
....
60356268
devmem 0x101e9000 w 0x12345678
60366269
....
60376270

6038-
which touches the register from userland through `/dev/mem`.
6039-
60406271
We can also observe the interrupt with <<dummy-irq>>:
60416272

60426273
....

kernel_config_fragment/default

Lines changed: 1 addition & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
CONFIG_BLK_DEV_INITRD=y
2+
CONFIG_STRICT_DEVMEM=n
23
CONFIG_DYNAMIC_DEBUG=y
34
CONFIG_MODULE_SRCVERSION_ALL=y
45
CONFIG_OVERLAY_FS=y
@@ -101,19 +102,6 @@ CONFIG_X86_PTDUMP=y
101102

102103
## UIO
103104

104-
# Userspace drivers: allow you to handle IRQs and do memory IO from userland through a /dev file.
105-
#
106-
# Superseded by the more featureful VFIO.
107-
#
108-
# Documentation/DocBook/uio-howto.tmpl contains actual userland examples
109-
# for the generic examples under drivers/uio
110-
#
111-
# UIO interface in a nutshell:
112-
#
113-
# - blocking read / poll: waits until interrupts
114-
# - write: call irqcontrol callback. Default: 0 or 1 to enable / disable interrupts.
115-
# - mmap: access device memory
116-
117105
# All other UIO depend on this module.
118106
CONFIG_UIO=m
119107

kernel_module/user/README.adoc

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,3 @@
11
https://github.com/cirosantilli/linux-kernel-module-cheat#rootfs_overlay
22

33
. link:sched_getaffinity.c[]
4-
. link:usermem.c[]
5-
.. link:pagemap_dump.c[]
6-
. link:uio_read.c[]

kernel_module/user/pagemap_dump.c

Lines changed: 5 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -1,29 +1,4 @@
1-
/*
2-
Only tested in x86_64.
3-
4-
Adapted from: https://github.com/dwks/pagemap/blob/8a25747bc79d6080c8b94eac80807a4dceeda57a/pagemap2.c
5-
6-
- https://stackoverflow.com/questions/17021214/how-to-decode-proc-pid-pagemap-entries-in-linux/45126141#45126141
7-
- https://stackoverflow.com/questions/5748492/is-there-any-api-for-determining-the-physical-address-from-virtual-address-in-li
8-
- https://stackoverflow.com/questions/6284810/proc-pid-pagemaps-and-proc-pid-maps-linux/45500208#45500208
9-
10-
Dump the page map of a given process PID.
11-
12-
Data sources: /proc/PIC/{map,pagemap}
13-
14-
This program works in two steps:
15-
16-
- parse the human readable lines lines from `/proc/<pid>/maps`. This files contains lines of form:
17-
18-
7ffff7b6d000-7ffff7bdd000 r-xp 00000000 fe:00 658 /lib/libuClibc-1.0.22.so
19-
20-
which gives us:
21-
22-
- `7f8af99f8000-7f8af99ff000`: a virtual address range that belong to the process, possibly containing multiple pages.
23-
- `/lib/libuClibc-1.0.22.so` the name of the library that owns that memory.
24-
25-
- loop over each page of each address range, and ask `/proc/<pid>/pagemap` for more information about that page, including the physical address.
26-
*/
1+
/* https://github.com/cirosantilli/linux-kernel-module-cheat#pagemap_dump-out */
272

283
#define _XOPEN_SOURCE 700
294
#include <errno.h>
@@ -63,7 +38,7 @@ int main(int argc, char **argv)
6338
perror("open pagemap");
6439
return EXIT_FAILURE;
6540
}
66-
printf("addr pfn soft-dirty file/shared swapped present library\n");
41+
printf("vaddr pfn soft-dirty file/shared swapped present library\n");
6742
for (;;) {
6843
ssize_t length = read(maps_fd, buffer + offset, sizeof buffer - offset);
6944
if (length <= 0) break;
@@ -116,11 +91,11 @@ int main(int argc, char **argv)
11691
/* Get info about all pages in this page range with pagemap. */
11792
{
11893
PagemapEntry entry;
119-
for (uintptr_t addr = low; addr < high; addr += sysconf(_SC_PAGE_SIZE)) {
94+
for (uintptr_t vaddr = low; vaddr < high; vaddr += sysconf(_SC_PAGE_SIZE)) {
12095
/* TODO always fails for the last page (vsyscall), why? pread returns 0. */
121-
if (!pagemap_get_entry(&entry, pagemap_fd, addr)) {
96+
if (!pagemap_get_entry(&entry, pagemap_fd, vaddr)) {
12297
printf("%jx %jx %u %u %u %u %s\n",
123-
(uintmax_t)addr,
98+
(uintmax_t)vaddr,
12499
(uintmax_t)entry.pfn,
125100
entry.soft_dirty,
126101
entry.file_page,
