Commit 1140b13
Honor MAP_SHARED coherence across fork

Both fork paths (CoW shm and legacy IPC byte-copy) silently broke
MAP_SHARED visibility across fork: the child mapped the slab MAP_PRIVATE
or got a fresh byte copy, so writes from either side stayed local and
never reached the kernel page cache the parent shared with the file.
MAP_SHARED|MAP_ANONYMOUS, the standard parent-child IPC primitive used
by Postgres and other multi-process daemons, was equally broken.
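
For reference, the contract being restored can be sketched as a small host-side check. This is an illustrative sketch only, not the project's test: the function name `check_anon_shared_across_fork` and the two-pipe handshake are assumptions introduced here. It seeds a MAP_SHARED|MAP_ANONYMOUS region before fork, then confirms that a child write reaches the parent and a parent write reaches the child.

```c
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS on some libcs */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical check (not part of the commit): returns 0 when a
 * MAP_SHARED|MAP_ANONYMOUS region stays coherent in both directions
 * across fork(), nonzero otherwise.
 */
static int check_anon_shared_across_fork(void)
{
    int to_parent[2], to_child[2], status = 0, child_write_seen, ok;
    char token = 'x';
    volatile int *cell;
    pid_t pid;

    cell = mmap(NULL, 2 * sizeof(int), PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (cell == MAP_FAILED)
        return -1;
    if (pipe(to_parent) != 0 || pipe(to_child) != 0)
        return -1;

    cell[0] = 41; /* pre-fork seed */
    cell[1] = 0;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: drop the pipe ends it does not use. */
        close(to_parent[0]);
        close(to_child[1]);
        if (cell[0] != 41)
            _exit(1);                 /* seed must be visible */
        cell[0] = 42;                 /* child write */
        if (write(to_parent[1], &token, 1) != 1)
            _exit(2);
        if (read(to_child[0], &token, 1) != 1)
            _exit(3);
        _exit(cell[1] == 7 ? 0 : 4);  /* parent write must be visible */
    }

    /* Parent: drop the pipe ends it does not use. */
    close(to_parent[1]);
    close(to_child[0]);

    ok = (read(to_parent[0], &token, 1) == 1);
    child_write_seen = ok && (cell[0] == 42);
    cell[1] = 7;                      /* parent write */
    ok = (write(to_child[1], &token, 1) == 1) && ok;

    if (waitpid(pid, &status, 0) != pid)
        ok = 0;
    close(to_parent[0]);
    close(to_child[1]);
    munmap((void *)cell, 2 * sizeof(int));

    return (ok && child_write_seen &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Calling `check_anon_shared_across_fork()` from a `main` and treating a nonzero return as failure is enough to exercise it. If the child had received a private copy of the region, as described above, the `cell[0] == 42` check in the parent would be the one to fail.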

Three pieces close the gap:

1. Parent-side conversion (mmap_fork_prepare_anon_shared, with
   commit/abort wrappers). While siblings are quiesced the fork
   thread walks live regions, promotes each MAP_SHARED|MAP_ANONYMOUS
   region without a backing fd into a memfd-style overlay
   (mkstemp+unlink+ftruncate, pwrite-seed from host_base, host
   MAP_FIXED|MAP_SHARED via the new hvf_apply_file_overlay_quiesced
   helper, mark_overlay_metadata_range), and pre-stages per-region
   dup() fds so a transient EMFILE rolls back cleanly. The candidate
   filter skips regions whose host-page-rounded tail would alias a
   neighbor mapping. The transactional commit/abort wrappers let the
   fork-IPC failure path roll back the in-place conversion (overlay
   teardown plus region metadata restore) before resuming siblings;
   abort validates every captured snapshot before tearing down so a
   sibling-drift past the quiesce timeout does not leave host VA out
   of sync with semantic state. forkipc.c logs a warning when abort
   returns a partial failure so the parent's stale state is visible
   in post-mortem.

2. Child-side restoration (mmap_fork_restore_overlays). The recv
   path now snapshots parent overlay_active/start/end (and a new
   parent_had_fd[] mirror) before clearing inherited state, then
   re-runs hvf_apply_file_overlay against the saved overlay span
   once SCM_RIGHTS delivers the backing fds. The inner quiesce is a
   no-op since no worker vCPUs exist yet.

3. Pre-existing fork-IPC alignment bug. The old recv_backing_fds
   filter (!MAP_ANONYMOUS && offset != -1) matched the shim region
   (LINUX_MAP_PRIVATE, offset 0) and ELF text segments and silently
   stole incoming SCM_RIGHTS fds, leaving the actual file-backed
   regions with backing_fd=-1. The receiver now uses parent_had_fd[]
   as the filter so its iteration order matches the sender's
   "backing_fd >= 0" filter exactly. Unassigned fds are closed
   instead of leaked.
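
The parent-side promotion in item 1 can be sketched roughly as follows. This is a minimal sketch of the mkstemp+unlink+ftruncate, pwrite-seed, MAP_FIXED|MAP_SHARED sequence only. The helper name `promote_to_file_overlay`, the `/tmp` template, and the fixed read/write protection are assumptions made here for illustration. The real path goes through hvf_apply_file_overlay_quiesced and also updates region metadata, neither of which is reproduced.

```c
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS on some libcs */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace the readable, page-aligned mapping at
 * host_base with a file-backed MAP_SHARED mapping that holds the same
 * bytes. Returns the backing fd, or -1 with the old mapping untouched.
 */
static int promote_to_file_overlay(void *host_base, size_t len)
{
    char path[] = "/tmp/overlay-XXXXXX"; /* assumed location */
    size_t done = 0;
    void *mapped;
    int fd;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file now lives only as long as fd or a mapping */

    if (ftruncate(fd, (off_t)len) != 0)
        goto fail;

    /* Seed the file with the region's current contents. */
    while (done < len) {
        ssize_t n = pwrite(fd, (const char *)host_base + done,
                           len - done, (off_t)done);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            goto fail;
        done += (size_t)n;
    }

    /* Map the file over the same host address range. */
    mapped = mmap(host_base, len, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_FIXED, fd, 0);
    if (mapped == MAP_FAILED)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

Because the seed is written before the MAP_FIXED remap, a failure at any earlier step leaves the original anonymous mapping in place, which is the property the commit/abort wrappers rely on at a larger scale. The returned fd is what a fork path could then hand to the child, for example over SCM_RIGHTS, so both sides map the same file.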

hvf_apply_file_overlay and hvf_remove_file_overlay are each split into
a public variant that handles thread_quiesce_siblings and a _quiesced
inner variant that the parent fork-prep / abort paths call without a
nested barrier.
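
The shape of that split can be sketched as below. All names here (`quiesce_siblings`, `resume_siblings`, `apply_overlay`, `apply_overlay_quiesced`, `fork_prepare`) and the counters are hypothetical stand-ins, and the sketch assumes the barrier is not reentrant, which is what makes a nested call a problem.

```c
#include <assert.h>
#include <stddef.h>

static int barrier_depth;    /* 1 while siblings are stopped */
static int barrier_entries;  /* how many times the barrier was taken */
static int overlays_applied; /* stand-in for the real overlay work */

/* Stand-in barrier: assumed non-reentrant, so nesting is a bug. */
static void quiesce_siblings(void)
{
    assert(barrier_depth == 0);
    barrier_depth++;
    barrier_entries++;
}

static void resume_siblings(void)
{
    assert(barrier_depth == 1);
    barrier_depth--;
}

/* Inner variant: the caller must already hold the barrier. */
static int apply_overlay_quiesced(size_t region)
{
    (void)region;
    if (barrier_depth != 1)
        return -1;
    overlays_applied++;
    return 0;
}

/* Public variant: takes the barrier around a single call. */
static int apply_overlay(size_t region)
{
    int rc;

    quiesce_siblings();
    rc = apply_overlay_quiesced(region);
    resume_siblings();
    return rc;
}

/* Fork-prep style caller: one barrier, many inner calls. */
static int fork_prepare(size_t nregions)
{
    int rc = 0;
    size_t i;

    quiesce_siblings();
    for (i = 0; i < nregions && rc == 0; i++)
        rc = apply_overlay_quiesced(i);
    resume_siblings();
    return rc;
}
```

Ordinary callers use `apply_overlay`, while a path that already stopped its siblings calls the inner variant directly, so the barrier is taken once per fork rather than once per region.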

Locked in by tests/test-cross-fork-mapshared.c (3 cases: file-backed
mkstemp, MAP_SHARED|MAP_ANONYMOUS, /dev/shm via shm_open). Each case
verifies pre-fork seed visibility, child-write-visible-to-parent,
parent-write-visible-to-child, and on-disk reconciliation. All three
pass against Linux ground truth via tests/qemu-runner.sh.

1 parent c5ccb22 commit 1140b13
6 files changed
Lines changed: 1202 additions & 137 deletions