Skip to content

Commit 3032894

Browse files
prattmicgopherbot
authored andcommitted
runtime: make explicit nil check in heapSetTypeSmallHeader
This is another case very similar to CL 684015 and golang#74375. In spans with type headers, mallocgc always writes to the page before returning the allocated memory. This initial write is done by runtime.heapSetTypeSmallHeader. Prior to the write, the compiler inserts a nil check, implemented as a dummy instruction reading from memory. On a freshly mapped page, this read triggers a page fault, mapping the zero page read-only. Immediately afterwards, the write triggers another page fault, copying to a writeable page and performing a TLB flush. This problem is exacerbated as the process scales up. At GOMAXPROCS=6, the tile38 sweet benchmark spends around 0.1% of cycles directly handling these page faults. On the same machine at GOMAXPROCS=192, it spends about 2.7% of cycles directly handling these page faults. Replacing the read with an explicit nil check reduces the direct cost of these page faults down to around 0.1% at GOMAXPROCS=192. There are additional positive side-effects due to reduced contention, so the overall time spent in page faults drops from around 12.8% to 6.8%. Most of the remaining time in page faults is spent on automatic NUMA page migration (completely unrelated to this issue). Impact on the tile38 benchmark results: │ baseline │ cl704755 │ │ sec/op │ sec/op vs base │ Tile38QueryLoad-192 1.638m ± 3% 1.494m ± 5% -8.79% (p=0.002 n=6) │ baseline │ cl704755 │ │ average-RSS-bytes │ average-RSS-bytes vs base │ Tile38QueryLoad-192 5.384Gi ± 3% 5.399Gi ± 3% ~ (p=0.818 n=6) │ baseline │ cl704755 │ │ peak-RSS-bytes │ peak-RSS-bytes vs base │ Tile38QueryLoad-192 5.818Gi ± 1% 5.864Gi ± 2% ~ (p=0.394 n=6) │ baseline │ cl704755 │ │ peak-VM-bytes │ peak-VM-bytes vs base │ Tile38QueryLoad-192 7.121Gi ± 1% 7.180Gi ± 2% ~ (p=0.818 n=6) │ baseline │ cl704755 │ │ p50-latency-sec │ p50-latency-sec vs base │ Tile38QueryLoad-192 343.2µ ± 1% 313.2µ ± 3% -8.73% (p=0.002 n=6) │ baseline │ cl704755 │ │ p90-latency-sec │ p90-latency-sec vs base │ Tile38QueryLoad-192 1.662m ± 2% 1.603m ± 5% ~ (p=0.093 n=6) │ baseline │ cl704755 │ │ p99-latency-sec │ p99-latency-sec vs base │ Tile38QueryLoad-192 41.56m ± 8% 35.26m ± 18% -15.17% (p=0.026 n=6) │ baseline │ cl704755 │ │ ops/s │ ops/s vs base │ Tile38QueryLoad-192 87.89k ± 3% 96.36k ± 4% +9.64% (p=0.002 n=6) Updates golang#74375. Change-Id: I6a6a636c1a16261b6d5076f2e1b08524a6544d33 Reviewed-on: https://go-review.googlesource.com/c/go/+/704755 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Michael Knyszek <[email protected]> Auto-Submit: Michael Pratt <[email protected]>
1 parent ef05b66 commit 3032894

File tree

1 file changed

+20
-0
lines changed

1 file changed

+20
-0
lines changed

src/runtime/mbitmap.go

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -714,6 +714,26 @@ func heapSetTypeNoHeader(x, dataSize uintptr, typ *_type, span *mspan) uintptr {
714714
}
715715

716716
func heapSetTypeSmallHeader(x, dataSize uintptr, typ *_type, header **_type, span *mspan) uintptr {
717+
if header == nil {
718+
// This nil check and throw is almost pointless. Normally we would
719+
// expect header to never be nil. However, this is called on potentially
720+
// freshly-allocated virtual memory. As of 2025, the compiler-inserted
721+
// nil check is not a branch but a memory read that we expect to fault
722+
// if the pointer really is nil.
723+
//
724+
// However, this causes a read of the page, and operating systems may
725+
// take it as a hint to back the accessed memory with a read-only zero
726+
// page. However, we immediately write to this memory, which can then
727+
// force operating systems to have to update the page table and flush
728+
// the TLB.
729+
//
730+
// This nil check is thus an explicit branch instead of what the compiler
731+
// would insert circa 2025, which is a memory read instruction.
732+
//
733+
// See go.dev/issue/74375 for details of a similar issue in
734+
// spanInlineMarkBits.
735+
throw("runtime: pointer to heap type header nil?")
736+
}
717737
*header = typ
718738
if doubleCheckHeapSetType {
719739
doubleCheckHeapType(x, dataSize, typ, header, span)

0 commit comments

Comments
 (0)