
Commit b574425

[BOLT] DataAggregator supports binaries with multiple text segments
When a binary has multiple text segments, the Size is computed as the distance from the BaseAddress to the last address of these segments. The base addresses of all text segments must be the same.

Background: Larger binaries get two text segments mapped when loaded in memory. BOLT processes only the first, which does not have a correct BaseAddress, causing a wrong computation of a BinaryMMapInfo's size. Consequently, BOLT wrongly concludes that many of the samples fall outside the binary and ignores them. As a result, heatmap output excludes all those entries and the section hotness statistics are wrong. This bug is present in both the AArch64 and x86 backends.

---

This patch introduces the flag 'perf-script-events', which allows passing perf events without BOLT having to parse them using 'perf script'. The flag is used to pass a mock perf profile that has two memory mappings for a mock binary with two text segments. The size of the mapping is updated because `parseMMapEvents` now processes all text segments.

---

Example used in unit tests:

From `/proc/<BINARY PID>/maps`, we have 2 text mappings, say A and B.

```
abc0000000-abc1000000 r-xp 011c0000 103:01 1573523 BINARY
abc2000000-abca000000 r-xp 031d0000 103:01 1573523 BINARY
```

Size of text mappings:

| Mapping | Size   |
| ------- | ------ |
| A       | ~15MB  |
| B       | ~135MB |

---

Example on a real program:

```
2f7200000-2fabca000 r--p 00000000 bolted-binary
2fabd9000-2fe47c000 r-xp 039c9000 bolted-binary <- 1st txt segment
2fe48b000-2fe61d000 r--p 0727b000 bolted-binary
2fe62c000-2fe660000 rw-p 0740c000 bolted-binary
2fe660000-2fea4c000 rw-p 00000000
2fec00000-303dad000 r-xp 07a00000 bolted-binary <- 2nd (appears only on the bolted binary)
```
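The size rule described in the message can be sketched in isolation. This is a minimal illustration, not BOLT code: `TextMapping` and `spannedSize` are hypothetical names, and the base address is passed in as a given value because the message does not show how BOLT derives it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one executable mapping; not BOLT's MMapInfo.
struct TextMapping {
  uint64_t Start; // first address of the mapping
  uint64_t End;   // one past the last address of the mapping
};

// Size covering every text mapping, measured from a base address that all
// mappings are assumed to share.
uint64_t spannedSize(uint64_t BaseAddress,
                     const std::vector<TextMapping> &Maps) {
  uint64_t Size = 0;
  for (const TextMapping &M : Maps) {
    assert(M.End >= BaseAddress && "mapping ends below the base address");
    uint64_t Candidate = M.End - BaseAddress;
    if (Candidate > Size)
      Size = Candidate; // keep the furthest end address seen so far
  }
  return Size;
}
```

With the two unit-test mappings above and an assumed base address of `0xabbe000000`, the result grows from `0x3000000` (mapping A alone) to `0xC000000` once mapping B is included, so samples in B are no longer treated as outside the binary.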
1 parent 4bdbb44 commit b574425

1 file changed: +8 -11 lines changed

bolt/lib/Profile/DataAggregator.cpp

Lines changed: 8 additions & 11 deletions
@@ -2069,15 +2069,6 @@ std::error_code DataAggregator::parseMMapEvents() {
     if (FileMMapInfo.first == "(deleted)")
       continue;

-    // Consider only the first mapping of the file for any given PID
-    auto Range = GlobalMMapInfo.equal_range(FileMMapInfo.first);
-    bool PIDExists = llvm::any_of(make_range(Range), [&](const auto &MI) {
-      return MI.second.PID == FileMMapInfo.second.PID;
-    });
-
-    if (PIDExists)
-      continue;
-
     GlobalMMapInfo.insert(FileMMapInfo);
   }

@@ -2129,12 +2120,18 @@ std::error_code DataAggregator::parseMMapEvents() {
                << " using file offset 0x" << Twine::utohexstr(MMapInfo.Offset)
                << ". Ignoring profile data for this mapping\n";
         continue;
-      } else {
-        MMapInfo.BaseAddress = *BaseAddress;
       }
+      MMapInfo.BaseAddress = *BaseAddress;
     }

+    // Try to add MMapInfo to the map and update its size. Large binaries
+    // may span multiple text segments, so the mapping is inserted only on the
+    // first occurrence. If a larger section size is found, it will be updated.
     BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo));
+    uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
+    uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;
+    if (Size > BinaryMMapInfo[MMapInfo.PID].Size)
+      BinaryMMapInfo[MMapInfo.PID].Size = Size;
   }

   if (BinaryMMapInfo.empty()) {
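
The insert-then-grow pattern in the second hunk relies on `insert` leaving an existing key untouched, so the first mapping for a PID is kept and only its size is widened. A standalone sketch of that behaviour follows. `MapRecord`, `MapByPID`, and `recordMapping` are hypothetical names, the struct is reduced to the fields the hunk uses, and `std::unordered_map` is only an assumed container type for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>

// Hypothetical, reduced stand-in for the per-mapping record used in the diff.
struct MapRecord {
  int32_t PID;
  uint64_t MMapAddress;
  uint64_t Size;
  uint64_t BaseAddress;
};

using MapByPID = std::unordered_map<int32_t, MapRecord>;

void recordMapping(MapByPID &ByPID, const MapRecord &Rec) {
  // insert() does nothing if the PID is already present, so the record of
  // the first mapping (including its BaseAddress) is the one that is kept.
  ByPID.insert(std::make_pair(Rec.PID, Rec));
  MapRecord &Stored = ByPID[Rec.PID];
  // Widen the stored size so it reaches the end of the current mapping.
  uint64_t EndAddress = Rec.MMapAddress + Rec.Size;
  uint64_t Size = EndAddress - Stored.BaseAddress;
  if (Size > Stored.Size)
    Stored.Size = Size;
}
```

Note that even for the first mapping the stored size becomes the distance from the base address to the end of the mapping, which can exceed the mapping's own length when the base address lies below the mapping's start.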
