[Bolt] Fix address translation for KASLR kernel #114261
This patch enables Bolt to analyze kernel addresses that have been randomized by KASLR. It parses memory map (MMap) entries within perf files to find the address mapping.
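To make the address mapping concrete, here is a minimal sketch of the arithmetic, assuming the kernel text is mapped as one contiguous executable segment. The names `KernelSegment`, `computeSlide`, and `toLinkAddress` are hypothetical and are not BOLT APIs; the only part taken from the patch is the subtraction `MMapAddress - SegInfo.Address`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the executable segment described by the
// vmlinux program header.
struct KernelSegment {
  uint64_t Address; // link-time address (p_vaddr in the ELF file)
  uint64_t Size;    // size in memory (p_memsz)
};

// Slide applied by KASLR: the address at which perf saw the kernel text
// mapped, minus the address recorded in the ELF file. This mirrors
// `MMapAddress - SegInfo.Address` from the patch.
inline uint64_t computeSlide(uint64_t MMapAddress, const KernelSegment &Seg) {
  return MMapAddress - Seg.Address;
}

// Translate a sampled run-time address back into the address space of the
// ELF file. Returns false if the result falls outside the segment.
inline bool toLinkAddress(uint64_t RuntimeAddr, uint64_t Slide,
                          const KernelSegment &Seg, uint64_t &LinkAddr) {
  assert(Seg.Size > 0 && "segment must not be empty");
  LinkAddr = RuntimeAddr - Slide;
  return LinkAddr >= Seg.Address && LinkAddr - Seg.Address < Seg.Size;
}
```

In this sketch, a kernel that was not randomized gives a slide of zero, so the translation becomes the identity and the same code path serves both cases.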
|
@llvm/pr-subscribers-bolt
Author: None (xur-llvm)
Changes: This patch enables Bolt to analyze kernel addresses that have been randomized by KASLR. It parses memory map (MMap) entries within perf files to find the address mapping.
Full diff: https://github.com/llvm/llvm-project/pull/114261.diff
3 Files Affected:
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index f246750209d6c4..e23bda097543d6 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -2024,6 +2024,10 @@ BinaryContext::getBaseAddressForMapping(uint64_t MMapAddress,
// Only consider executable segments.
if (!SegInfo.IsExecutable)
continue;
+ // For Linux kernel perf files, SegInfo.FileOffset and FileOffset are
+ // irrelevant.
+ if (IsLinuxKernel)
+ return MMapAddress - SegInfo.Address;
// FileOffset is got from perf event,
// and it is equal to alignDown(SegInfo.FileOffset, pagesize).
// If the pagesize is not equal to SegInfo.Alignment.
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index ffd693f9bbaed4..b31d661bed7562 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -530,26 +530,18 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
};
if (BC.IsLinuxKernel) {
- // Current MMap parsing logic does not work with linux kernel.
- // MMap entries for linux kernel uses PERF_RECORD_MMAP
- // format instead of typical PERF_RECORD_MMAP2 format.
- // Since linux kernel address mapping is absolute (same as
- // in the ELF file), we avoid parsing MMap in linux kernel mode.
- // While generating optimized linux kernel binary, we may need
- // to parse MMap entries.
-
// In linux kernel mode, we analyze and optimize
// all linux kernel binary instructions, irrespective
// of whether they are due to system calls or due to
// interrupts. Therefore, we cannot ignore interrupt
// in Linux kernel mode.
opts::IgnoreInterruptLBR = false;
- } else {
- prepareToParse("mmap events", MMapEventsPPI, ErrorCallback);
- if (parseMMapEvents())
- errs() << "PERF2BOLT: failed to parse mmap events\n";
}
+ prepareToParse("mmap events", MMapEventsPPI, ErrorCallback);
+ if (parseMMapEvents())
+ errs() << "PERF2BOLT: failed to parse mmap events\n";
+
prepareToParse("task events", TaskEventsPPI, ErrorCallback);
if (parseTaskEvents())
errs() << "PERF2BOLT: failed to parse task events\n";
@@ -1102,6 +1094,11 @@ ErrorOr<DataAggregator::PerfBranchSample> DataAggregator::parseBranchSample() {
return make_error_code(errc::no_such_process);
}
+ if (BC->IsLinuxKernel) {
+ // "-1" is the pid for the Linux kernel
+ MMapInfoIter = BinaryMMapInfo.find(-1);
+ }
+
while (checkAndConsumeFS()) {
}
@@ -1936,7 +1933,8 @@ DataAggregator::parseMMapEvent() {
}
StringRef Line = ParsingBuf.substr(0, LineEnd);
- size_t Pos = Line.find("PERF_RECORD_MMAP2");
+ // This would match both PERF_RECORD_MMAP and PERF_RECORD_MMAP2
+ size_t Pos = Line.find("PERF_RECORD_MMAP");
if (Pos == StringRef::npos) {
consumeRestOfLine();
return std::make_pair(StringRef(), ParsedInfo);
@@ -1944,6 +1942,9 @@ DataAggregator::parseMMapEvent() {
// Line:
// {<name> .* <sec>.<usec>: }PERF_RECORD_MMAP2 <pid>/<tid>: .* <file_name>
+ // Or:
+ // {<name> .* <sec>.<usec>: }PERF_RECORD_MMAP <-1 | pid>/<tid>: .*
+ // <file_name>
const StringRef TimeStr =
Line.substr(0, Pos).rsplit(':').first.rsplit(FieldSeparator).second;
@@ -1954,9 +1955,14 @@ DataAggregator::parseMMapEvent() {
// Line:
// PERF_RECORD_MMAP2 <pid>/<tid>: [<hexbase>(<hexsize>) .*]: .* <file_name>
+ // Or:
+ // PERF_RECORD_MMAP <-1 | pid>/<tid>: [<hexbase>(<hexsize>) .*]: .*
+ // <file_name>
StringRef FileName = Line.rsplit(FieldSeparator).second;
- if (FileName.starts_with("//") || FileName.starts_with("[")) {
+ if (FileName == "[kernel.kallsyms]_text")
+ FileName = "[kernel.kallsyms]";
+ else if (FileName.starts_with("//") || FileName.starts_with("[")) {
consumeRestOfLine();
return std::make_pair(StringRef(), ParsedInfo);
}
@@ -1983,8 +1989,11 @@ DataAggregator::parseMMapEvent() {
return make_error_code(llvm::errc::io_error);
}
- const StringRef OffsetStr =
- Line.split('@').second.ltrim().split(FieldSeparator).first;
+ const StringRef OffsetStr = Line.split('@')
+ .second.ltrim()
+ .split(FieldSeparator)
+ .first.split(']')
+ .first;
if (OffsetStr.getAsInteger(0, ParsedInfo.Offset)) {
reportError("expected mmaped page-aligned offset");
Diag << "Found: " << OffsetStr << "in '" << Line << "'\n";
@@ -2008,7 +2017,8 @@ std::error_code DataAggregator::parseMMapEvents() {
return EC;
std::pair<StringRef, MMapInfo> FileMMapInfo = FileMMapInfoRes.get();
- if (FileMMapInfo.second.PID == -1)
+ if (FileMMapInfo.first != "[kernel.kallsyms]" &&
+ FileMMapInfo.second.PID == -1)
continue;
if (FileMMapInfo.first == "(deleted)")
continue;
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 32ec7abe8b666a..ee88f04b5504da 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -530,8 +530,11 @@ Error RewriteInstance::discoverStorage() {
Phdr.p_vaddr, Phdr.p_memsz, Phdr.p_offset,
Phdr.p_filesz, Phdr.p_align, ((Phdr.p_flags & ELF::PF_X) != 0)};
if (BC->TheTriple->getArch() == llvm::Triple::x86_64 &&
- Phdr.p_vaddr >= BinaryContext::KernelStartX86_64)
+ Phdr.p_vaddr >= BinaryContext::KernelStartX86_64) {
BC->IsLinuxKernel = true;
+ BC->HasFixedLoadAddress = false;
+ }
+
break;
case ELF::PT_INTERP:
BC->HasInterpHeader = true;
@@ -995,8 +998,13 @@ void RewriteInstance::discoverFileObjects() {
}
if (!Section->isText()) {
- assert(SymbolType != SymbolRef::ST_Function &&
- "unexpected function inside non-code section");
+ // In the kernel, a function can live in a non-text section. For example,
+ // lkdtm_rodata_do_nothing() in ./drivers/misc/lkdtm/rodata.c is in
+ // the rodata section.
+ if (!BC->IsLinuxKernel) {
+ assert(SymbolType != SymbolRef::ST_Function &&
+ "unexpected function inside non-code section");
+ }
LLVM_DEBUG(dbgs() << "BOLT-DEBUG: rejecting as symbol is not in code\n");
registerName(SymbolSize);
continue;
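The parsing changes in `parseMMapEvent()` can be illustrated on a single line of text. The following is a minimal sketch that uses only the standard library instead of LLVM's `StringRef`; `MMapFields`, `parseU64`, and `parseMMapLine` are hypothetical names, and the line layout is assumed to be the one described in the comments of the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical container for the fields this sketch extracts.
struct MMapFields {
  long long PID = 0;
  uint64_t Base = 0;   // run-time start address of the mapping
  uint64_t Size = 0;   // length of the mapping
  uint64_t Offset = 0; // value that follows '@'
  std::string FileName;
};

// Parse an unsigned integer; base 0 lets strtoull accept a 0x prefix.
inline bool parseU64(std::string_view Text, uint64_t &Out) {
  std::string Buf(Text);
  if (Buf.empty())
    return false;
  char *End = nullptr;
  errno = 0;
  Out = std::strtoull(Buf.c_str(), &End, 0);
  assert(End != nullptr);
  return errno == 0 && End == Buf.c_str() + Buf.size();
}

inline std::optional<MMapFields> parseMMapLine(std::string_view Line) {
  constexpr auto NPos = std::string_view::npos;

  // A prefix search matches both PERF_RECORD_MMAP and PERF_RECORD_MMAP2.
  const size_t Pos = Line.find("PERF_RECORD_MMAP");
  if (Pos == NPos)
    return std::nullopt;

  MMapFields Info;

  // The file name is the last space-separated field.
  const size_t LastSpace = Line.rfind(' ');
  if (LastSpace == NPos || LastSpace < Pos)
    return std::nullopt;
  std::string_view Name = Line.substr(LastSpace + 1);
  if (Name == "[kernel.kallsyms]_text")
    Name = "[kernel.kallsyms]"; // normalize the kernel text mapping
  else if (Name.substr(0, 2) == "//" || Name.substr(0, 1) == "[")
    return std::nullopt; // anonymous and other special mappings
  Info.FileName = std::string(Name);

  // <pid>/<tid>: the kernel mapping is reported with pid -1.
  const size_t PidStart = Line.find(' ', Pos);
  const size_t Slash = Line.find('/', Pos);
  if (PidStart == NPos || Slash == NPos || Slash <= PidStart)
    return std::nullopt;
  const std::string PidText(Line.substr(PidStart + 1, Slash - PidStart - 1));
  char *PidEnd = nullptr;
  Info.PID = std::strtoll(PidText.c_str(), &PidEnd, 10);
  if (PidText.empty() || *PidEnd != '\0')
    return std::nullopt;

  // [<hexbase>(<hexsize>) @ <offset> ...]
  const size_t Open = Line.find('[', Pos);
  const size_t Paren = Line.find('(', Pos);
  const size_t Close = Line.find(')', Pos);
  const size_t At = Line.find('@', Pos);
  if (Open == NPos || Paren == NPos || Close == NPos || At == NPos ||
      !(Open < Paren && Paren < Close && Close < At))
    return std::nullopt;
  if (!parseU64(Line.substr(Open + 1, Paren - Open - 1), Info.Base) ||
      !parseU64(Line.substr(Paren + 1, Close - Paren - 1), Info.Size))
    return std::nullopt;

  // Offset: the text after '@' up to the next space. PERF_RECORD_MMAP closes
  // the bracket right after the offset, so cut at ']' as the patch does.
  std::string_view Rest = Line.substr(At + 1);
  Rest.remove_prefix(std::min(Rest.find_first_not_of(' '), Rest.size()));
  std::string_view OffsetText = Rest.substr(0, Rest.find(' '));
  OffsetText = OffsetText.substr(0, OffsetText.find(']'));
  if (!parseU64(OffsetText, Info.Offset))
    return std::nullopt;
  return Info;
}
```

As an assumed example (not output captured from a real profile), the line `swapper 0 0.000000: PERF_RECORD_MMAP -1/0: [0xffffffff9c000000(0x1e00000) @ 0xffffffff9c000000]: x [kernel.kallsyms]_text` would yield pid -1, the file name `[kernel.kallsyms]`, and an offset parsed without the trailing `]:`.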
|
|
I haven't checked yet, but this PR should fix #99373. |
|
Thank you for the patch, Ron (@xur-llvm). I verified that it works for vmlinux with KASLR. Sadly, at the same time it breaks non-KASLR vmlinux processing for perf2bolt. |
|
What is the symptom of the breakage? Can you share the perf file and the
vmlinux so that I can debug?
I just tried my patch on 6.12.0-rc6-1 with a non-KASLR profile, and it
generated exactly the same BOLT profile as perf2bolt without my patch.
This is on Arch Linux.
-Rong
On Sun, Nov 10, 2024 at 7:34 PM Maksim Panchenko wrote:
Thank you for the patch, Ron (@xur-llvm). I
verified that it works for vmlinux with KASLR. Sadly, at the same time it
breaks non-KASLR vmlinux processing for perf2bolt.
|
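On your system you must have matched vmlinux binary by build-id. When build-id doesn't match or doesn't exist, with this patch perf2bolt will try to match the kernel binary by name. In perf.data it's recorded as "[kernel.kallsyms]" hence the error from perf2bolt. The workaround is to rename "vmlinux" to "[kernel.kallsyms]" which I'd rather avoid since unlike in the user space there's no ambiguity about the kernel binary. |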
In bolt/lib/Rewrite/RewriteInstance.cpp:
- Phdr.p_vaddr >= BinaryContext::KernelStartX86_64)
+ Phdr.p_vaddr >= BinaryContext::KernelStartX86_64) {
BC->IsLinuxKernel = true;
+ BC->HasFixedLoadAddress = false;
Is there any way to detect KASLR by looking at the vmlinux ELF and only then set HasFixedLoadAddress?
|
OK, I see. This should be easy to fix: the logic that detects the kernel does
not use the name. So if this is a kernel, we can skip the match and use the
provided ELF file directly.
Or we could pass a flag for the kernel? (I don't think you would like this.)
-Rong
On Mon, Nov 11, 2024 at 9:16 PM Maksim Panchenko wrote:
What is the symptom of the breakage? Can you share the perf file and the
vmlinux so that I can debug? I just tried my patch on 6.12.0-rc6-1 for
non-KASLR profile and it generated exactly the same bolt profile using
perf2bolt without my patch.
On your system you must have matched vmlinux binary by build-id. When
build-id doesn't match or doesn't exist, with this patch perf2bolt will
try to match the kernel binary by name. In perf.data it's recorded as
"[kernel.kallsyms]" hence the error from perf2bolt. The workaround is to
rename "vmlinux" to "[kernel.kallsyms]" which I'd rather avoid since unlike
in the user space there's no ambiguity about the kernel binary.
|
|
When I worked on this patch, I wondered why HasFixedLoadAddress is
needed. To me, ASLR (KASLR) or not, there should be one code path.
Non-ASLR is (to me) a trivial case of ASLR, at least from the point of
view of parsing and reading the perf file.
I thought about removing this field, but it is also used to control
some optimizations, so I left it there.
It seems to me that the only upside of using HasFixedLoadAddress is
avoiding address adjustments, but those should be very cheap.
Is my understanding correct?
Thanks,
-Rong
On Mon, Nov 11, 2024 at 9:20 PM Maksim Panchenko wrote:
@maksfb commented on this pull request.
In bolt/lib/Rewrite/RewriteInstance.cpp:
> BC->IsLinuxKernel = true;
+ BC->HasFixedLoadAddress = false;
Is there any way to detect KASLR by looking at vmlinux ELF and only then set HasFixedLoadAddress.
|
One example where we use |
Sounds like a plan. No flag is needed. You currently check for "[kernel.kallsyms]" in the |
Use a fixed name for the kernel image to process kernel profiles, regardless of whether a build ID is present. This addresses the case where the provided kernel image lacks a matching build ID. This name, "[kernel.kallsyms]", is the default for kernel DSOs in the Linux kernel source code (see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/dso.c#n428). While "[guest.kernel.kallsyms]" is the kernel DSO name for a guest kernel, support for VM profiles is currently limited. Therefore, we can skip this name for now.
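The selection rule in this commit message can be sketched as follows. This is a simplified illustration and not the code in `DataAggregator`: `MMapEntry`, `shouldKeepMapping`, `collectMappings`, and the exact-name match used for user-space binaries are assumptions made for the example. The only facts taken from the discussion are the fixed kernel name "[kernel.kallsyms]", the kernel pid of -1, and the skipping of "(deleted)" entries.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, reduced view of one parsed mmap record.
struct MMapEntry {
  std::string FileName;
  int64_t PID;
  uint64_t MMapAddress;
};

constexpr const char *KernelDSOName = "[kernel.kallsyms]";
constexpr int64_t KernelPID = -1;

// Decide whether an mmap record belongs to the binary being processed.
inline bool shouldKeepMapping(const MMapEntry &E, bool IsLinuxKernel,
                              const std::string &BinaryName) {
  if (E.FileName == "(deleted)")
    return false;
  // Kernel mode: match on the fixed DSO name only, so the file name of the
  // vmlinux image and the presence of a build ID do not matter.
  if (IsLinuxKernel)
    return E.FileName == KernelDSOName;
  // User-space mode (assumed for this sketch): drop kernel records and
  // match the binary by its exact name.
  assert(!BinaryName.empty() && "user-space matching needs a binary name");
  if (E.PID == KernelPID)
    return false;
  return E.FileName == BinaryName;
}

// Collect the kept mappings keyed by pid; the kernel ends up under -1,
// which is the key the patch looks up when parsing branch samples.
inline std::map<int64_t, MMapEntry>
collectMappings(const std::vector<MMapEntry> &Events, bool IsLinuxKernel,
                const std::string &BinaryName) {
  std::map<int64_t, MMapEntry> ByPID;
  for (const MMapEntry &E : Events)
    if (shouldKeepMapping(E, IsLinuxKernel, BinaryName))
      ByPID.emplace(E.PID, E); // the first mapping per pid wins here
  return ByPID;
}
```

Keying the decision on the kernel flag rather than on the file name means a vmlinux image with any name, with or without a build ID, is paired with the kernel mapping in the profile.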
|
@maksfb: Maksim, could you test the updated patch to see if it works for the kernel image without build-id? |
Thanks. I'm back from vacation. Will test it. |
This patch enables Bolt to analyze kernel addresses that have been randomized by KASLR. It parses memory map (MMap) entries within perf files to find the address mapping. Note that kernel modules are still not handled.