-
Notifications
You must be signed in to change notification settings - Fork 15.2k
wasm-ld: Implement function pointer alignment #105532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-wasm Author: Ethan O'Brien (ethanaobrien) ChangesWIP implementation of function pointer alignment. Currently all values are statically declared.
Previous discussion: emscripten-core/emscripten#22369 Full diff: https://github.com/llvm/llvm-project/pull/105532.diff 1 Files Affected:
diff --git a/lld/wasm/SyntheticSections.cpp b/lld/wasm/SyntheticSections.cpp
index f02f55519a2512..13333a67378681 100644
--- a/lld/wasm/SyntheticSections.cpp
+++ b/lld/wasm/SyntheticSections.cpp
@@ -563,8 +563,24 @@ void ElemSection::addEntry(FunctionSymbol *sym) {
// They only exist so that the calls to missing functions can validate.
if (sym->hasTableIndex() || sym->isStub)
return;
+
+ uint32_t padding = 0;
+ uint64_t alignment = 1;
+
+ if (indirectFunctions.size() == 0 && padding > 0) {
+ for (uint32_t i=0; i<padding; i++) {
+ indirectFunctions.push_back(nullptr);
+ }
+ }
+
sym->setTableIndex(config->tableBase + indirectFunctions.size());
indirectFunctions.emplace_back(sym);
+
+ if (alignment > 1) {
+ for (uint32_t i=0; i<alignment-1; i++) {
+ indirectFunctions.push_back(nullptr);
+ }
+ }
}
void ElemSection::writeBody() {
@@ -602,6 +618,12 @@ void ElemSection::writeBody() {
writeUleb128(os, indirectFunctions.size(), "elem count");
uint32_t tableIndex = config->tableBase;
for (const FunctionSymbol *sym : indirectFunctions) {
+ if (sym == nullptr) {
+ (void) tableIndex;
+ writeUleb128(os, 0, "function index");
+ ++tableIndex;
+ continue;
+ }
assert(sym->getTableIndex() == tableIndex);
(void) tableIndex;
writeUleb128(os, sym->getFunctionIndex(), "function index");
|
|
@sbc100 Could I get some help cleaning this PR up and adding CLI arguments to influence these variables? I've tried a few times the last few months but don't seem to be familiar enough with the codebase to know how. |
To add new command line flags you would edit |
|
Thanks! Before I do this, do you have opinions on what the flags should be? From what I've been able to tell, to mimic how |
|
Since we are implementing this as a linker flag and not a compiler flag I don't think there any need to mimik the existing compiler flags. I think simply I don't think need a separate padding flag since we already have |
|
Thanks! This has been added, and is now ready for review. Should this option also be added as an emscripten option? It seems it already was once upon a time. https://github.com/emscripten-core/emscripten/blob/b94d6e6ced33576a1ad96c7f6e701788c6f49211/src/settings.js#L2209C5-L2209C31 |
|
Merge this! :D |
|
Hey @sbc100 is there any info on getting this reviewed? Thanks |
|
bump |
|
Sorry I had some comments that were "Pending" since for a long time. |
You can test this locally with the following command:git-clang-format --diff 9a95c097d0466c594f40a4ba9ced8a155574fdff 50aa21781adc97bad015d1fab5ad3415dfefcd58 --extensions cpp,h -- lld/wasm/Config.h lld/wasm/Driver.cpp lld/wasm/SyntheticSections.cppView the diff from clang-format here.diff --git a/lld/wasm/Driver.cpp b/lld/wasm/Driver.cpp
index d8fd2b7820..4a0e923868 100644
--- a/lld/wasm/Driver.cpp
+++ b/lld/wasm/Driver.cpp
@@ -636,7 +636,8 @@ static void readConfigs(opt::InputArgList &args) {
LLVM_DEBUG(errorHandler().verbose = true);
ctx.arg.tableBase = args::getInteger(args, OPT_table_base, 0);
- ctx.arg.functionPointerAlignment = args::getInteger(args, OPT_function_pointer_alignment, 0);
+ ctx.arg.functionPointerAlignment =
+ args::getInteger(args, OPT_function_pointer_alignment, 0);
ctx.arg.globalBase = args::getInteger(args, OPT_global_base, 0);
ctx.arg.initialHeap = args::getInteger(args, OPT_initial_heap, 0);
ctx.arg.initialMemory = args::getInteger(args, OPT_initial_memory, 0);
diff --git a/lld/wasm/SyntheticSections.cpp b/lld/wasm/SyntheticSections.cpp
index 3fcef7cb3d..5ce0c5cbb4 100644
--- a/lld/wasm/SyntheticSections.cpp
+++ b/lld/wasm/SyntheticSections.cpp
@@ -570,7 +570,7 @@ void ElemSection::addEntry(FunctionSymbol *sym) {
indirectFunctions.emplace_back(sym);
if (ctx.arg.functionPointerAlignment > 1) {
- for (uint32_t i=0; i<ctx.arg.functionPointerAlignment-1; i++) {
+ for (uint32_t i = 0; i < ctx.arg.functionPointerAlignment - 1; i++) {
indirectFunctions.push_back(nullptr);
}
}
@@ -611,7 +611,7 @@ void ElemSection::writeBody() {
uint32_t tableIndex = ctx.arg.tableBase;
for (const FunctionSymbol *sym : indirectFunctions) {
if (sym == nullptr) {
- (void) tableIndex;
+ (void)tableIndex;
writeUleb128(os, 0, "function index");
++tableIndex;
continue;
|
|
All good @sbc100, I've also had that happen a few times too many 😄. This PR should be good to go now, with requested changes |
|
Hey @sbc100, is there anything this is waiting on? |
sbc100
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. I think we still need a test for this new option. lgtm once we have a test.
|
May be a silly question - could you link where the emscripten tests are (so I can have something to base it off of and know where to put it). I'm unfamiliar with how this repo is organized |
|
The wasm-ld tests live in |
|
Hey @sbc100! Sorry for the delay, I finally got the time to figure out how to write a test. I've added tests for default, an alignmnet of 2, and an alignment of 3 |
|
Hey @sbc100, do you have any advise on what other alignment I'm missing to get this working? |
What exactly are you asking here? I looks like the current patch is causing a crash? I would try to reproduce it using a simple example, and build with debug info so you can see where the crash is happening. If your example is simple enough you it should be possible so simply decompile it and see where the bad function pointer is. |
It would appear adjusting the first function of the table to be aligned rather than at index zero causes the crash, likely something thinks the entry function isnt where it actually is from my findings? |
It shouldn't matter where any given function is in the table, and there is nothing special about the any entry or first function (all functions pointers are just table indexes). |
|
I would just build a simple example that makes a single indirect call to an empty function. It should be obivous from the disassembly if its address in the table is wrong. |
It appears I was incorrect. You're right, main still fires how it should. The issue is with apparently stdout somehow. I have this test program #include <stdio.h>
int main() {
printf("Potato?\n");
return 0;
}Compiled with: And the error appears to be tied to the call to print (or specifically, this line) Or similar happens if attempting to write data to a file: What I dont understand is why changing the index of the first function broke this? |
|
I would try with an even simpler program: You should be able to see the constant value that main returns and verify it is correct. |
Yep, this is correct |
|
@sbc100 - I do want to try and finally get this finished and merged, though I don't know where to go. Things like writing to a file/stdout seem to be completely broken when changing the alignment of the first function and I don't know why or enough of this program to debug why. The small code snippet you shared involving no io works fine, though io seems completely broken. What's your opinion on this? Would always aligning at index 0 be an issue? I'm not sure what to do from here |
|
I've rolled back to the last working change. It seems there's something else that doesnt like aligning the index of the first function, as per previous comments |
That doesn't seems reasonable though does it? Having an option that aligns all function pointers except the first one seems very odd. I think we/you need to figure out what is going on a fix the issue such that all function pointers are aligned, not just some. |
|
Yes, I agree. I'm just not sure what else to do and don't know enough how the codebase works to debug this deeper It seems to happen when writing to stdout or writing a file, |
|
Can you also share the main function you uses in the reproducer. Almost certainly this is related to relocations since I believe the function pointer that are part of the standard streams are setup statically which means writing relocations to the data section that link time. See these three lines: https://github.com/emscripten-core/emscripten/blob/9110edd6366ae9f40fb91b1317f18d41df9df317/system/lib/libc/musl/src/stdio/stderr.c#L12-L14 My guess is you should be able to reproduce without stdio with a simple static data structure like the one listed above. It should be fairly easy to confirm statically using a small example if the linker has written the correct values (for functions pointers) into the data section. |





WIP implementation of function pointer alignment. Currently all values are statically declared.
padding: Waitsxvalues before beginning the first function pointer. This can be useful for users needing to align on even numbers, or on odd numbers. This should possibly be equal totableBaseby default?alignment: Insertsx-1nullptrvalues into the function table.Previous discussion: emscripten-core/emscripten#22369