Merged
Changes from 9 commits
Commits
20 commits
859ed81
Use GetExternalSymbolSymbol for MO_ExternalSymbol.
weiweichen Mar 28, 2025
5b74939
Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…
weiweichen Mar 28, 2025
ec3e34e
Fix formatting.
weiweichen Mar 28, 2025
cf82df0
Make change more explicit.
weiweichen Mar 28, 2025
fe9668b
Fix English.
weiweichen Mar 28, 2025
a2433c9
Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…
weiweichen Mar 28, 2025
e395289
Add unittest.
weiweichen Mar 29, 2025
e7dedc3
Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…
weiweichen Mar 29, 2025
fc41f66
Fix M68k backend.
weiweichen Mar 29, 2025
bc9791b
Add more comments.
weiweichen Apr 3, 2025
e4321ce
Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…
weiweichen Apr 3, 2025
bf64c40
Tidy up unittest according to review comment.
weiweichen Apr 4, 2025
44dd9e3
Try to use AsmPrinter.OutContext for Ctx in X86MCInstLower.
weiweichen Apr 4, 2025
14009cc
Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…
weiweichen Apr 4, 2025
d504073
Apply change to M68K too.
weiweichen Apr 4, 2025
c3d31b2
Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…
weiweichen Apr 4, 2025
b9dcf9a
Revert change in cmake file.
weiweichen Apr 4, 2025
1deea3b
Make unittest skip if X86 backend is not built.
weiweichen Apr 4, 2025
65f158f
Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…
weiweichen Apr 4, 2025
44cc5a6
Remove commented out code.
weiweichen Apr 4, 2025
8 changes: 6 additions & 2 deletions llvm/lib/Target/M68k/M68kMCInstLower.cpp
@@ -65,8 +65,12 @@ M68kMCInstLower::GetSymbolFromOperand(const MachineOperand &MO) const {
   }

   Name += Suffix;
-  if (!Sym)
-    Sym = Ctx.getOrCreateSymbol(Name);
+  if (!Sym) {
+    if (MO.isSymbol())
+      Sym = AsmPrinter.OutContext.getOrCreateSymbol(Name);
+    else
+      Sym = Ctx.getOrCreateSymbol(Name);
+  }

   return Sym;
 }
11 changes: 9 additions & 2 deletions llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -192,8 +192,15 @@ MCSymbol *X86MCInstLower::GetSymbolFromOperand(const MachineOperand &MO) const {
   }

   Name += Suffix;
-  if (!Sym)
-    Sym = Ctx.getOrCreateSymbol(Name);
+  if (!Sym) {
+    // If new MCSymbol needs to be created for
+    // MachineOperand::MO_ExternalSymbol, create it as a symbol
+    // in AsmPrinter's OutContext.
+    if (MO.isSymbol())
+      Sym = AsmPrinter.OutContext.getOrCreateSymbol(Name);
+    else
+      Sym = Ctx.getOrCreateSymbol(Name);
Collaborator

Oh. This is starting to make more sense, now. Normally, there's only one MCContext, but because you're doing this "parallel codegen" thing, you're passing a different MCContext to construct the MachineFunction... and the coupling between a symbol and its context is loose enough that you can pass a symbol from the wrong MCContext to the AsmPrinter, and sort of get away with it.

If preventing that is the goal, why not change X86MCInstLower::X86MCInstLower instead? Just change Ctx(mf.getContext()) to Ctx(asmprinter.OutContext).

That said, more generally, if you want everything to work reliably, you'll probably need to completely eliminate the MCContext reference from MachineFunction. Otherwise, you'll continue to run into issues where the MCSymbol is from the wrong context. Maybe doable, but involves a significant amount of refactoring to avoid constructing MCSymbols early... and I'm not sure if there are any other weird interactions here.
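
The loose coupling described above can be illustrated with a toy model. This is only a sketch: `ToySymbol`, `ToyContext`, and `sameSymbol` are hypothetical stand-ins, not the real `MCSymbol` or `MCContext`. It assumes each context interns symbols by name in its own table, so asking two contexts for the same name yields two distinct objects of the same type.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Toy stand-ins for MCSymbol and MCContext; not the real LLVM classes.
struct ToySymbol {
  std::string Name;
};

class ToyContext {
  std::unordered_map<std::string, std::unique_ptr<ToySymbol>> Symbols;

public:
  // Returns the symbol interned under Name, creating it on first use.
  ToySymbol *getOrCreateSymbol(const std::string &Name) {
    std::unique_ptr<ToySymbol> &Slot = Symbols[Name];
    if (!Slot)
      Slot = std::make_unique<ToySymbol>(ToySymbol{Name});
    return Slot.get();
  }

  // Returns nullptr when Name was never created in this context.
  ToySymbol *lookupSymbol(const std::string &Name) const {
    auto It = Symbols.find(Name);
    return It == Symbols.end() ? nullptr : It->second.get();
  }
};

// True when both contexts hand back the same object for Name.
bool sameSymbol(ToyContext &A, ToyContext &B, const std::string &Name) {
  return A.getOrCreateSymbol(Name) == B.getOrCreateSymbol(Name);
}
```

In this model nothing in the `ToySymbol*` type records which context produced it, which is why a symbol from the "wrong" context can be handed around without any compile-time complaint.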

Contributor Author

> If preventing that is the goal, why not change X86MCInstLower::X86MCInstLower instead? Just change Ctx(mf.getContext()) to Ctx(asmprinter.OutContext).

Hmm, this probably won't work. It would give MO_ExternalSymbol a unique MCSymbol, but we still need mf.getContext() for most of the codegen when this function pass runs on the corresponding MachineFunction. Ctx has to be mf.getContext() because most of the codegen for this MachineFunction lives in that MCContext. AsmPrinter's OutContext is just for the output here.

Yeah, in most cases AsmPrinter.OutContext is the same as mf.getContext() because there is just one MCContext in the whole pipeline. I'm not sure what motivated the other backends to also use AsmPrinter.OutContext for the same case, but the change does make them look more consistent with each other 😀
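
The branch under discussion can be sketched with a toy model to show why it changes nothing when there is a single context. All names here (`ToyContext`, `createSymbol`, `IsExternalSymbol`) are hypothetical and only mirror the shape of the patched code; they are not LLVM APIs.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy context: just records which symbol names were created in it.
struct ToyContext {
  std::set<std::string> Names;
  void getOrCreateSymbol(const std::string &Name) { Names.insert(Name); }
  bool has(const std::string &Name) const { return Names.count(Name) != 0; }
};

// Mirrors the shape of the patched branch: external symbols go to the
// printer's context, everything else stays in the function's context.
void createSymbol(bool IsExternalSymbol, const std::string &Name,
                  ToyContext &FnCtx, ToyContext &OutCtx) {
  if (IsExternalSymbol)
    OutCtx.getOrCreateSymbol(Name);
  else
    FnCtx.getOrCreateSymbol(Name);
}
```

When the same object is passed for both `FnCtx` and `OutCtx`, both branches insert into the same table, so the split only has an observable effect in a setup with two distinct contexts.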

Contributor Author

> That said, more generally, if you want everything to work reliably, you'll probably need to completely eliminate the MCContext reference from MachineFunction. Otherwise, you'll continue to run into issues where the MCSymbol is from the wrong context. Maybe doable, but involves a significant amount of refactoring to avoid constructing MCSymbols early... and I'm not sure if there are any other weird interactions here.

Interesting idea, but I need to think more about what this entails. As you said, it will probably be a significant amount of refactoring, so maybe it could be a follow-up with more concrete considerations for a larger-scale refactoring?

(Also, very selfishly speaking, this PR will significantly help our MCLinker function correctly; otherwise half of the tests are failing now 😢, so it would be great to get something in first so that we can keep upgrading weekly 🙏.)

Collaborator

> Ctx has to be mf.getContext() as most of the codegen for this MachineFunction is in this MCContext.

I'm not really understanding what this means... I guess you've sort of informally partitioned things so that "local" symbols are created in the function's MCContext, and "global" symbols are created in the global MCContext? I don't think that's really sustainable... if we're going to consistently partition symbols, the two kinds of symbols can't both just be MCSymbol*, or you'll inevitably trip over issues in more obscure cases where we use the wrong MCContext.


I don't think the patch in its current form will cause any immediate issues, but I don't want to commit to a design that hasn't really been reviewed by LLVM community, and probably won't be approved in its current form because it's very confusing.

Contributor Author

@weiweichen, Mar 31, 2025

> > Ctx has to be mf.getContext() as most of the codegen for this MachineFunction is in this MCContext.

> I'm not really understanding what this means... I guess you've sort of informally partitioned things so that "local" symbols are created in the function's MCContext, and "global" symbols are created in the global MCContext? I don't think that's really sustainable... if we're going to consistently partition symbols, the two kinds of symbols can't both just be MCSymbol*, or you'll inevitably trip over issues in more obscure cases where we use the wrong MCContext.

This is a valid concern, but at this point we are willing to take the risk of running into issues in the future in order to unblock all of our tests (we don't have a substantial example right now showing this will be a big problem, and worrying about what-ifs is hard 😞). I'm definitely open to suggestions on how to make the MCLinker more solid (maybe after/during my talk next month?).

> I don't think the patch in its current form will cause any immediate issues, but I don't want to commit to a design that hasn't really been reviewed by LLVM community, and probably won't be approved in its current form because it's very confusing.

I understand your reservations, and thank you for being thorough here! I do want to point out, though, that this change has already been applied to most of the other backends (and I imagine those changes were reviewed and approved by the community?). So I'd push back a bit on the assumption that this is "a design that hasn't really been reviewed by LLVM community".

> and probably won't be approved in its current form because it's very confusing.

Do you have any suggestions on which part is confusing and what could be added to make it less so? 🙏

Contributor Author

@weiweichen, Apr 1, 2025

> I don't think that's really sustainable... if we're going to consistently partition symbols, the two kinds of symbols can't both just be MCSymbol*, or you'll inevitably trip over issues in more obscure cases where we use the wrong MCContext.

Why can't both kinds be `MCSymbol*` just because their scopes differ (local vs. global)? What's wrong with `MCSymbol` with respect to scoping? Also, I don't quite get the "inevitably trip over issues in more obscure cases" part; I'm afraid that statement is a bit vague. Could you be more specific about what those cases may be? And since most of the other backends are already doing this, do you think they are also doing something unsustainable?

Contributor

@efriedma-quic the absence of a fix blocks our pulldown, as 50% of tests are failing due to the reverted #133291 (comment) change.

I agree that there might be a better, more "bulletproof" solution, but is it OK to move forward with Weiwei's fix first and then consider a better solution? My rationale is that Weiwei's proposed fix is aligned with what most of the targets are doing now, and there was also interest in the community in open-sourcing the parallel MCLinker. So when it comes to open-sourcing, there will be more use cases and we will come back to this topic again.

Collaborator

> this change has already been applied to most of other backends

The other backends work in your model by accident. They're not intentionally using one context or the other, they're just not using MachineFunction::getContext() at all in the AsmPrinter.

> Could you be more specific on what cases that may be?
>
> Do you have any suggestion on which part is confusing and what can be added to make it less so? 🙏

Basically, if MF.getContext().getOrCreateSymbol() means something different from AsmPrinter.OutContext.getOrCreateSymbol(), it's a lot harder to understand what's going on. You have two APIs that look the same on the surface, and return a value of the same type, but actually mean something subtly different. Anything that creates an MCSymbol would need to be aware that both APIs exist, and pick the correct one. Some existing code probably isn't consistent with your model, and anyone writing new code gets zero guidance from the API names or type system to help pick the correct one.

And the "local" one is never really right: MCSymbols passed to an MCStreamer are supposed to be part of the same MCContext as the MCStreamer.
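
The invariant stated above can be sketched as an explicit ownership check. This is a hypothetical model: `ToySymbol` carries an `Owner` back-pointer purely for illustration, and nothing here claims that the real `MCSymbol` or `MCStreamer` stores or checks such a pointer.

```cpp
#include <cassert>
#include <string>

struct ToyContext; // forward declaration

// Toy symbol that remembers the context that created it.
struct ToySymbol {
  std::string Name;
  const ToyContext *Owner;
};

struct ToyContext {
  ToySymbol makeSymbol(const std::string &Name) const {
    return ToySymbol{Name, this};
  }
};

// Toy streamer bound to one context; accepts only that context's symbols.
class ToyStreamer {
  const ToyContext &Ctx;

public:
  explicit ToyStreamer(const ToyContext &C) : Ctx(C) {}
  bool canEmit(const ToySymbol &Sym) const { return Sym.Owner == &Ctx; }
};
```

With a check like `canEmit`, a symbol created in the function's context would be rejected by a streamer bound to the output context instead of being silently accepted.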

To make things consistent, there needs to be one correct way to construct an MCSymbol. Either we need a "MachineFunctionSymbol" type to represent symbols that haven't been emitted yet, or you need to stop sharing MCStreamers between different compilation units. This patch isn't making any progress towards either of those models, or anything similar.
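
One way to read the "MachineFunctionSymbol" suggestion is as a separate type for symbols that have not been emitted yet. The sketch below is an assumption about what such a split could look like; `MachineFunctionSymbol` is the name floated in the comment, while `EmittedSymbol`, `OutputContext`, `materialize`, and `emitLabel` are invented for illustration and do not exist in LLVM.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical handle for a symbol that has not been emitted yet.
// It carries only a name and belongs to no output context.
struct MachineFunctionSymbol {
  std::string Name;
};

// Toy emitted symbol, meant to be obtained through an output context.
struct EmittedSymbol {
  std::string Name;
};

struct OutputContext {
  std::set<std::string> Names;

  // The single place where pending symbols become emitted symbols.
  EmittedSymbol materialize(const MachineFunctionSymbol &Pending) {
    Names.insert(Pending.Name);
    return EmittedSymbol{Pending.Name};
  }
};

// Accepts only emitted symbols; passing a MachineFunctionSymbol object
// here fails to compile instead of becoming a wrong-context bug.
std::string emitLabel(const EmittedSymbol &Sym) { return Sym.Name + ":"; }
```

The point of the split is that the type system, rather than convention, tells the caller which kind of symbol is in hand.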


I don't think this patch will cause any immediate issues, because it's basically a no-op if there's only one MCContext, but this isn't the right long-term solution. And I don't want to commit to merging an unbounded number of temporary hacks while you work out how to implement the right solution.

+  }

   // If the target flags on the operand changes the name of the symbol, do that
   // before we return the symbol.
1 change: 1 addition & 0 deletions llvm/unittests/CodeGen/CMakeLists.txt
@@ -47,6 +47,7 @@ add_llvm_unittest(CodeGenTests
   TargetOptionsTest.cpp
   TestAsmPrinter.cpp
   MLRegAllocDevelopmentFeatures.cpp
+  X86MCInstLowerTest.cpp
   )

 add_subdirectory(GlobalISel)
180 changes: 180 additions & 0 deletions llvm/unittests/CodeGen/X86MCInstLowerTest.cpp
@@ -0,0 +1,180 @@
//===- llvm/unittest/CodeGen/AArch64SelectionDAGTest.cpp
Collaborator

File name is incorrect

Contributor Author

yes, oops, fixed!

//-------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "../lib/Target/X86/X86ISelLowering.h"
Collaborator

Is the header used?

Contributor Author

nope, removed

#include "TestAsmPrinter.h"
#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/AsmParser/Parser.h"
#include "llvm/CodeGen/AsmPrinter.h"
#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/CodeGen/SelectionDAG.h"
Collaborator

I doubt this test uses anything from SelectionDAG.h

#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/MC/MCStreamer.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Support/KnownBits.h"
Collaborator

I doubt this file uses KnownBits.h

Contributor Author

you are right, removed

#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Target/TargetLoweringObjectFile.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Testing/Support/Error.h"
#include "gmock/gmock.h"
#include "gtest/gtest.h"

namespace llvm {

class X86MCInstLowerTest : public testing::Test {
protected:
static void SetUpTestCase() {
LLVMInitializeX86TargetInfo();
Collaborator

If the X86 target isn't built, can you still call these functions or do they not exist?

Contributor Author

Looks like compilation will fail if I'm not building the X86 backend. Changing this to:

    InitializeAllTargetMCs();
    InitializeAllTargetInfos();
    InitializeAllTargets();
    InitializeAllAsmPrinters();

Tested with only the AArch64 backend built; this test gets skipped:

[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from X86MCInstLowerTest
[ RUN      ] X86MCInstLowerTest.moExternalSymbol_MCSYMBOL
/Users/weiwei.chen/research/modularml/modular/third-party/llvm-project/llvm/unittests/CodeGen/X86MCInstLowerTest.cpp:107: Skipped


[  SKIPPED ] X86MCInstLowerTest.moExternalSymbol_MCSYMBOL (1 ms)
[----------] 1 test from X86MCInstLowerTest (1 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (2 ms total)
[  PASSED  ] 0 tests.
[  SKIPPED ] 1 test, listed below:
[  SKIPPED ] X86MCInstLowerTest.moExternalSymbol_MCSYMBOL
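
The skip behaviour shown in the log can be sketched with a toy registry. This is a simplified model under stated assumptions: `ToyRegistry`, `ToyTarget`, `Outcome`, and `runIfAvailable` are hypothetical names, and the early return stands in for the `GTEST_SKIP()` that the test issues when the target lookup returns null.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy registry of built backends; stands in for a target registry.
struct ToyTarget {
  std::string Name;
};

class ToyRegistry {
  std::map<std::string, ToyTarget> Targets;

public:
  void registerTarget(const std::string &Name) {
    Targets[Name] = ToyTarget{Name};
  }

  // Returns nullptr when no backend was registered under Name.
  const ToyTarget *lookup(const std::string &Name) const {
    auto It = Targets.find(Name);
    return It == Targets.end() ? nullptr : &It->second;
  }
};

enum class Outcome { Ran, Skipped };

// Skips instead of failing when the requested backend is absent.
Outcome runIfAvailable(const ToyRegistry &Registry, const std::string &Name) {
  if (!Registry.lookup(Name))
    return Outcome::Skipped;
  return Outcome::Ran;
}
```

Registering every available backend and checking the lookup at run time keeps the test compilable in builds without X86, at the cost of the test silently doing nothing there.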

LLVMInitializeX86TargetMC();
LLVMInitializeX86Target();
LLVMInitializeX86AsmPrinter();
}

// Function to setup codegen pipeline and returns the AsmPrinter.
AsmPrinter *addPassesToEmitFile(llvm::legacy::PassManagerBase &PM,
llvm::raw_pwrite_stream &Out,
llvm::CodeGenFileType FileType,
llvm::MachineModuleInfoWrapperPass *MMIWP) {
TargetPassConfig *PassConfig = TM->createPassConfig(PM);

PassConfig->setDisableVerify(true);
PM.add(PassConfig);
PM.add(MMIWP);

if (PassConfig->addISelPasses())
return nullptr;

PassConfig->addMachinePasses();
PassConfig->setInitialized();

MC.reset(new MCContext(TM->getTargetTriple(), TM->getMCAsmInfo(),
TM->getMCRegisterInfo(), TM->getMCSubtargetInfo()));
MC->setObjectFileInfo(TM->getObjFileLowering());
TM->getObjFileLowering()->Initialize(*MC, *TM);
MC->setObjectFileInfo(TM->getObjFileLowering());

// Use a new MCContext for AsmPrinter for testing.
// AsmPrinter.OutContext will be different from
// MachineFunction's MCContext in MMIWP.
Expected<std::unique_ptr<MCStreamer>> MCStreamerOrErr =
TM->createMCStreamer(Out, nullptr, FileType, *MC);

if (auto Err = MCStreamerOrErr.takeError())
return nullptr;

AsmPrinter *Printer =
TM->getTarget().createAsmPrinter(*TM, std::move(*MCStreamerOrErr));

if (!Printer)
return nullptr;

PM.add(Printer);

return Printer;
}

void SetUp() override {
// Module to compile.
const char *FooStr = R""""(
@G = external global i32
define i32 @foo() {
%1 = load i32, i32* @G; load the global variable
%2 = call i32 @f()
%3 = mul i32 %1, %2
ret i32 %3
}
declare i32 @f() #0
)"""";
StringRef AssemblyF(FooStr);

// Get target triple for X86_64
Triple TargetTriple("x86_64--");
std::string Error;
const Target *T = TargetRegistry::lookupTarget("", TargetTriple, Error);
if (!T)
GTEST_SKIP();

// Get TargetMachine.
// Use Reloc::Model::PIC_ and CodeModel::Model::Large
// to get GOT during codegen as MO_ExternalSymbol.
TargetOptions Options;
TM = std::unique_ptr<TargetMachine>(T->createTargetMachine(
TargetTriple, "", "", Options, Reloc::Model::PIC_,
CodeModel::Model::Large, CodeGenOptLevel::Default));
if (!TM)
GTEST_SKIP();

SMDiagnostic SMError;

// Parse the module.
M = parseAssemblyString(AssemblyF, SMError, Context);
if (!M)
report_fatal_error(SMError.getMessage());
M->setDataLayout(TM->createDataLayout());

// Get llvm::Function from M
Foo = M->getFunction("foo");
if (!Foo)
report_fatal_error("foo?");

// Prepare the MCContext for codegen M.
// MachineFunction for Foo will have this MCContext.
MCFoo.reset(new MCContext(TargetTriple, TM->getMCAsmInfo(),
TM->getMCRegisterInfo(),
TM->getMCSubtargetInfo()));
MCFoo->setObjectFileInfo(TM->getObjFileLowering());
TM->getObjFileLowering()->Initialize(*MCFoo, *TM);
MCFoo->setObjectFileInfo(TM->getObjFileLowering());
}

LLVMContext Context;
std::unique_ptr<TargetMachine> TM;
std::unique_ptr<Module> M;

std::unique_ptr<MCContext> MC;
std::unique_ptr<MCContext> MCFoo;

Function *Foo;
std::unique_ptr<MachineFunction> MFFoo;
};

TEST_F(X86MCInstLowerTest, moExternalSymbol_MCSYMBOL) {

MachineModuleInfoWrapperPass *MMIWP =
new MachineModuleInfoWrapperPass(TM.get(), &*MCFoo);

legacy::PassManager PassMgrF;
SmallString<1024> Buf;
llvm::raw_svector_ostream OS(Buf);
AsmPrinter *Printer =
addPassesToEmitFile(PassMgrF, OS, CodeGenFileType::AssemblyFile, MMIWP);
PassMgrF.run(*M);

// Check GOT MCSymbol is from Printer.OutContext.
MCSymbol *GOTPrinterPtr =
Printer->OutContext.lookupSymbol("_GLOBAL_OFFSET_TABLE_");

// Check GOT MCSymbol is NOT from MachineFunction's MCContext.
MCSymbol *GOTMFCtxPtr =
MMIWP->getMMI().getMachineFunction(*Foo)->getContext().lookupSymbol(
"_GLOBAL_OFFSET_TABLE_");

EXPECT_NE(GOTPrinterPtr, nullptr);
EXPECT_EQ(GOTMFCtxPtr, nullptr);
}

} // end namespace llvm