-
Notifications
You must be signed in to change notification settings - Fork 15.2k
Put large common blocks into .lbss for the medium and large code models #161483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
For the medium and large code models on x86-64, the bss is split into two parts, .bss for regular sized data, and .lbss for data larger than that specified by -large-data-threshold (for the medium code model). This change tries to mimick what gcc does for the medium and large code models for Fortran COMMON blocks: program test implicit integer (i-n) implicit real*8 (a-h, o-z) parameter (n1=77777, n2=77777) common v2(n1,n2) common /ccc/ v6 v6 = 1 v2(6777,6777) = 2 write (*,*) v6, v2(6777, 6777) end program test Currently, we generate: 0000000000000008 O *COM* 0000000000000008 ccc_ 0000000b44834508 O *COM* 0000000000000008 __BLNK and ld and lld both fail to link with -mcmodel=medium. /usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax or ld.lld: error: /usr/lib/gcc/x86_64-redhat-linux/11/crtbeginS.o:(.text+0x76): relocation R_X86_64_PC32 out of range: 48394103061 is not in [-2147483648, 2147483647]; references section '.bss' With this change, __BLNK is marked as LARGE_COMMON (SHN_AMD64_LCOMMON), and ends up in .lbss. *.o: Sections: ... 1 .lbss 00000000 0000000000000000 0000000000000000 000000fb 2**0 ALLOC ... SYMBOL TABLE: 0000000000000008 O *COM* 0000000000000008 ccc_ 0000000b44834508 O LARGE_COMMON 0000000000000008 __BLNK__ a.out: Sections: ... 25 .bss 00000009 0000000000003cf0 0000000000003cf0 00000cf0 2**3 ALLOC 26 .lbss b44834508 0000000000003d00 0000000000003d00 00000cf0 2**3 ALLOC SYMBOL TABLE: 0000000000003cf0 g O .bss 0000000000000008 ccc_ 0000000000003d00 g O .lbss 0000000b44834508 __BLNK__ In the assembly, this will appear as: .section .lbss,"awl",@nobits .type __BLNK__,@object .largecomm __BLNK__,48394093832,8 .type ccc_,@object .comm ccc_,8,8 For the large code model, all common blocks will end up in lbss. Previously, the above example wouldn't link statically with GNU ld. The change also includes support for parsing the largecomm directive. There will be a PR for the chages to lld to handle the SHN_AMD64_LCOMMON section.
|
||
// Place the large common variables in the .lbss section for the medium and | ||
// large code models on x86_64. (AMD64 ABI, 9.2.5 COMMON blocks) | ||
MCSection *largeCommonSec = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LargeCommonSec
return MCAsmParserExtension::parseDirectiveCGProfile(S, Loc); | ||
} | ||
|
||
bool ELFAsmParser::parseDirectiveLargecomm(StringRef s, SMLoc Loc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StringRef S
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-mc Author: None (ssijaric-nv) ChangesFor the medium and large code models on x86-64, the bss is split into two parts, program test Currently, we generate: 0000000000000008 O COM 0000000000000008 ccc_ and ld and lld both fail to link with -mcmodel=medium. /usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax or ld.lld: error: /usr/lib/gcc/x86_64-redhat-linux/11/crtbeginS.o:(.text+0x76): relocation R_X86_64_PC32 out of range: 48394103061 is not in [-2147483648, 2147483647]; references section '.bss' With this change, __BLNK is marked as LARGE_COMMON (SHN_AMD64_LCOMMON), *.o: a.out: SYMBOL TABLE: In the assembly, this will appear as: .section .lbss,"awl",@nobits For the large code model, all common blocks will end up in lbss. Previously, the above example wouldn't link statically with GNU ld. The change also includes support for parsing the largecomm directive. Full diff: https://github.com/llvm/llvm-project/pull/161483.diff 16 Files Affected:
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index e619b186dfe3d..ca6ec30e1f95e 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -589,6 +589,11 @@ enum {
SHN_MIPS_SUNDEFINED = 0xff04 // Undefined symbols for global data area
};
+// x86-64 speciifc section index
+enum {
+ SHN_AMD64_LCOMMON = 0xff02, // Large FORTRAN COMMON variables
+};
+
// ELF Relocation types for Mips
enum {
#include "ELFRelocs/Mips.def"
diff --git a/llvm/include/llvm/MC/MCELFStreamer.h b/llvm/include/llvm/MC/MCELFStreamer.h
index 144f6bc3bd91c..1b6e4273b1e28 100644
--- a/llvm/include/llvm/MC/MCELFStreamer.h
+++ b/llvm/include/llvm/MC/MCELFStreamer.h
@@ -56,6 +56,8 @@ class MCELFStreamer : public MCObjectStreamer {
bool emitSymbolAttribute(MCSymbol *Symbol, MCSymbolAttr Attribute) override;
void emitCommonSymbol(MCSymbol *Symbol, uint64_t Size,
Align ByteAlignment) override;
+ void emitLargeCommonSymbol(MCSymbol *Symbol, uint64_t Size,
+ Align ByteAlignment) override;
void emitELFSize(MCSymbol *Symbol, const MCExpr *Value) override;
void emitELFSymverDirective(const MCSymbol *OriginalSym, StringRef Name,
diff --git a/llvm/include/llvm/MC/MCStreamer.h b/llvm/include/llvm/MC/MCStreamer.h
index 79c715e3820a6..194cc799d5650 100644
--- a/llvm/include/llvm/MC/MCStreamer.h
+++ b/llvm/include/llvm/MC/MCStreamer.h
@@ -677,6 +677,14 @@ class LLVM_ABI MCStreamer {
virtual void emitCommonSymbol(MCSymbol *Symbol, uint64_t Size,
Align ByteAlignment) = 0;
+ /// Emit a large common (.largecomm) symbol.
+ ///
+ /// \param Symbol - The common symbol to emit.
+ /// \param Size - The size of the common symbol.
+ /// \param ByteAlignment - The alignment of the common symbol in bytes.
+ virtual void emitLargeCommonSymbol(MCSymbol *Symbol, uint64_t Size,
+ Align ByteAlignment);
+
/// Emit a local common (.lcomm) symbol.
///
/// \param Symbol - The common symbol to emit.
diff --git a/llvm/include/llvm/MC/MCSymbolELF.h b/llvm/include/llvm/MC/MCSymbolELF.h
index 1af175a18e8ec..e2377eff688c6 100644
--- a/llvm/include/llvm/MC/MCSymbolELF.h
+++ b/llvm/include/llvm/MC/MCSymbolELF.h
@@ -17,6 +17,7 @@ class MCSymbolELF : public MCSymbol {
/// An expression describing how to calculate the size of a symbol. If a
/// symbol has no size this field will be NULL.
const MCExpr *SymbolSize = nullptr;
+ bool IsLargeCommon = false;
public:
MCSymbolELF(const MCSymbolTableEntry *Name, bool isTemporary)
@@ -48,6 +49,9 @@ class MCSymbolELF : public MCSymbol {
void setMemtag(bool Tagged);
bool isMemtag() const;
+ bool isLargeCommon() const { return IsLargeCommon; }
+ void setIsLargeCommon(bool V) { IsLargeCommon = V; }
+
private:
void setIsBindingSet() const;
};
diff --git a/llvm/include/llvm/Object/ELFTypes.h b/llvm/include/llvm/Object/ELFTypes.h
index 5a26e2fc31458..ed4695dccb6a8 100644
--- a/llvm/include/llvm/Object/ELFTypes.h
+++ b/llvm/include/llvm/Object/ELFTypes.h
@@ -252,6 +252,8 @@ struct Elf_Sym_Impl : Elf_Sym_Base<ELFT> {
return getType() == ELF::STT_COMMON || st_shndx == ELF::SHN_COMMON;
}
+ bool isLargeCommon() const { return st_shndx == ELF::SHN_AMD64_LCOMMON; }
+
bool isDefined() const { return !isUndefined(); }
bool isProcessorSpecific() const {
diff --git a/llvm/include/llvm/Target/TargetLoweringObjectFile.h b/llvm/include/llvm/Target/TargetLoweringObjectFile.h
index 4d6cbc5540131..d1c16a3f4578f 100644
--- a/llvm/include/llvm/Target/TargetLoweringObjectFile.h
+++ b/llvm/include/llvm/Target/TargetLoweringObjectFile.h
@@ -305,6 +305,12 @@ class LLVM_ABI TargetLoweringObjectFile : public MCObjectFileInfo {
return nullptr;
}
+ virtual MCSection *LargeSectionForCommon(const GlobalObject *GO,
+ SectionKind Kind,
+ const TargetMachine &TM) const {
+ return nullptr;
+ }
+
protected:
virtual MCSection *SelectSectionForGlobal(const GlobalObject *GO,
SectionKind Kind,
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 701a6a2f0f7a0..9240e18e12443 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -782,11 +782,18 @@ void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) {
OutContext.reportError(SMLoc(), "symbol '" + Twine(GVSym->getName()) +
"' is already defined");
+ SectionKind GVKind = TargetLoweringObjectFile::getKindForGlobal(GV, TM);
+
+ // Place the large common variables in the .lbss section for the medium and
+ // large code models on x86_64. (AMD64 ABI, 9.2.5 COMMON blocks)
+ MCSection *LargeCommonSec =
+ getObjFileLowering().LargeSectionForCommon(GV, GVKind, TM);
+ if (LargeCommonSec)
+ OutStreamer->switchSection(LargeCommonSec);
+
if (MAI->hasDotTypeDotSizeDirective())
OutStreamer->emitSymbolAttribute(EmittedSym, MCSA_ELF_TypeObject);
- SectionKind GVKind = TargetLoweringObjectFile::getKindForGlobal(GV, TM);
-
const DataLayout &DL = GV->getDataLayout();
uint64_t Size = DL.getTypeAllocSize(GV->getValueType());
@@ -802,10 +809,12 @@ void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) {
if (GVKind.isCommon()) {
if (Size == 0) Size = 1; // .comm Foo, 0 is undefined, avoid it.
// .comm _foo, 42, 4
- OutStreamer->emitCommonSymbol(GVSym, Size, Alignment);
+ if (LargeCommonSec)
+ OutStreamer->emitLargeCommonSymbol(GVSym, Size, Alignment);
+ else
+ OutStreamer->emitCommonSymbol(GVSym, Size, Alignment);
return;
}
-
// Determine to which section this global should be emitted.
MCSection *TheSection = getObjFileLowering().SectionForGlobal(GV, GVKind, TM);
diff --git a/llvm/lib/MC/ELFObjectWriter.cpp b/llvm/lib/MC/ELFObjectWriter.cpp
index 759d3e0e14291..aa19e92ad74d3 100644
--- a/llvm/lib/MC/ELFObjectWriter.cpp
+++ b/llvm/lib/MC/ELFObjectWriter.cpp
@@ -27,6 +27,7 @@
#include "llvm/MC/MCELFObjectWriter.h"
#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCFixup.h"
+#include "llvm/MC/MCObjectFileInfo.h"
#include "llvm/MC/MCObjectWriter.h"
#include "llvm/MC/MCSection.h"
#include "llvm/MC/MCSectionELF.h"
@@ -544,7 +545,10 @@ void ELFWriter::computeSymbolTable(const RevGroupMapTy &RevGroupMap) {
auto Shndx = Symbol.getIndex();
if (!Shndx) {
assert(!Local);
- Shndx = ELF::SHN_COMMON;
+ if (Symbol.isLargeCommon())
+ Shndx = ELF::SHN_AMD64_LCOMMON;
+ else
+ Shndx = ELF::SHN_COMMON;
}
MSD.SectionIndex = Shndx;
} else if (Symbol.isUndefined()) {
diff --git a/llvm/lib/MC/MCAsmStreamer.cpp b/llvm/lib/MC/MCAsmStreamer.cpp
index be8c022f39ad1..903ddcc8cdd39 100644
--- a/llvm/lib/MC/MCAsmStreamer.cpp
+++ b/llvm/lib/MC/MCAsmStreamer.cpp
@@ -228,7 +228,8 @@ class MCAsmStreamer final : public MCStreamer {
void emitELFSize(MCSymbol *Symbol, const MCExpr *Value) override;
void emitCommonSymbol(MCSymbol *Symbol, uint64_t Size,
Align ByteAlignment) override;
-
+ void emitLargeCommonSymbol(MCSymbol *Symbol, uint64_t Size,
+ Align ByteAlignment) override;
/// Emit a local common (.lcomm) symbol.
///
/// @param Symbol - The common symbol to emit.
@@ -1075,6 +1076,19 @@ void MCAsmStreamer::emitCommonSymbol(MCSymbol *Symbol, uint64_t Size,
}
}
+void MCAsmStreamer::emitLargeCommonSymbol(MCSymbol *Symbol, uint64_t Size,
+ Align ByteAlignment) {
+ OS << "\t.largecomm\t";
+ Symbol->print(OS, MAI);
+ OS << ',' << Size;
+
+ if (MAI->getCOMMDirectiveAlignmentIsInBytes())
+ OS << ',' << ByteAlignment.value();
+ else
+ OS << ',' << Log2(ByteAlignment);
+ EmitEOL();
+}
+
void MCAsmStreamer::emitLocalCommonSymbol(MCSymbol *Symbol, uint64_t Size,
Align ByteAlign) {
OS << "\t.lcomm\t";
diff --git a/llvm/lib/MC/MCELFStreamer.cpp b/llvm/lib/MC/MCELFStreamer.cpp
index 2881d7cfab4ba..8314fe4d2f14c 100644
--- a/llvm/lib/MC/MCELFStreamer.cpp
+++ b/llvm/lib/MC/MCELFStreamer.cpp
@@ -244,6 +244,23 @@ bool MCELFStreamer::emitSymbolAttribute(MCSymbol *S, MCSymbolAttr Attribute) {
return true;
}
+void MCELFStreamer::emitLargeCommonSymbol(MCSymbol *S, uint64_t Size,
+ Align ByteAlignment) {
+ auto *Symbol = static_cast<MCSymbolELF *>(S);
+ getAssembler().registerSymbol(*Symbol);
+
+ if (!Symbol->isBindingSet())
+ Symbol->setBinding(ELF::STB_GLOBAL);
+
+ Symbol->setType(ELF::STT_OBJECT);
+
+ if (Symbol->declareCommon(Size, ByteAlignment))
+ report_fatal_error(Twine("Symbol: ") + Symbol->getName() +
+ " redeclared as different type");
+ Symbol->setIsLargeCommon(true);
+ Symbol->setSize(MCConstantExpr::create(Size, getContext()));
+}
+
void MCELFStreamer::emitCommonSymbol(MCSymbol *S, uint64_t Size,
Align ByteAlignment) {
auto *Symbol = static_cast<MCSymbolELF *>(S);
diff --git a/llvm/lib/MC/MCParser/ELFAsmParser.cpp b/llvm/lib/MC/MCParser/ELFAsmParser.cpp
index 6195355626fd5..237b430e3d7b4 100644
--- a/llvm/lib/MC/MCParser/ELFAsmParser.cpp
+++ b/llvm/lib/MC/MCParser/ELFAsmParser.cpp
@@ -77,6 +77,7 @@ class ELFAsmParser : public MCAsmParserExtension {
&ELFAsmParser::parseDirectiveSymbolAttribute>(".hidden");
addDirectiveHandler<&ELFAsmParser::parseDirectiveSubsection>(".subsection");
addDirectiveHandler<&ELFAsmParser::parseDirectiveCGProfile>(".cg_profile");
+ addDirectiveHandler<&ELFAsmParser::parseDirectiveLargecomm>(".largecomm");
}
// FIXME: Part of this logic is duplicated in the MCELFStreamer. What is
@@ -126,6 +127,7 @@ class ELFAsmParser : public MCAsmParserExtension {
bool parseDirectiveSymbolAttribute(StringRef, SMLoc);
bool parseDirectiveSubsection(StringRef, SMLoc);
bool parseDirectiveCGProfile(StringRef, SMLoc);
+ bool parseDirectiveLargecomm(StringRef, SMLoc);
private:
bool parseSectionName(StringRef &SectionName);
@@ -886,6 +888,54 @@ bool ELFAsmParser::parseDirectiveCGProfile(StringRef S, SMLoc Loc) {
return MCAsmParserExtension::parseDirectiveCGProfile(S, Loc);
}
+bool ELFAsmParser::parseDirectiveLargecomm(StringRef, SMLoc Loc) {
+ if (getParser().checkForValidSection())
+ return true;
+
+ MCSymbol *Sym;
+ if (getParser().parseSymbol(Sym))
+ return TokError("expected identifier in directive");
+
+ if (getLexer().isNot(AsmToken::Comma))
+ return TokError("expected a comma");
+ Lex();
+
+ int64_t Size;
+ SMLoc SizeLoc = getLexer().getLoc();
+ if (getParser().parseAbsoluteExpression(Size))
+ return true;
+
+ int64_t Pow2Alignment = 0;
+ SMLoc Pow2AlignmentLoc;
+ if (getLexer().is(AsmToken::Comma)) {
+ Lex();
+ Pow2AlignmentLoc = getLexer().getLoc();
+ if (getParser().parseAbsoluteExpression(Pow2Alignment))
+ return true;
+
+ // If this target takes alignments in bytes (not log) validate and convert.
+ if (getLexer().getMAI().getCOMMDirectiveAlignmentIsInBytes()) {
+ if (!isPowerOf2_64(Pow2Alignment))
+ return Error(Pow2AlignmentLoc, "alignment must be a power of 2");
+ Pow2Alignment = Log2_64(Pow2Alignment);
+ }
+ }
+
+ if (parseEOL())
+ return true;
+
+ if (Size < 0)
+ return Error(SizeLoc, "size must be non-negative");
+
+ Sym->redefineIfPossible();
+ if (!Sym->isUndefined())
+ return Error(Loc, "invalid symbol redefinition");
+
+ getStreamer().emitLargeCommonSymbol(Sym, Size, Align(1ULL << Pow2Alignment));
+
+ return false;
+}
+
namespace llvm {
MCAsmParserExtension *createELFAsmParser() {
diff --git a/llvm/lib/MC/MCStreamer.cpp b/llvm/lib/MC/MCStreamer.cpp
index bc7398120096e..f4f7d78ec557f 100644
--- a/llvm/lib/MC/MCStreamer.cpp
+++ b/llvm/lib/MC/MCStreamer.cpp
@@ -1309,6 +1309,8 @@ void MCStreamer::emitELFSymverDirective(const MCSymbol *OriginalSym,
StringRef Name, bool KeepOriginalSym) {}
void MCStreamer::emitLocalCommonSymbol(MCSymbol *Symbol, uint64_t Size,
Align ByteAlignment) {}
+void MCStreamer::emitLargeCommonSymbol(MCSymbol *Symbol, uint64_t Size,
+ Align ByteAlignment) {}
void MCStreamer::emitZerofill(MCSection *, MCSymbol *, uint64_t, Align, SMLoc) {
}
void MCStreamer::emitTBSSSymbol(MCSection *Section, MCSymbol *Symbol,
diff --git a/llvm/lib/Target/X86/X86TargetObjectFile.cpp b/llvm/lib/Target/X86/X86TargetObjectFile.cpp
index fc3c06fcc1c9b..60e6f090652de 100644
--- a/llvm/lib/Target/X86/X86TargetObjectFile.cpp
+++ b/llvm/lib/Target/X86/X86TargetObjectFile.cpp
@@ -8,8 +8,11 @@
#include "X86TargetObjectFile.h"
#include "MCTargetDesc/X86MCAsmInfo.h"
+#include "X86TargetMachine.h"
#include "llvm/BinaryFormat/Dwarf.h"
+#include "llvm/BinaryFormat/ELF.h"
#include "llvm/MC/MCExpr.h"
+#include "llvm/MC/MCSectionELF.h"
#include "llvm/MC/MCValue.h"
#include "llvm/Target/TargetMachine.h"
@@ -71,3 +74,13 @@ const MCExpr *X86_64ELFTargetObjectFile::getIndirectSymViaGOTPCRel(
const MCExpr *Off = MCConstantExpr::create(FinalOffset, getContext());
return MCBinaryExpr::createAdd(Res, Off, getContext());
}
+
+MCSection *X86_64ELFTargetObjectFile::LargeSectionForCommon(
+ const GlobalObject *GV, SectionKind Kind, const TargetMachine &TM) const {
+ auto &X86_TM = static_cast<const X86TargetMachine &>(TM);
+ if (GV && Kind.isCommon() && TM.isLargeGlobalValue(GV) && !X86_TM.isJIT()) {
+ unsigned Flags = ELF::SHF_ALLOC | ELF::SHF_WRITE | ELF::SHF_X86_64_LARGE;
+ return getContext().getELFSection(".lbss", ELF::SHT_NOBITS, Flags);
+ }
+ return nullptr;
+}
diff --git a/llvm/lib/Target/X86/X86TargetObjectFile.h b/llvm/lib/Target/X86/X86TargetObjectFile.h
index cb096f4a9febb..4fc944302fcf6 100644
--- a/llvm/lib/Target/X86/X86TargetObjectFile.h
+++ b/llvm/lib/Target/X86/X86TargetObjectFile.h
@@ -56,6 +56,9 @@ namespace llvm {
const MCValue &MV, int64_t Offset,
MachineModuleInfo *MMI,
MCStreamer &Streamer) const override;
+
+ MCSection *LargeSectionForCommon(const GlobalObject *GO, SectionKind Kind,
+ const TargetMachine &TM) const override;
};
} // end namespace llvm
diff --git a/llvm/test/MC/X86/largecomm.ll b/llvm/test/MC/X86/largecomm.ll
new file mode 100644
index 0000000000000..52d51d562c47e
--- /dev/null
+++ b/llvm/test/MC/X86/largecomm.ll
@@ -0,0 +1,58 @@
+; RUN: llc -filetype=asm -code-model=medium %s --large-data-threshold=65636 -o - | FileCheck %s --check-prefix=CHECKASM-MEDIUM
+; RUN: llc -filetype=asm -code-model=large %s -o - | FileCheck %s --check-prefix=CHECKASM-LARGE
+; RUN: llc -filetype=asm -code-model=medium %s --large-data-threshold=65636 -o - | llvm-mc -triple x86_64-linux-gnu -filetype=obj - | llvm-readelf -s - | FileCheck %s --check-prefix=CHECKOBJ-MEDIUM
+; RUN: llc -filetype=asm -code-model=large %s -o - | llvm-mc -triple x86_64-linux-gnu -filetype=obj - | llvm-readelf -s - | FileCheck %s --check-prefix=CHECKOBJ-LARGE
+
+; CHECKASM-MEDIUM: .section .lbss,"awl",@nobits
+; CHECKASM-MEDIUM-NEXT: .type __BLNK__,@object # @__BLNK__
+; CHECKASM-MEDIUM-NEXT: .largecomm __BLNK__,48394093832,8
+; CHECKASM-MEDIUM-NEXT: .type ccc_,@object # @ccc_
+; CHECKASM-MEDIUM-NEXT: .comm ccc_,8,8
+
+; CHECKASM-LARGE: .section .lbss,"awl",@nobits
+; CHECKASM-LARGE-NEXT: .type __BLNK__,@object # @__BLNK__
+; CHECKASM-LARGE-NEXT: .largecomm __BLNK__,48394093832,8
+; CHECKASM-LARGE-NEXT: .type ccc_,@object # @ccc_
+; CHECKASM-LARGE-NEXT: .largecomm ccc_,8,8
+
+; CHECKOBJ-MEDIUM: 8 OBJECT GLOBAL DEFAULT COM ccc_
+; CHECKOBJ-MEDIUM: 48394093832 OBJECT GLOBAL DEFAULT LARGE_COMMON __BLNK
+
+; CHECKOBJ-LARGE: 8 OBJECT GLOBAL DEFAULT LARGE_COMMON ccc_
+; CHECKOBJ-LARGE: 48394093832 OBJECT GLOBAL DEFAULT LARGE_COMMON __BLNK__
+
+source_filename = "FIRModule"
+target triple = "x86_64-unknown-linux-gnu"
+
+@__BLNK__ = common global [48394093832 x i8] zeroinitializer, align 8
+@ccc_ = common global [8 x i8] zeroinitializer, align 8
+@_QFECn1 = internal constant i32 77777
+@_QFECn2 = internal constant i32 77777
+
+define void @_QQmain() #0 {
+ store double 1.000000e+00, ptr @ccc_, align 8
+ store double 2.000000e+00, ptr getelementptr inbounds nuw (i8, ptr @__BLNK__, i64 61600176), align 8
+ ret void
+}
+
+declare void @_FortranAProgramStart(i32, ptr, ptr, ptr) #1
+
+declare void @_FortranAProgramEndStatement() #1
+
+define i32 @main(i32 %0, ptr %1, ptr %2) #0 {
+ call void @_FortranAProgramStart(i32 %0, ptr %1, ptr %2, ptr null)
+ call void @_QQmain()
+ call void @_FortranAProgramEndStatement()
+ ret i32 0
+}
+
+attributes #0 = { "frame-pointer"="all" "target-cpu"="x86-64" }
+attributes #1 = { "frame-pointer"="all" }
+
+!llvm.ident = !{!0}
+!llvm.module.flags = !{!1, !2, !3}
+
+!0 = !{!"flang version 22.0.0 (https://github.com/llvm/llvm-project.git e1afe25356b8d2ee14f5f88bdb6c2a1526ed14ef)"}
+!1 = !{i32 2, !"Debug Info Version", i32 3}
+!2 = !{i32 8, !"PIC Level", i32 2}
+!3 = !{i32 7, !"PIE Level", i32 2}
diff --git a/llvm/tools/llvm-readobj/ELFDumper.cpp b/llvm/tools/llvm-readobj/ELFDumper.cpp
index ab93316907cc6..7d34727f07f6d 100644
--- a/llvm/tools/llvm-readobj/ELFDumper.cpp
+++ b/llvm/tools/llvm-readobj/ELFDumper.cpp
@@ -1052,6 +1052,8 @@ ELFDumper<ELFT>::getSymbolSectionIndex(const Elf_Sym &Symbol, unsigned SymIndex,
return CreateErr("SHN_ABS");
if (Ndx == ELF::SHN_COMMON)
return CreateErr("SHN_COMMON");
+ if (Ndx == ELF::SHN_AMD64_LCOMMON)
+ return CreateErr("SHN_AMD64_LCOMMON");
return CreateErr("SHN_LORESERVE", Ndx - SHN_LORESERVE);
}
@@ -4308,6 +4310,9 @@ std::string GNUELFDumper<ELFT>::getSymbolSectionNdx(
default:
// Find if:
// Processor specific
+ if (this->Obj.getHeader().e_machine == EM_X86_64 &&
+ SectionIndex == ELF::SHN_AMD64_LCOMMON)
+ return "LARGE_COMMON";
if (SectionIndex >= ELF::SHN_LOPROC && SectionIndex <= ELF::SHN_HIPROC)
return std::string("PRC[0x") +
to_string(format_hex_no_prefix(SectionIndex, 4)) + "]";
|
First, could you provide justification for why we need to make changes related to COMMON blocks? These have been obsolescent since the Fortran 90 standard and were deleted entirely from Fortran 2018. Two main concerns with the proposed changes: Increasing sizeof(MCSymbolELF) negatively impacts memory usage. SHN_AMD64_LCOMMON is a Solaris-specific constant name, not used by glibc/binutils. For 161697, I don't think we should add |
default: | ||
// Find if: | ||
// Processor specific | ||
if (this->Obj.getHeader().e_machine == EM_X86_64 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect to see explicit llvm-readobj tests (i.e. located under test/tools/llvm-readobj) for the new code in ELFDumper.cpp, testing its behaviour. This likely would require yaml2obj support and should be in its own PR (together with the functional llvm-readobj changes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @jh7370. I'll create a separate PR for the llvm-readobj change and add the tests.
Thank you, @MaskRay, for having a look at this. This is an attempt to address #149222 to match the gfortran behavior. The increase in size of MCSymbolELF is concerning. I didn't want to introduce a new kind for this corner case, (e.g. LargeCommon) to the 'enum Kind' in MCSymbol.h, even though there is space in the 'kind' bitfield to hold an additional kind. I'll look for a different way to represent large common symbols without increasing the MCSymbolELF size. Will rename SHN_AMD64_LCOMMON to SHN_X86_64_LCOMMON. For the lld change, I'll remove LbssSection. Thanks. I tested the linker change with the Fortran tests that are included in the llvm-test-suite. I'll see if I can get reliable link times with and without the change with the llvm-test-suite Fortran tests, or a larger application. |
…LFDumper change from this PR
I have some concerns about adding large COMMON support for the obsoleted Fortran feature (and removed from Fortran 2018): Architecture-specific limitation: This feature is x86-64 specific, which introduces architectural divergence for what is essentially a legacy feature. Is this solving a real problem? Could you clarify whether #149222 represents an actual real-world use case? If the goal is to eliminate COMMON blocks entirely, the modern solution is to use modules: define the variable in a module in a single source file and compile it once. This is the recommended practice in modern Fortran.
This approach achieves the same result while working across all architectures, not just x86-64. |
Thanks, @MaskRay, for looking at this. We ran into this issue when trying to build a Fortran SPEC CPU2000 benchmark, where converting common blocks to modules is not feasible. The reduced test case from the issue is representative of the common blocks found in the benchmark. We thought we would match the gfortran behavior, which generates the x86-64 specific section for large common blocks:
where LARGE_COMMON is 0xff02. The llvm-readobj dump of the gfortran produced object is:
where largecomm is x86-64 specific (https://sourceware.org/binutils/docs/as/i386_002dDirectives.html) Thank you for suggesting the COMDAT approach! It's much cleaner. |
Thanks for the comment that this is for a Fortran SPEC CPU2000 benchmark, where converting common blocks to modules is not feasible. The gas
The large data issue is also relevant for aarch64 and riscv64 and I would certainly recommend that they use |
Thanks, @MaskRay. I'll have a look at the COMDAT solution. |
For the medium and large code models on x86-64, the bss is split into two parts,
.bss for regular sized data, and .lbss for data larger than that specified by
-large-data-threshold (for the medium code model). This change tries to mimick
what gcc does for the medium and large code models for Fortran COMMON blocks:
Currently, we generate:
and ld and lld both fail to link with -mcmodel=medium.
/usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax
or
ld.lld: error: /usr/lib/gcc/x86_64-redhat-linux/11/crtbeginS.o:(.text+0x76): relocation R_X86_64_PC32 out of range: 48394103061 is not in [-2147483648, 2147483647]; references section '.bss'
With this change, __BLNK is marked as LARGE_COMMON (SHN_AMD64_LCOMMON),
and ends up in .lbss.
*.o:
Sections:
...
1 .lbss 00000000 0000000000000000 0000000000000000 000000fb 2**0
ALLOC
...
SYMBOL TABLE:
0000000000000008 O COM 0000000000000008 ccc_
0000000b44834508 O LARGE_COMMON 0000000000000008 BLNK
a.out:
Sections:
...
25 .bss 00000009 0000000000003cf0 0000000000003cf0 00000cf0 23
ALLOC
26 .lbss b44834508 0000000000003d00 0000000000003d00 00000cf0 23
ALLOC
SYMBOL TABLE:
0000000000003cf0 g O .bss 0000000000000008 ccc_
0000000000003d00 g O .lbss 0000000b44834508 BLNK
In the assembly, this will appear as:
.section .lbss,"awl",@nobits
.type BLNK,@object
.largecomm BLNK,48394093832,8
.type ccc_,@object
.comm ccc_,8,8
For the large code model, all common blocks will end up in lbss.
Previously, the above example wouldn't link statically with GNU ld.
The change also includes support for parsing the largecomm directive.
There will be a PR for the chages to lld to handle the SHN_AMD64_LCOMMON
section.