[RISCV][LLD] Zcmt RISC-V extension in lld #163142
Conversation
You can test this locally with the following command:

git-clang-format --diff origin/main HEAD --extensions cpp,h -- lld/ELF/Arch/RISCV.cpp lld/ELF/Config.h lld/ELF/Driver.cpp lld/ELF/SyntheticSections.cpp lld/ELF/SyntheticSections.h lld/ELF/Target.h lld/ELF/Writer.cpp --diff_from_common_commit

View the diff from clang-format here:

diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index e65e1f2c1..2870fd18b 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -4872,11 +4872,12 @@ template <class ELFT> void elf::createSyntheticSections(Ctx &ctx) {
add(*ctx.in.ppc64LongBranchTarget);
}
- if (ctx.arg.emachine == EM_RISCV && ctx.arg.relaxTbljal) {
+ if (ctx.arg.emachine == EM_RISCV && ctx.arg.relaxTbljal) {
ctx.in.riscvTableJumpSection = std::make_unique<TableJumpSection>(ctx);
add(*ctx.in.riscvTableJumpSection);
- Symbol *s = ctx.symtab->addSymbol(Defined{ctx,
+ Symbol *s = ctx.symtab->addSymbol(Defined{
+ ctx,
/*file=*/nullptr, "__jvt_base$", STB_GLOBAL, STT_NOTYPE, STT_NOTYPE,
/*value=*/0, /*size=*/0, ctx.in.riscvTableJumpSection.get()});
s->isUsedInRegularObj = true;
I tried zephyr with this. It seems with … Without … we get a …
lld/ELF/Arch/RISCV.cpp
Outdated
if (finalizedCMJALTEntries.size() > 0 &&
    getSizeReduction() < CMJTSizeReduction) {
  // Stop relax to cm.jalt if there will be the code reduction of cm.jalt is
This comment is very hard to read.
lld/ELF/Arch/RISCV.cpp
Outdated
      return (startCMJALTEntryIdx + finalizedCMJALTEntries.size()) *
             ctx.arg.wordsize;
    return (startCMJTEntryIdx + finalizedCMJTEntries.size()) * ctx.arg.wordsize;
  } else {
No else after return
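For context, the current revision of getSize() in the diff below already applies this: it returns early inside the isFinalized branch and drops the else. Condensed:

size_t TableJumpSection::getSize() const {
  if (isFinalized) {
    if (!finalizedCMJALTEntries.empty())
      return (startCMJALTEntryIdx + finalizedCMJALTEntries.size()) *
             ctx.arg.wordsize;
    return (startCMJTEntryIdx + finalizedCMJTEntries.size()) * ctx.arg.wordsize;
  }
  // ... the same pattern is repeated for the not-yet-finalized candidate lists ...
}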
lld/ELF/Arch/RISCV.cpp
Outdated
  auto finalizedVector = tempEntryVector;
  if (tempEntryVector.size() >= maxSize)
    finalizedVector =
Can this be finalizedVector.resize(maxSize)?
Co-authored-by: Craig Topper <[email protected]>
* move jvtAlign to the rest of the variables
* improve comment
* use resize (with added test)
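For reference, the relevant part of finalizeEntry in the updated diff below now reads as follows (condensed). Note that resize() may also grow the vector with value-initialized {nullptr, 0} pairs; the trailing loop then discards them, since their saving is below ctx.arg.wordsize:

  auto finalizedVector = tempEntryVector;
  finalizedVector.resize(maxSize);
  // Drop padding and any entries whose savings do not cover a table slot.
  while (!finalizedVector.empty() &&
         finalizedVector.rbegin()->second < ctx.arg.wordsize)
    finalizedVector.pop_back();
  return finalizedVector;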
@llvm/pr-subscribers-lld-elf @llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-lld

Author: Robin Kastberg (RobinKastberg)

Changes

This is a rebase of the seemingly abandoned PR #77884.

TODO:
Maybe not blocking..?
Patch is 29.68 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/163142.diff

12 Files Affected:
diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index dc2ab97e9d9be..255700368fa2a 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -38,6 +38,8 @@ class RISCV final : public TargetInfo {
void writePltHeader(uint8_t *buf) const override;
void writePlt(uint8_t *buf, const Symbol &sym,
uint64_t pltEntryAddr) const override;
+ void writeTableJumpHeader(uint8_t *buf) const override;
+ void writeTableJumpEntry(uint8_t *buf, const uint64_t symbol) const override;
RelType getDynRel(RelType type) const override;
RelExpr getRelExpr(RelType type, const Symbol &s,
const uint8_t *loc) const override;
@@ -70,6 +72,7 @@ class RISCV final : public TargetInfo {
#define INTERNAL_R_RISCV_GPREL_S 257
#define INTERNAL_R_RISCV_X0REL_I 258
#define INTERNAL_R_RISCV_X0REL_S 259
+#define INTERNAL_R_RISCV_TBJAL 260
const uint64_t dtpOffset = 0x800;
@@ -269,6 +272,20 @@ void RISCV::writePlt(uint8_t *buf, const Symbol &sym,
write32le(buf + 12, itype(ADDI, 0, 0, 0));
}
+void RISCV::writeTableJumpHeader(uint8_t *buf) const {
+ if (ctx.arg.is64)
+ write64le(buf, ctx.mainPart->dynamic->getVA());
+ else
+ write32le(buf, ctx.mainPart->dynamic->getVA());
+}
+
+void RISCV::writeTableJumpEntry(uint8_t *buf, const uint64_t address) const {
+ if (ctx.arg.is64)
+ write64le(buf, address);
+ else
+ write32le(buf, address);
+}
+
RelType RISCV::getDynRel(RelType type) const {
return type == ctx.target->symbolicRel ? type
: static_cast<RelType>(R_RISCV_NONE);
@@ -490,6 +507,9 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
return;
}
+ case INTERNAL_R_RISCV_TBJAL:
+ return;
+
case R_RISCV_ADD8:
*loc += val;
return;
@@ -739,6 +759,32 @@ void elf::initSymbolAnchors(Ctx &ctx) {
}
}
+static bool relaxTableJump(Ctx &ctx, const InputSection &sec, size_t i,
+ uint64_t loc, Relocation &r, uint32_t &remove) {
+ if (!ctx.in.riscvTableJumpSection ||
+ !ctx.in.riscvTableJumpSection->isFinalized)
+ return false;
+
+ const uint32_t jalr = read32le(sec.contentMaybeDecompress().data() +
+ r.offset + (r.type == R_RISCV_JAL ? 0 : 4));
+ const uint8_t rd = extractBits(jalr, 11, 7);
+ int tblEntryIndex = -1;
+ if (rd == X_X0) {
+ tblEntryIndex = ctx.in.riscvTableJumpSection->getCMJTEntryIndex(r.sym);
+ } else if (rd == X_RA) {
+ tblEntryIndex = ctx.in.riscvTableJumpSection->getCMJALTEntryIndex(r.sym);
+ }
+
+ if (tblEntryIndex >= 0) {
+ sec.relaxAux->relocTypes[i] = INTERNAL_R_RISCV_TBJAL;
+ sec.relaxAux->writes.push_back(0xA002 |
+ (tblEntryIndex << 2)); // cm.jt or cm.jalt
+ remove = (r.type == R_RISCV_JAL ? 2 : 6);
+ return true;
+ }
+ return false;
+}
+
// Relax R_RISCV_CALL/R_RISCV_CALL_PLT auipc+jalr to c.j, c.jal, or jal.
static void relaxCall(Ctx &ctx, const InputSection &sec, size_t i, uint64_t loc,
Relocation &r, uint32_t &remove) {
@@ -761,6 +807,8 @@ static void relaxCall(Ctx &ctx, const InputSection &sec, size_t i, uint64_t loc,
sec.relaxAux->relocTypes[i] = R_RISCV_RVC_JUMP;
sec.relaxAux->writes.push_back(0x2001); // c.jal
remove = 6;
+ } else if (remove >= 6 && relaxTableJump(ctx, sec, i, loc, r, remove)) {
+ // relaxTableJump sets remove
} else if (remove >= 4 && isInt<21>(displace)) {
sec.relaxAux->relocTypes[i] = R_RISCV_JAL;
sec.relaxAux->writes.push_back(0x6f | rd << 7); // jal
@@ -884,6 +932,11 @@ static bool relax(Ctx &ctx, int pass, InputSection &sec) {
relaxCall(ctx, sec, i, loc, r, remove);
}
break;
+ case R_RISCV_JAL:
+ if (relaxable(relocs, i)) {
+ relaxTableJump(ctx, sec, i, loc, r, remove);
+ }
+ break;
case R_RISCV_TPREL_HI20:
case R_RISCV_TPREL_ADD:
case R_RISCV_TPREL_LO12_I:
@@ -1138,6 +1191,12 @@ void RISCV::finalizeRelax(int passes) const {
case INTERNAL_R_RISCV_X0REL_I:
case INTERNAL_R_RISCV_X0REL_S:
break;
+ case INTERNAL_R_RISCV_TBJAL:
+ assert(ctx.arg.relaxTbljal);
+ assert((aux.writes[writesIdx] & 0xfc03) == 0xA002);
+ skip = 2;
+ write16le(p, aux.writes[writesIdx++]);
+ break;
case R_RISCV_RELAX:
// Used by relaxTlsLe to indicate the relocation is ignored.
break;
@@ -1149,6 +1208,8 @@ void RISCV::finalizeRelax(int passes) const {
skip = 4;
write32le(p, aux.writes[writesIdx++]);
break;
+ case R_RISCV_64:
+ break;
case R_RISCV_32:
// Used by relaxTlsLe to write a uint32_t then suppress the handling
// in relocateAlloc.
@@ -1476,3 +1537,219 @@ void elf::mergeRISCVAttributesSections(Ctx &ctx) {
}
void elf::setRISCVTargetInfo(Ctx &ctx) { ctx.target.reset(new RISCV(ctx)); }
+
+TableJumpSection::TableJumpSection(Ctx &ctx)
+ : SyntheticSection(ctx, ".riscv.jvt", SHT_PROGBITS,
+ SHF_ALLOC | SHF_EXECINSTR, tableAlign) {}
+
+void TableJumpSection::addCMJTEntryCandidate(const Symbol *symbol,
+ int csReduction) {
+ addEntry(symbol, CMJTEntryCandidates, csReduction);
+}
+
+int TableJumpSection::getCMJTEntryIndex(const Symbol *symbol) {
+ uint32_t index = getIndex(symbol, maxCMJTEntrySize, finalizedCMJTEntries);
+ return index < finalizedCMJTEntries.size() ? (int)(startCMJTEntryIdx + index)
+ : -1;
+}
+
+void TableJumpSection::addCMJALTEntryCandidate(const Symbol *symbol,
+ int csReduction) {
+ addEntry(symbol, CMJALTEntryCandidates, csReduction);
+}
+
+int TableJumpSection::getCMJALTEntryIndex(const Symbol *symbol) {
+ uint32_t index = getIndex(symbol, maxCMJALTEntrySize, finalizedCMJALTEntries);
+ return index < finalizedCMJALTEntries.size()
+ ? (int)(startCMJALTEntryIdx + index)
+ : -1;
+}
+
+void TableJumpSection::addEntry(
+ const Symbol *symbol, llvm::DenseMap<const Symbol *, int> &entriesList,
+ int csReduction) {
+ entriesList[symbol] += csReduction;
+}
+
+uint32_t TableJumpSection::getIndex(
+ const Symbol *symbol, uint32_t maxSize,
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+ &entriesList) {
+ // Find this symbol in the ordered list of entries if it exists.
+ assert(maxSize >= entriesList.size() &&
+ "Finalized vector of entries exceeds maximum");
+ auto idx = std::find_if(
+ entriesList.begin(), entriesList.end(),
+ [symbol](llvm::detail::DenseMapPair<const Symbol *, int> &e) {
+ return e.first == symbol;
+ });
+
+ if (idx == entriesList.end())
+ return entriesList.size();
+ return idx - entriesList.begin();
+}
+
+void TableJumpSection::scanTableJumpEntries(const InputSection &sec) const {
+ for (auto [i, r] : llvm::enumerate(sec.relocations)) {
+ Defined *definedSymbol = dyn_cast<Defined>(r.sym);
+ if (!definedSymbol)
+ continue;
+ if (i + 1 == sec.relocs().size() ||
+ sec.relocs()[i + 1].type != R_RISCV_RELAX)
+ continue;
+ switch (r.type) {
+ case R_RISCV_JAL:
+ case R_RISCV_CALL:
+ case R_RISCV_CALL_PLT: {
+ const uint32_t jalr =
+ read32le(sec.contentMaybeDecompress().data() + r.offset +
+ (r.type == R_RISCV_JAL ? 0 : 4));
+ const uint8_t rd = extractBits(jalr, 11, 7);
+
+ int csReduction = 6;
+ if (sec.relaxAux->relocTypes[i] == R_RISCV_RVC_JUMP)
+ continue;
+ else if (sec.relaxAux->relocTypes[i] == R_RISCV_JAL)
+ csReduction = 2;
+
+ if (rd == 0)
+ ctx.in.riscvTableJumpSection->addCMJTEntryCandidate(r.sym, csReduction);
+ else if (rd == X_RA)
+ ctx.in.riscvTableJumpSection->addCMJALTEntryCandidate(r.sym,
+ csReduction);
+ }
+ }
+ }
+}
+
+void TableJumpSection::finalizeContents() {
+ if (isFinalized)
+ return;
+ isFinalized = true;
+
+ finalizedCMJTEntries = finalizeEntry(CMJTEntryCandidates, maxCMJTEntrySize);
+ CMJTEntryCandidates.clear();
+ int32_t CMJTSizeReduction = getSizeReduction();
+ finalizedCMJALTEntries =
+ finalizeEntry(CMJALTEntryCandidates, maxCMJALTEntrySize);
+ CMJALTEntryCandidates.clear();
+
+ if (!finalizedCMJALTEntries.empty() &&
+ getSizeReduction() < CMJTSizeReduction) {
+ // In memory, the cm.jt table occupies the first 0x20 entries.
+ // To be able to use the cm.jalt table which comes afterwards
+ // it is necessary to pad out the cm.jt table.
+ // Remove cm.jalt entries if the code reduction of cm.jalt is
+ // smaller than the size of the padding.
+ finalizedCMJALTEntries.clear();
+ }
+ // if table jump still got negative effect, give up.
+ if (getSizeReduction() <= 0) {
+ warn("Table Jump Relaxation didn't got any reduction for code size.");
+ finalizedCMJTEntries.clear();
+ }
+}
+
+// Sort the map in decreasing order of the amount of code reduction provided
+// by the entries. Drop any entries that can't fit in the map from the tail
+// end since they provide less code reduction. Drop any entries that cause
+// an increase in code size (i.e. the reduction from instruction conversion
+// does not cover the code size gain from adding a table entry).
+SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+TableJumpSection::finalizeEntry(llvm::DenseMap<const Symbol *, int> EntryMap,
+ uint32_t maxSize) {
+ auto cmp = [](const llvm::detail::DenseMapPair<const Symbol *, int> &p1,
+ const llvm::detail::DenseMapPair<const Symbol *, int> &p2) {
+ return p1.second > p2.second;
+ };
+
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+ tempEntryVector;
+ std::copy(EntryMap.begin(), EntryMap.end(),
+ std::back_inserter(tempEntryVector));
+ std::sort(tempEntryVector.begin(), tempEntryVector.end(), cmp);
+
+ auto finalizedVector = tempEntryVector;
+
+ finalizedVector.resize(maxSize);
+
+ // Drop any items that have a negative effect (i.e. increase code size).
+ while (!finalizedVector.empty()) {
+ if (finalizedVector.rbegin()->second < ctx.arg.wordsize)
+ finalizedVector.pop_back();
+ else
+ break;
+ }
+ return finalizedVector;
+}
+
+size_t TableJumpSection::getSize() const {
+ if (isFinalized) {
+ if (!finalizedCMJALTEntries.empty())
+ return (startCMJALTEntryIdx + finalizedCMJALTEntries.size()) *
+ ctx.arg.wordsize;
+ return (startCMJTEntryIdx + finalizedCMJTEntries.size()) * ctx.arg.wordsize;
+ }
+
+ if (!CMJALTEntryCandidates.empty())
+ return (startCMJALTEntryIdx + CMJALTEntryCandidates.size()) *
+ ctx.arg.wordsize;
+ return (startCMJTEntryIdx + CMJTEntryCandidates.size()) * ctx.arg.wordsize;
+}
+
+int32_t TableJumpSection::getSizeReduction() {
+ // The total reduction in code size is J + JA - JTS - JAE.
+ // Where:
+ // J = number of bytes saved for all the cm.jt instructions emitted
+ // JA = number of bytes saved for all the cm.jalt instructions emitted
+ // JTS = size of the part of the table for cm.jt jumps (i.e. 32 x wordsize)
+ // JAE = number of entries emitted for the cm.jalt jumps x wordsize
+
+ int32_t sizeReduction = -getSize();
+ for (auto entry : finalizedCMJTEntries) {
+ sizeReduction += entry.second;
+ }
+ for (auto entry : finalizedCMJALTEntries) {
+ sizeReduction += entry.second;
+ }
+ return sizeReduction;
+}
+
+void TableJumpSection::writeTo(uint8_t *buf) {
+ if (getSizeReduction() <= 0)
+ return;
+ ctx.target->writeTableJumpHeader(buf);
+ writeEntries(buf + startCMJTEntryIdx * ctx.arg.wordsize,
+ finalizedCMJTEntries);
+ if (finalizedCMJALTEntries.size() > 0) {
+ padWords(buf + ((startCMJTEntryIdx + finalizedCMJTEntries.size()) *
+ ctx.arg.wordsize),
+ startCMJALTEntryIdx);
+ writeEntries(buf + (startCMJALTEntryIdx * ctx.arg.wordsize),
+ finalizedCMJALTEntries);
+ }
+}
+
+void TableJumpSection::padWords(uint8_t *buf, const uint8_t maxWordCount) {
+ for (size_t i = 0; i < maxWordCount; ++i) {
+ if (ctx.arg.is64)
+ write64le(buf + i, 0);
+ else
+ write32le(buf + i, 0);
+ }
+}
+
+void TableJumpSection::writeEntries(
+ uint8_t *buf,
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+ &entriesList) {
+ for (const auto &entry : entriesList) {
+ assert(entry.second > 0);
+ // Use the symbol from in.symTab to ensure we have the final adjusted
+ // symbol.
+ if (!entry.first->isDefined())
+ continue;
+ ctx.target->writeTableJumpEntry(buf, entry.first->getVA(ctx, 0));
+ buf += ctx.arg.wordsize;
+ }
+}
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index fd57967a1d21f..58ec6919fb82b 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -67,6 +67,7 @@ class MipsGotSection;
class MipsRldMapSection;
class PPC32Got2Section;
class PPC64LongBranchTargetSection;
+class TableJumpSection;
class PltSection;
class RelocationBaseSection;
class RelroPaddingSection;
@@ -369,6 +370,7 @@ struct Config {
bool resolveGroups;
bool relrGlibc = false;
bool relrPackDynRelocs = false;
+ bool relaxTbljal;
llvm::DenseSet<llvm::StringRef> saveTempsArgs;
llvm::SmallVector<std::pair<llvm::GlobPattern, uint32_t>, 0> shuffleSections;
bool singleRoRx;
@@ -581,6 +583,7 @@ struct InStruct {
std::unique_ptr<RelroPaddingSection> relroPadding;
std::unique_ptr<SyntheticSection> armCmseSGSection;
std::unique_ptr<PPC64LongBranchTargetSection> ppc64LongBranchTarget;
+ std::unique_ptr<TableJumpSection> riscvTableJumpSection;
std::unique_ptr<SyntheticSection> mipsAbiFlags;
std::unique_ptr<MipsGotSection> mipsGot;
std::unique_ptr<SyntheticSection> mipsOptions;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 62f7fffce7dbe..9cac927615839 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1621,6 +1621,7 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
}
ctx.arg.zCombreloc = getZFlag(args, "combreloc", "nocombreloc", true);
ctx.arg.zCopyreloc = getZFlag(args, "copyreloc", "nocopyreloc", true);
+ ctx.arg.relaxTbljal = args.hasArg(OPT_relax_tbljal);
ctx.arg.zForceBti = hasZOption(args, "force-bti");
ctx.arg.zForceIbt = hasZOption(args, "force-ibt");
ctx.arg.zZicfilp = getZZicfilp(ctx, args);
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index 0d6dda4b60d3a..3a83f6c36a91c 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -378,6 +378,9 @@ defm use_android_relr_tags: BB<"use-android-relr-tags",
"Use SHT_ANDROID_RELR / DT_ANDROID_RELR* tags instead of SHT_RELR / DT_RELR*",
"Use SHT_RELR / DT_RELR* tags (default)">;
+def relax_tbljal: FF<"relax-tbljal">,
+ HelpText<"Enable conversion of call instructions to table jump instruction from the Zcmt extension for frequently called functions (RISC-V only)">;
+
def pic_veneer: F<"pic-veneer">,
HelpText<"Always generate position independent thunks (veneers)">;
diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index bbf4b29a9fda5..e65e1f2c19048 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -4872,6 +4872,16 @@ template <class ELFT> void elf::createSyntheticSections(Ctx &ctx) {
add(*ctx.in.ppc64LongBranchTarget);
}
+ if (ctx.arg.emachine == EM_RISCV && ctx.arg.relaxTbljal) {
+ ctx.in.riscvTableJumpSection = std::make_unique<TableJumpSection>(ctx);
+ add(*ctx.in.riscvTableJumpSection);
+
+ Symbol *s = ctx.symtab->addSymbol(Defined{ctx,
+ /*file=*/nullptr, "__jvt_base$", STB_GLOBAL, STT_NOTYPE, STT_NOTYPE,
+ /*value=*/0, /*size=*/0, ctx.in.riscvTableJumpSection.get()});
+ s->isUsedInRegularObj = true;
+ }
+
ctx.in.gotPlt = std::make_unique<GotPltSection>(ctx);
add(*ctx.in.gotPlt);
ctx.in.igotPlt = std::make_unique<IgotPltSection>(ctx);
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index ac3ec63f0a7a5..03b83cf488f47 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -380,6 +380,53 @@ class GotPltSection final : public SyntheticSection {
SmallVector<const Symbol *, 0> entries;
};
+class TableJumpSection final : public SyntheticSection {
+public:
+ TableJumpSection(Ctx &);
+ size_t getSize() const override;
+ void writeTo(uint8_t *buf) override;
+ void finalizeContents() override;
+
+ int32_t getSizeReduction();
+ void addCMJTEntryCandidate(const Symbol *symbol, int csReduction);
+ int getCMJTEntryIndex(const Symbol *symbol);
+ void addCMJALTEntryCandidate(const Symbol *symbol, int csReduction);
+ int getCMJALTEntryIndex(const Symbol *symbol);
+ void scanTableJumpEntries(const InputSection &sec) const;
+
+ bool isFinalized = false;
+
+private:
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+ finalizeEntry(llvm::DenseMap<const Symbol *, int> EntryMap, uint32_t maxSize);
+ void addEntry(const Symbol *symbol,
+ llvm::DenseMap<const Symbol *, int> &entriesList,
+ int csReduction);
+ uint32_t getIndex(const Symbol *symbol, uint32_t maxSize,
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>,
+ 0> &entriesList);
+ void writeEntries(uint8_t *buf,
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>,
+ 0> &entriesList);
+ void padWords(uint8_t *buf, const uint8_t maxWordCount);
+
+ // used in finalizeContents function.
+ static constexpr size_t maxCMJTEntrySize = 32;
+ static constexpr size_t maxCMJALTEntrySize = 224;
+
+ static constexpr size_t startCMJTEntryIdx = 0;
+ static constexpr size_t startCMJALTEntryIdx = 32;
+
+ static constexpr size_t tableAlign = 64;
+
+ llvm::DenseMap<const Symbol *, int> CMJTEntryCandidates;
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+ finalizedCMJTEntries;
+ llvm::DenseMap<const Symbol *, int> CMJALTEntryCandidates;
+ SmallVector<llvm::detail::DenseMapPair<const Symbol *, int>, 0>
+ finalizedCMJALTEntries;
+};
+
// The IgotPltSection is a Got associated with the PltSection for GNU Ifunc
// Symbols that will be relocated by Target->IRelativeRel.
// On most Targets the IgotPltSection will immediately follow the GotPltSection
diff --git a/lld/ELF/Target.h b/lld/ELF/Target.h
index 9f0605138a4fb..ebb8eae2fea1c 100644
--- a/lld/ELF/Target.h
+++ b/lld/ELF/Target.h
@@ -37,6 +37,9 @@ class TargetInfo {
virtual void writeGotPltHeader(uint8_t *buf) const {}
virtual void writeGotHeader(uint8_t *buf) const {}
virtual void writeGotPlt(uint8_t *buf, const Symbol &s) const {};
+ virtual void writeTableJumpHeader(uint8_t *buf) const {};
+ virtual void writeTableJumpEntry(uint8_t *buf, const uint64_t symbol) const {
+ };
virtual void writeIgotPlt(uint8_t *buf, const Symbol &s) const {}
virtual int64_t getImplicitAddend(const uint8_t *buf, RelType type) const;
virtual int getTlsGdRelaxSkip(RelType type) const { return 1; }
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 4fa80397cbfa7..f8558aab7f372 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -1570,6 +1570,19 @@ template <class ELFT> void Writer<ELFT>::finalizeAddressDependentContent() {
changed |= a32p.createFixes();
}
+ if (ctx.arg.relaxTbljal) {
+ if (!changed) {
+ // scan all R_RISCV_JAL, R_RISCV_CALL/R_RISCV_CALL_PLT for RISCV Zcmt
+ // Jump table.
+ for (InputSectionBase *inputSection : ctx.inputSections) {
+ ctx.in.riscvTableJumpSection->scanTableJumpEntries(
+ cast<InputSection>(*inputSection));
+ }
+ ctx.in.riscvTableJumpSection->finalizeContents();
+ changed |= ctx.target->relaxOnce(pass);
+ }
+ }
+
finalizeSynthetic(ctx, ctx.in.got.get());
if (ctx.in.mipsGot)
ctx.in.mipsGot->updateAllocSize(ctx);
diff --git a/lld/test/ELF/riscv-no-tbljal-call.s b/lld/test/ELF/riscv-no-tbljal-call.s
new file mode 100644
index 0000000000000..61d1d87d11057
--- /dev/null
+++ b/lld/test/ELF/riscv-no-tbljal-call.s
@@ -0,0 +1,33 @@
+# REQUIRES: riscv
...
[truncated]
This patch implements optimizations for the Zcmt extension in lld. A new TableJumpSection has been added. Before linker relaxation, every R_RISCV_CALL/R_RISCV_CALL_PLT relocation in each section is scanned and the target symbol of each call is recorded. In finalizeContents the recorded symbols are sorted in descending order by the number of jumps, and the top symbols are compressed to table jumps during the relaxation process.
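To make the selection step concrete, here is a minimal, self-contained sketch of the idea behind TableJumpSection::finalizeEntry (the names pickEntries/Entry are illustrative, not from the patch): keep the most profitable call targets up to the table capacity, then drop entries whose savings would not pay for their table slot.

#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// {call target, total bytes saved if all its calls become cm.jt/cm.jalt}
using Entry = std::pair<std::string, int>;

std::vector<Entry> pickEntries(std::vector<Entry> candidates, size_t maxSlots,
                               int wordsize) {
  // Most profitable targets first.
  std::sort(candidates.begin(), candidates.end(),
            [](const Entry &a, const Entry &b) { return a.second > b.second; });
  // Table capacity is limited: 32 slots for cm.jt, 224 for cm.jalt.
  if (candidates.size() > maxSlots)
    candidates.resize(maxSlots);
  // Each entry occupies one table word; drop entries that cost more than they save.
  while (!candidates.empty() && candidates.back().second < wordsize)
    candidates.pop_back();
  return candidates;
}

In the actual patch the candidates live in a DenseMap keyed by Symbol*, and getSizeReduction() additionally checks that the table as a whole pays for itself before any rewriting or writing happens.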
This is a continuation of PR #77884
Co-authored-by: Craig Topper [email protected]
Co-authored-by: VincentWu [email protected]
Co-authored-by: Scott Egerton [email protected]
===
TODO (maybe not blocking?):
- R_RISCV_RELAX, from [RISCV] Mark More Relocs as Relaxable #151422
- --gc-sections, see [RISCV][LLD] Zcmt RISC-V extension in lld #163142 (comment)
- --emit-relocs
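As background on what the relaxation emits (an illustrative sketch based on relaxTableJump and the assert in finalizeRelax above, not part of the patch): an auipc+jalr pair, or a lone jal, that targets a table entry is rewritten to a single 16-bit cm.jt/cm.jalt whose index field in bits [9:2] selects the jump-table slot; indices 0-31 select cm.jt entries and 32-255 select cm.jalt entries.

#include <cstdint>

// cm.jt/cm.jalt encoding as emitted by relaxTableJump: base opcode 0xA002
// with the table index placed in bits [9:2].
constexpr uint16_t encodeTableJump(unsigned tblEntryIndex) {
  return static_cast<uint16_t>(0xA002 | (tblEntryIndex << 2));
}

static_assert(encodeTableJump(0) == 0xA002, "index 0 -> plain cm.jt");
static_assert((encodeTableJump(37) & 0xfc03) == 0xA002,
              "index bits stay in [9:2], matching the assert in finalizeRelax");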