Commit cf7513b

drodriguez authored and akadutta committed
[llvm-nm] Improve performance while faking symbols from function starts (llvm#162755)
By default, `nm` looks into `LC_FUNCTION_STARTS` for binaries that have the `MH_NLIST_OUTOFSYNC_WITH_DYLDINFO` flag set, unless the `--no-dyldinfo` flag is passed. The implementation that looked for those `LC_FUNCTION_STARTS` entries in the symbol list was a doubly nested loop that rescanned the symbol list for each `LC_FUNCTION_STARTS` entry. For binaries with a couple of million function starts and hundreds of thousands of symbols, the nested loop takes hours even on powerful machines and does not seem to finish.

Instead of the nested loop, trade memory for time: add the addresses of all the symbols to a set, which can then be checked very quickly for each `LC_FUNCTION_STARTS` entry. What took hours and did not seem to finish now takes less than 10 seconds.

Fixes llvm#93944
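The change described above can be illustrated outside of LLVM. The following is a minimal sketch, not the code from this commit: `Symbol` and `missingFunctionStarts` are hypothetical names, and `std::unordered_set` stands in for `llvm::SmallSet` so the example builds with only the standard library.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for llvm-nm's NMSymbol; only the address matters here.
struct Symbol {
  uint64_t Address;
};

// Returns the function-start addresses that have no matching symbol.
// FunctionStarts holds offsets relative to BaseAddress, as FoundFns does
// in the patch.
std::vector<uint64_t>
missingFunctionStarts(const std::vector<Symbol> &Symbols,
                      const std::vector<uint64_t> &FunctionStarts,
                      uint64_t BaseAddress) {
  // Build the lookup set once: O(S) time and memory.
  std::unordered_set<uint64_t> SymbolAddresses;
  SymbolAddresses.reserve(Symbols.size());
  for (const Symbol &S : Symbols)
    SymbolAddresses.insert(S.Address);

  // Each membership test is O(1) on average, so this loop is O(F) overall
  // instead of O(F * S) for the nested scan.
  std::vector<uint64_t> Missing;
  for (uint64_t Offset : FunctionStarts) {
    uint64_t Address = Offset + BaseAddress;
    if (SymbolAddresses.count(Address) == 0)
      Missing.push_back(Address);
  }
  return Missing;
}
```

With F function starts and S symbols, the nested scan does on the order of F * S comparisons, while this version does roughly F + S set operations at the cost of storing S addresses.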
1 parent 3b295a9 commit cf7513b

1 file changed, +12 -8 lines changed

llvm/tools/llvm-nm/llvm-nm.cpp

Lines changed: 12 additions & 8 deletions
@@ -15,6 +15,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/BinaryFormat/COFF.h"
 #include "llvm/BinaryFormat/MachO.h"
@@ -1615,15 +1616,18 @@ static void dumpSymbolsFromDLInfoMachO(MachOObjectFile &MachO,
     }
     // See if these addresses are already in the symbol table.
     unsigned FunctionStartsAdded = 0;
+    // The addresses from FoundFns come from LC_FUNCTION_STARTS. Its contents
+    // are delta encoded addresses from the start of __TEXT, ending when zero
+    // is found. Because of this, the addresses should be unique, and even if
+    // we create fake entries on SymbolList in the second loop, SymbolAddresses
+    // should not need to be updated there.
+    SmallSet<uint64_t, 32> SymbolAddresses;
+    for (const auto &S : SymbolList)
+      SymbolAddresses.insert(S.Address);
     for (uint64_t f = 0; f < FoundFns.size(); f++) {
-      bool found = false;
-      for (unsigned J = 0; J < SymbolList.size() && !found; ++J) {
-        if (SymbolList[J].Address == FoundFns[f] + BaseSegmentAddress)
-          found = true;
-      }
-      // See this address is not already in the symbol table fake up an
-      // nlist for it.
-      if (!found) {
+      // See if this address is already in the symbol table, otherwise fake up
+      // an nlist for it.
+      if (!SymbolAddresses.contains(FoundFns[f] + BaseSegmentAddress)) {
         NMSymbol F = {};
         F.Name = "<redacted function X>";
         F.Address = FoundFns[f] + BaseSegmentAddress;
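The new comment in the patch relies on `LC_FUNCTION_STARTS` being a delta-encoded list that ends at zero. As a rough illustration of why the decoded addresses come out unique, here is a minimal sketch of such a decoder, assuming the payload is a stream of ULEB128-encoded deltas. `decodeFunctionStarts` is a hypothetical helper written for this example, not a function from llvm-nm.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes a ULEB128 delta stream into offsets from the start of __TEXT.
// Decoding stops at a zero delta or at the end of the buffer.
std::vector<uint64_t> decodeFunctionStarts(const std::vector<uint8_t> &Data) {
  std::vector<uint64_t> Offsets;
  uint64_t Offset = 0;
  size_t I = 0;
  while (I < Data.size()) {
    // Read one ULEB128 value: 7 payload bits per byte, low bits first,
    // high bit set on every byte except the last.
    uint64_t Delta = 0;
    unsigned Shift = 0;
    bool Complete = false;
    while (I < Data.size() && Shift < 64) {
      uint8_t Byte = Data[I++];
      Delta |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
      Shift += 7;
      if ((Byte & 0x80) == 0) {
        Complete = true;
        break;
      }
    }
    if (!Complete || Delta == 0)
      break; // Truncated value or zero terminator.
    Offset += Delta; // Nonzero deltas keep offsets increasing (no overflow).
    Offsets.push_back(Offset);
  }
  return Offsets;
}
```

Because every accepted delta is nonzero, each decoded offset is larger than the previous one (barring 64-bit overflow), which is consistent with the comment's claim that the set does not need updating while fake entries are added.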