Commit 31e4c81

RISC-V: Optimized search on mapping symbols
For ELF files with many symbols and/or sections (static libraries, partially linked files [e.g. vmlinux.o] or large object files), the disassembler is drastically slowed down by looking up the suitable mapping symbol.

This is caused by the fact that:

- It used an inefficient linear search to find the suitable mapping symbol,
- symtab_pos is not always a good hint for forward linear search and
- The symbol table accessible by the disassembler is sorted by address and then section (not section, then address).

Together, these sometimes force O(n^2) total time when searching for the suitable mapping symbol for a given address.

This commit implements:

- A binary search to look up the suitable mapping symbol (O(log(n)) time per lookup call, O(n*log(n)) time on initialization),
- A separate mapping symbol table, sorted by section and then address (unless the section to disassemble is NULL),
- A very short linear search, even faster than binary search, when disassembling consecutive addresses (usually traverses only 1 or 2 symbols; O(n) in the worst case, but this is only expected on adversarial samples) and
- Efficient tracking of mapping symbols with an ISA string (by propagating the arch field of "$x+(arch)" to succeeding "$x" symbols).

It also changes when the disassembler reuses the last mapping symbol. This commit uses only the last disassembled address to determine whether the last mapping symbol should be reused.

This commit doesn't improve disassembler performance much on regular programs in general. However, it expects >50% disassembler performance improvements on some files for which "RISC-V: Use faster hash table on disassembling" was not effective enough.

On bigger libraries, the following speedups were observed during the benchmark:

- x 2.13 - 2.22   : Static library : Newlib (libc.a)
- x 5.67 - 6.09   : Static library : GNU libc (libc.a)
- x 11.72 - 12.04 : Shared library : OpenSSL (libcrypto.so)
- x 96.29         : Shared library : LLVM 14 (libLLVM-14.so)

opcodes/ChangeLog:

        * disassemble.c (disassemble_free_target): Call new disassemble_free_riscv function to free the memory.
        * disassemble.h (disassemble_free_riscv): Declare.
        * riscv-dis.c (struct riscv_mapping_sym): Separate structure to represent a mapping symbol and ISA string corresponding to it.
        (struct riscv_private_data): Add mapping symbol-related fields. Add is_elf_with_mapsyms.
        (last_map_symbol, last_stop_offset): Remove. The role is replaced by riscv_private_data.{last_mapping_sym,expected_next_addr}.
        (from_last_map_symbol): Remove as this is no longer required with the new design.
        (init_riscv_dis_private_data): Initialize new fields. Filter mapping symbols and make a separate mapping symbol table.
        (compare_mapping_syms_without_section): New function to sort mapping symbols when the current section is NULL.
        (compare_mapping_syms_with_section): New function to sort mapping symbols when the current section is not NULL.
        (riscv_propagate_prev_arch_for_mapping_syms): New function to propagate arch field to succeeding mapping "$x" symbols.
        (init_riscv_dis_private_data_for_section): Reset last_mapping_sym. Sort the mapping symbol table depending on the current section and propagate arch field.
        (riscv_get_map_state): Remove.
        (riscv_search_mapping_sym): Do a binary search to update the mapping state but without reinitializing the architecture here.
        (riscv_search_mapping_symbol): Use riscv_search_mapping_sym to do an optimized lookup. Reuse the last mapping symbol if able. Use is_elf_with_mapsyms to determine whether the object is an ELF one with mapping symbols.
        (riscv_data_length): Use last_mapping_sym instead of last_map_symbol.
        (print_insn_riscv): Add a comment. Update the architecture if the suitable mapping symbol in the table has a non-default one. Update expected_next_addr here.
        (disassemble_free_riscv): Free the mapping symbol table.
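The lookup strategy described in the commit message can be illustrated with a small, self-contained sketch. This is not the code from riscv-dis.c: the struct layout, the integer section index (standing in for a section pointer) and every function name below are hypothetical, and resetting the propagated arch at each section boundary is an assumption of this sketch. It shows a table sorted by section and then address, arch propagation to plain "$x" symbols, a binary search, and the short forward walk used for consecutive addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a mapping symbol entry.  */
enum map_state { MAP_DATA, MAP_INSN };

struct mapping_sym
{
  int section;           /* Section index (stand-in for a section pointer).  */
  uint64_t address;
  enum map_state state;  /* "$d" -> MAP_DATA, "$x" -> MAP_INSN.  */
  const char *arch;      /* ISA string of "$x<arch>", or NULL.  */
};

/* qsort comparator: order by section first, then by address.  */
static int
compare_section_then_address (const void *pa, const void *pb)
{
  const struct mapping_sym *a = pa, *b = pb;
  if (a->section != b->section)
    return a->section < b->section ? -1 : 1;
  if (a->address != b->address)
    return a->address < b->address ? -1 : 1;
  return 0;
}

/* Copy the arch of each "$x<arch>" to the plain "$x" symbols that follow
   it.  Assumption of this sketch: the arch resets at section boundaries.  */
static void
propagate_arch (struct mapping_sym *syms, size_t n)
{
  const char *cur = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (i == 0 || syms[i].section != syms[i - 1].section)
        cur = NULL;
      if (syms[i].state != MAP_INSN)
        continue;
      if (syms[i].arch != NULL)
        cur = syms[i].arch;
      else
        syms[i].arch = cur;
    }
}

/* Binary search: index of the last symbol in SECTION whose address is
   <= ADDR, or -1 if there is none.  O(log n) per call.  */
static long
search_mapping_sym (const struct mapping_sym *syms, size_t n,
                    int section, uint64_t addr)
{
  size_t lo = 0, hi = n;  /* Find the first entry after (section, addr).  */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      const struct mapping_sym *m = &syms[mid];
      if (m->section < section
          || (m->section == section && m->address <= addr))
        lo = mid + 1;
      else
        hi = mid;
    }
  if (lo == 0 || syms[lo - 1].section != section)
    return -1;
  return (long) (lo - 1);
}

/* Fast path for consecutive addresses: starting from the last hit LAST
   (which must be valid and not after ADDR), walk forward while the next
   symbol in the same section is still <= ADDR.  */
static long
advance_mapping_sym (const struct mapping_sym *syms, size_t n,
                     long last, uint64_t addr)
{
  size_t i = (size_t) last;
  while (i + 1 < n
         && syms[i + 1].section == syms[i].section
         && syms[i + 1].address <= addr)
    i++;
  return (long) i;
}
```

A caller would sort the table once per section change with `qsort (syms, n, sizeof syms[0], compare_section_then_address)`, run `propagate_arch`, and then use `advance_mapping_sym` when the requested address is the one expected after the previous instruction, falling back to `search_mapping_sym` otherwise.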
1 parent 421754c commit 31e4c81

3 files changed: 289 additions, 127 deletions

opcodes/disassemble.c

Lines changed: 1 addition & 0 deletions
@@ -790,6 +790,7 @@ disassemble_free_target (struct disassemble_info *info)
 #endif
 #ifdef ARCH_riscv
     case bfd_arch_riscv:
+      disassemble_free_riscv (info);
       break;
 #endif
 #ifdef ARCH_rs6000

opcodes/disassemble.h

Lines changed: 2 additions & 0 deletions
@@ -105,6 +105,8 @@ extern disassembler_ftype csky_get_disassembler (bfd *);
 extern disassembler_ftype rl78_get_disassembler (bfd *);
 extern disassembler_ftype riscv_get_disassembler (bfd *);
 
+extern void disassemble_free_riscv (struct disassemble_info *);
+
 extern void ATTRIBUTE_NORETURN opcodes_assert (const char *, int);
 
 #define OPCODES_ASSERT(x) \
