For ELF files with many symbols and/or sections (static libraries, partially
linked files [e.g. vmlinux.o] or large object files), the disassembler is
drastically slowed down by looking up the suitable mapping symbol.
This is because:
- The disassembler used an inefficient linear search to find the suitable
mapping symbol,
- symtab_pos is not always a good hint for a forward linear search and
- The symbol table accessible by the disassembler is sorted by address and
then section (not section, then address).
Together, these sometimes force O(n^2) search time when looking up the
suitable mapping symbol for a given address.
This commit implements:
- A binary search to look up the suitable mapping symbol (O(log(n)) time
per lookup call, O(n) time on initialization),
- Separate mapping symbol table, sorted by section and then address
(unless the section to disassemble is NULL),
- A very short linear search, even faster than binary search,
when disassembling consecutive addresses (usually traverses only 1 or 2
symbols, O(n) in the worst case but this is only expected on adversarial
samples) and
- Efficient tracking of mapping symbols with ISA string
(by propagating arch field of "$x+(arch)" to succeeding "$x" symbols).
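
As a rough illustration of the first, second and fourth items above, the
following is a minimal standalone sketch, not the code in riscv-dis.c.
struct map_sym, cmp_section_then_address, propagate_arch and find_map_sym
are hypothetical names, sections are reduced to plain integers, and
resetting the propagated arch at a section boundary is an assumption of
this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for one mapping symbol entry.  */
struct map_sym
{
  int section;       /* Section index (simplification of a section pointer).  */
  uint64_t address;  /* Address marked by the mapping symbol.  */
  int is_data;       /* 0 for "$x" (code), 1 for "$d" (data).  */
  const char *arch;  /* ISA string of "$x<arch>", or NULL if none.  */
};

/* qsort comparator: order by section first, then by address.  */
static int
cmp_section_then_address (const void *pa, const void *pb)
{
  const struct map_sym *a = pa;
  const struct map_sym *b = pb;

  if (a->section != b->section)
    return a->section < b->section ? -1 : 1;
  if (a->address != b->address)
    return a->address < b->address ? -1 : 1;
  return 0;
}

/* Copy the arch of a "$x<arch>" entry to the plain "$x" entries that
   follow it.  SYMS must already be sorted.  Assumption of this sketch:
   the propagated value is reset when the section changes.  */
static void
propagate_arch (struct map_sym *syms, size_t n)
{
  const char *cur = NULL;

  for (size_t i = 0; i < n; i++)
    {
      if (i > 0 && syms[i].section != syms[i - 1].section)
        cur = NULL;
      if (syms[i].is_data)
        continue;
      if (syms[i].arch != NULL)
        cur = syms[i].arch;
      else
        syms[i].arch = cur;
    }
}

/* Binary search: return the last entry in SECTION whose address is not
   greater than ADDR, or NULL if there is none.  O(log(n)) per call.  */
static const struct map_sym *
find_map_sym (const struct map_sym *syms, size_t n, int section,
              uint64_t addr)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (syms[mid].section < section
          || (syms[mid].section == section && syms[mid].address <= addr))
        lo = mid + 1;
      else
        hi = mid;
    }
  /* LO is now the index of the first entry ordered after (SECTION, ADDR).  */
  if (lo == 0 || syms[lo - 1].section != section)
    return NULL;
  return &syms[lo - 1];
}
```

Sorting by section first keeps every section's mapping symbols contiguous,
so a lookup never has to skip over entries of unrelated sections that
happen to share the same address range.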
It also changes when the disassembler reuses the last mapping symbol. This
commit only uses the last disassembled address to determine whether the last
mapping symbol should be reused.
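
The reuse rule and the short linear search can be sketched roughly as
follows. This is a simplified illustration under stated assumptions, not
the actual implementation: struct lookup_state, search_binary and
lookup_mapping_sym are hypothetical names, the table covers a single
section, and the next expected address is updated inside the lookup
function for brevity (the ChangeLog below says print_insn_riscv does this).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mapping symbol entry (single section).  */
struct map_sym
{
  uint64_t address;  /* Address marked by the mapping symbol.  */
  int is_data;       /* 0 for "$x" (code), 1 for "$d" (data).  */
};

/* Hypothetical lookup state, loosely modeled on the last_mapping_sym
   and expected_next_addr fields named in the ChangeLog.  */
struct lookup_state
{
  const struct map_sym *syms;             /* Sorted by address.  */
  size_t count;
  const struct map_sym *last_mapping_sym; /* NULL if none yet.  */
  uint64_t expected_next_addr;
};

/* Fallback: binary search for the last entry at or before ADDR.  */
static const struct map_sym *
search_binary (const struct lookup_state *st, uint64_t addr)
{
  size_t lo = 0, hi = st->count;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (st->syms[mid].address <= addr)
        lo = mid + 1;
      else
        hi = mid;
    }
  return lo == 0 ? NULL : &st->syms[lo - 1];
}

/* Look up the mapping symbol for ADDR.  If ADDR directly follows the
   previously disassembled item, walk forward from the last mapping
   symbol (usually zero or one step); otherwise do a binary search.  */
static const struct map_sym *
lookup_mapping_sym (struct lookup_state *st, uint64_t addr,
                    unsigned insn_len)
{
  const struct map_sym *sym = st->last_mapping_sym;

  if (sym != NULL && addr == st->expected_next_addr)
    {
      const struct map_sym *end = st->syms + st->count;

      while (sym + 1 < end && sym[1].address <= addr)
        sym++;
    }
  else
    sym = search_binary (st, addr);

  st->last_mapping_sym = sym;
  st->expected_next_addr = addr + insn_len;
  return sym;
}
```

The forward walk is only cheap because consecutive addresses rarely cross
more than one mapping symbol; any jump to an unexpected address falls back
to the binary search.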
This commit doesn't improve disassembler performance much on regular
programs in general. However, >50% disassembler performance improvements
are expected on some files for which "RISC-V: Use faster hash table on
disassembling" was not effective enough.
On bigger libraries, the following improvements were observed during the
benchmark:
- x 2.13 - 2.22 : Static library : Newlib (libc.a)
- x 5.67 - 6.09 : Static library : GNU libc (libc.a)
- x 11.72 - 12.04 : Shared library : OpenSSL (libcrypto.so)
- x 96.29 : Shared library : LLVM 14 (libLLVM-14.so)
opcodes/ChangeLog:
* disassemble.c (disassemble_free_target): Call new
disassemble_free_riscv function to free the memory.
* disassemble.h (disassemble_free_riscv): Declare.
* riscv-dis.c (struct riscv_mapping_sym): Separate structure to
represent a mapping symbol and ISA string corresponding to it.
(struct riscv_private_data): Add mapping symbol-related fields.
Add is_elf_with_mapsyms.
(last_map_symbol, last_stop_offset): Remove. The role is replaced
by riscv_private_data.{last_mapping_sym,expected_next_addr}.
(from_last_map_symbol): Remove as this is no longer required with
the new design.
(init_riscv_dis_private_data): Initialize new fields. Filter
mapping symbols and make a separate mapping symbol table.
(compare_mapping_syms_without_section): New function to sort
mapping symbols when the current section is NULL.
(compare_mapping_syms_with_section): New function to sort mapping
symbols when the current section is not NULL.
(riscv_propagate_prev_arch_for_mapping_syms): New function to
propagate arch field to succeeding mapping "$x" symbols.
(init_riscv_dis_private_data_for_section): Reset last_mapping_sym.
Sort the mapping symbol table depending on the current section and
propagate arch field.
(riscv_get_map_state): Remove.
(riscv_search_mapping_sym): Do a binary search to update the
mapping state but without reinitializing the architecture here.
(riscv_search_mapping_symbol): Use riscv_search_mapping_sym to
do an optimized lookup. Reuse the last mapping symbol if able.
Use is_elf_with_mapsyms to determine whether the object is an ELF
one with mapping symbols.
(riscv_data_length): Use last_mapping_sym instead of
last_map_symbol.
(print_insn_riscv): Add a comment. Update the architecture if the
suitable mapping symbol in the table has a non-default one.
Update expected_next_addr here.
(disassemble_free_riscv): Free the mapping symbol table.