@a4lg a4lg commented Nov 17, 2022

@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from 27bf225 to fa6aaa9 Compare November 18, 2022 00:12
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 444fbaa to b83e54a Compare November 18, 2022 04:58
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from fa6aaa9 to 90cbce7 Compare November 18, 2022 04:58
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from b83e54a to 9fdf0f8 Compare November 19, 2022 02:38
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 3 times, most recently from ff6d0c3 to 8f73fe2 Compare November 19, 2022 11:57
@a4lg a4lg added the enhancement New feature or request label Nov 19, 2022
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from 8f73fe2 to d41edfb Compare November 20, 2022 01:02
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 844db36 to f3378df Compare November 21, 2022 02:21
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 2 times, most recently from 3af33e2 to a7983af Compare November 22, 2022 01:32
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from a452e06 to 515d02e Compare November 22, 2022 09:01
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 2 times, most recently from 33a34e5 to 31e4c81 Compare November 23, 2022 03:04
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 29236db to 3ddff53 Compare November 23, 2022 08:44
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 2 times, most recently from 176ede7 to e887d6f Compare November 24, 2022 03:38
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 29c84f1 to e6a6657 Compare November 24, 2022 05:29
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 2 times, most recently from e1402d5 to a9eeb46 Compare November 24, 2022 09:07
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 521e646 to 8c9fc20 Compare November 25, 2022 00:41
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 4 times, most recently from f1b6ec2 to 412f6fb Compare November 26, 2022 00:28
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 0e92186 to e679d61 Compare November 27, 2022 09:10
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 2 times, most recently from 8b63344 to afff826 Compare November 27, 2022 09:42
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 5a6b1a0 to 524c00d Compare November 28, 2022 01:04
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from 4d1b2e2 to 89d9259 Compare August 3, 2023 00:07
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from e1a793e to 97be487 Compare August 3, 2023 03:09
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from 89d9259 to 0503ae0 Compare August 3, 2023 03:09
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 97be487 to 5da6585 Compare August 3, 2023 05:59
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 2 times, most recently from 19b5c6a to fba1e6e Compare August 6, 2023 01:46
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch 2 times, most recently from f8d4788 to d08c88a Compare August 7, 2023 01:23
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 5 times, most recently from 556828e to b8f7218 Compare August 11, 2023 01:53
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 6921cd1 to ebdb18b Compare August 11, 2023 03:56
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 3 times, most recently from 33a5cee to f298147 Compare August 15, 2023 07:27
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 491c911 to f1b6d8a Compare September 3, 2023 02:26
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch 4 times, most recently from 6de3f96 to 846b59c Compare September 7, 2023 09:35
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 37af5cd to 84734db Compare October 16, 2023 09:05
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from 846b59c to 6643949 Compare October 16, 2023 09:05
@a4lg a4lg force-pushed the riscv-dis-opt1-1-hashtable-and-caching branch from 84734db to 30b8505 Compare October 19, 2023 03:17
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from 6643949 to 057a7fc Compare October 19, 2023 03:17
a4lg added 3 commits October 19, 2023 06:58
Before further optimization, we can simplify the function
riscv_search_mapping_symbol a bit for clarity.

opcodes/ChangeLog:

	* riscv-dis.c (riscv_search_mapping_symbol): Make MAP_INSN the
	default, considering major use cases.  Remove the assignment to
	found here, as no one uses the value afterwards.  memaddr cannot
	be negative, so simplify and change the comment.

Idea-by: Nelson Chu <[email protected]>

This is one more preparatory step for the mapping symbol optimization.  It
adds a separate function that is called whenever the section being
disassembled changes.

This commit enables the per-section state tracking required by the next
optimization ("RISC-V: Optimized search on mapping symbols").
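
A minimal sketch of the section-change tracking described above.  The types
and function names here (struct section, struct dis_state,
init_state_for_section, disassemble_one) are hypothetical stand-ins and are
not the actual riscv-dis.c definitions; only the last_section idea is taken
from this commit.

```c
#include <stddef.h>

/* Hypothetical stand-in for a BFD section.  */
struct section
{
  const char *name;
};

/* Hypothetical stand-in for the disassembler's private data.  */
struct dis_state
{
  const struct section *last_section; /* Section seen by the previous call.  */
  int section_inits;                  /* Number of per-section resets.  */
};

/* Called once each time the section to disassemble changes.  Any
   per-section state would be rebuilt here.  */
static void
init_state_for_section (struct dis_state *st, const struct section *sec)
{
  st->last_section = sec;
  st->section_inits++;
}

/* Entry point sketch: detect a section change before decoding.  */
static void
disassemble_one (struct dis_state *st, const struct section *sec)
{
  if (st->last_section != sec)
    init_state_for_section (st, sec);
  /* ... decode and print one instruction here ...  */
}
```

Comparing section pointers keeps the check cheap on the common path, where
consecutive calls stay within the same section.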

opcodes/ChangeLog:

	* riscv-dis.c (struct riscv_private_data): Add last_section.
	(init_riscv_dis_private_data): Initialize last_section.
	(init_riscv_dis_private_data_for_section): New function.  Update
	last_section here.
	(print_insn_riscv): Track section changes.

For ELF files with many symbols and/or sections (static libraries, partially
linked files [e.g. vmlinux.o] or large object files), looking up the suitable
mapping symbol drastically slows down the disassembler.

This is because:

-   The disassembler used an inefficient linear search to find the suitable
    mapping symbol,
-   symtab_pos is not always a good hint for a forward linear search and
-   The symbol table accessible by the disassembler is sorted by address and
    then section (not section, then address).

Together, these can make the mapping symbol search take O(n^2) time overall,
since each lookup for a given address may scan a large part of the table.
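
To illustrate the ordering issue in the last item: in a relocatable object,
many sections can start at address 0, so a table sorted by address first
interleaves symbols from unrelated sections.  The following is only a sketch
of a "section, then address" ordering; struct map_sym, its integer
section_id field and both function names are hypothetical and do not mirror
the actual riscv-dis.c code.

```c
#include <stdlib.h>

/* Hypothetical mapping symbol record.  */
struct map_sym
{
  int section_id;        /* Stand-in for a section pointer.  */
  unsigned long address; /* Address of the mapping symbol.  */
};

/* Order by section first, then by address within the section.  */
static int
compare_by_section_then_address (const void *ap, const void *bp)
{
  const struct map_sym *a = ap;
  const struct map_sym *b = bp;

  if (a->section_id != b->section_id)
    return a->section_id < b->section_id ? -1 : 1;
  if (a->address != b->address)
    return a->address < b->address ? -1 : 1;
  return 0;
}

/* Sort the table so that each section's symbols are contiguous.  */
static void
sort_map_syms (struct map_sym *syms, size_t n)
{
  qsort (syms, n, sizeof *syms, compare_by_section_then_address);
}
```

With this ordering, all symbols of one section form a contiguous,
address-sorted run, which is what a per-section search needs.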

This commit implements:

-   A binary search to look up the suitable mapping symbol (O(log(n)) time
    per lookup call, O(m + n*log(n)) time on initialization where n < m),
-   A separate mapping symbol table, sorted by section and then address
    (unless the section to disassemble is NULL),
-   A very short linear search, even faster than the binary search,
    when disassembling consecutive addresses (usually traverses only 1 or 2
    symbols; O(n) in the worst case, but this is only expected on adversarial
    samples) and
-   Efficient tracking of mapping symbols with an ISA string
    (by propagating the arch field of "$x+(arch)" to succeeding "$x" symbols).
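
The binary search and the arch propagation from the list above can be
sketched as follows.  This is an illustration under assumptions, not the
code in riscv-dis.c: struct map_sym, propagate_arch and find_map_sym are
hypothetical names, and the table passed in is assumed to hold the symbols
of a single section, already sorted by address.

```c
#include <stddef.h>

enum map_state { MAP_INSN, MAP_DATA };

/* Hypothetical mapping symbol record.  */
struct map_sym
{
  unsigned long address;
  enum map_state state;
  const char *arch; /* ISA string of "$x<arch>", or NULL otherwise.  */
};

/* Give each plain "$x" symbol the ISA string of the nearest preceding
   "$x<arch>" symbol, so a lookup never has to walk backwards.  */
static void
propagate_arch (struct map_sym *syms, size_t n)
{
  const char *arch = NULL;

  for (size_t i = 0; i < n; i++)
    {
      if (syms[i].state != MAP_INSN)
        continue;
      if (syms[i].arch != NULL)
        arch = syms[i].arch;
      else
        syms[i].arch = arch;
    }
}

/* Return the last symbol whose address is <= MEMADDR, or NULL if every
   symbol lies above MEMADDR.  Takes O(log n) comparisons.  */
static const struct map_sym *
find_map_sym (const struct map_sym *syms, size_t n, unsigned long memaddr)
{
  size_t lo = 0;
  size_t hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (syms[mid].address <= memaddr)
        lo = mid + 1;
      else
        hi = mid;
    }
  return lo == 0 ? NULL : &syms[lo - 1];
}
```

Propagating the ISA string once at initialization means the result of a
lookup already carries the effective architecture for that address.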

It also changes when the disassembler reuses the last mapping symbol.  This
commit only uses the last disassembled address to determine whether the last
mapping symbol should be reused.
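
One way to picture the reuse rule and the short linear search is the sketch
below.  It assumes the last mapping symbol is reused only when the requested
address equals the address expected after the previous instruction; the
exact condition in riscv-dis.c may differ.  The names struct dis_state,
lookup_consecutive and note_disassembled are hypothetical; only
expected_next_addr is taken from the ChangeLog.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapping symbol record.  */
struct map_sym
{
  unsigned long address;
};

/* Hypothetical disassembler state for one section.  */
struct dis_state
{
  const struct map_sym *syms;       /* Sorted by address.  */
  size_t count;
  size_t last;                      /* Index of the last symbol used.  */
  bool have_last;                   /* Whether LAST is valid.  */
  unsigned long expected_next_addr; /* End of the previous instruction.  */
};

/* If MEMADDR directly follows the previous instruction, walk forward
   from the last symbol (usually zero or one step) and return its index.
   Return -1 when the caller must fall back to a full search.  */
static long
lookup_consecutive (struct dis_state *st, unsigned long memaddr)
{
  if (!st->have_last || memaddr != st->expected_next_addr)
    return -1;

  size_t i = st->last;
  while (i + 1 < st->count && st->syms[i + 1].address <= memaddr)
    i++;
  st->last = i;
  return (long) i;
}

/* Record where the next consecutive instruction is expected.  */
static void
note_disassembled (struct dis_state *st, unsigned long memaddr,
                   unsigned int length)
{
  st->expected_next_addr = memaddr + length;
}
```

When the address is not the expected one, the sketch leaves the decision to
a full lookup such as a binary search over the table.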

This commit doesn't improve the disassembler performance much on regular
programs in general.  However, a >50% disassembler performance improvement
is expected on some files for which "RISC-V: Use faster hash table on
disassembling" was not effective enough.

opcodes/ChangeLog:

	* disassemble.c (disassemble_free_target): Call new
	disassemble_free_riscv function to free the memory.
	* disassemble.h (disassemble_free_riscv): Declare.
	* riscv-dis.c (struct riscv_mapping_sym): Separate structure to
	represent a mapping symbol and ISA string corresponding to it.
	(struct riscv_private_data): Add mapping symbol-related fields.
	Add is_elf_with_mapsyms.
	(last_map_symbol, last_stop_offset): Remove.  The role is replaced
	by riscv_private_data.{last_mapping_sym,expected_next_addr}.
	(from_last_map_symbol): Remove as this is no longer required with
	the new design.
	(init_riscv_dis_private_data): Initialize new fields.  Filter
	mapping symbols and make a separate mapping symbol table.
	(compare_mapping_syms_without_section): New function to sort
	mapping symbols when the current section is NULL.
	(compare_mapping_syms_with_section): New function to sort mapping
	symbols when the current section is not NULL.
	(riscv_propagate_prev_arch_for_mapping_syms): New function to
	propagate arch field to succeeding mapping "$x" symbols.
	(init_riscv_dis_private_data_for_section): Reset last_mapping_sym.
	Sort the mapping symbol table depending on the current section and
	propagate arch field.
	(riscv_get_map_state): Remove.
	(riscv_search_mapping_sym): Do a binary search to update the
	mapping state but without reinitializing the architecture here.
	(riscv_search_mapping_symbol): Use riscv_search_mapping_sym to
	do an optimized lookup.  Reuse the last mapping symbol if able.
	Use is_elf_with_mapsyms to determine whether the object is an ELF
	one with mapping symbols.
	(riscv_data_length): Use last_mapping_sym instead of
	last_map_symbol.
	(print_insn_riscv): Add a comment.  Update the architecture if the
	suitable mapping symbol in the table has a non-default one.
	Update expected_next_addr here.
	(disassemble_free_riscv): Free the mapping symbol table.
@a4lg a4lg force-pushed the riscv-dis-opt1-2-mapping branch from 057a7fc to c77bc7e Compare October 19, 2023 07:01