Skip to content

Commit 90452a4

Browse files
committed
RISC-V: Reorganize arch-related initialization and management
objdump reuses the disassembler function returned by the disassembler function. That's good and benefits well from various optimizations. However, by default, GDB (default_print_insn in gdb/arch-utils.c) assumes that the disassembler function logic is simple and calls that function for every instruction to be disassembled. This is clearly a waste of time because it probes BFD (ELF information) and re-initializes riscv_rps_dis for every instruction. After the support of mapping symbol with ISA string with commit 40f1a1a ("RISC-V: Output mapping symbols with ISA string."), this kind of per- instruction initialization of riscv_rps_dis can also occur on ELF files with mapping symbol + ISA string (in riscv_get_map_state function). It can be worse if the object is assembled using Binutils commit 0ce50fc ("RISC-V: Always generate mapping symbols at the start of the sections.") or later. To avoid repeated initialization, this commit - Caches the default / override architectures (in two "contexts") and - enables switching between them. riscv_dis_arch_context_t and new utility functions are defined for this purpose. We still have to read the ".riscv.attributes" section on every instruction on GDB but at least the most time-consuming part (updating the actual architecture from the architecture string) is avoided. Likewise, we still have to probe whether we have to update the architecture for every instruction with a mapping symbol with ISA string is suitable but at least we don't actually update the architecture unless necessary (either context itself or the ISA string of the current context is changed). This commit improves the disassembler performance well in those situations: - When long "disas" command is used on GDB - When ELF files with mapping symbols + ISA string is used This commit now implements new mapping symbol handling ("$x" means revert to the previous architecture [with "$x+arch"] if exists, otherwise revert to the default one usually read from an ELF attribute), the recent consensus made by Kito and Nelson. On the benchmark using GDB batch files, the author measured significant performance improvements (35-96% on various big RISC-V programs). Unfortunately, on interactive usecases of GDB, this improvement is rarely observable since we don't usually disassemble such a big chunk at once and the current disassembler is not very slow. On the benchmark using unstripped ELF files with mapping symbols + ISA string "$xrv...", performance improvements are significant and easily observable in the real world (150%-264% performance improvments). Aside from optimization, this commit, along with "RISC-V: Reorganize disassembler state initialization", makes state initialization clearer and makes further changes easier. Also, although not practical in the real world, this commit now allows multi-XLEN object disassembling if the object file has mapping symbols with ISA string and the machine is XLEN-neutral (e.g. objdump with "-m riscv" option). It may help testing Binutils / GAS. opcodes/ChangeLog: * riscv-dis.c (initial_default_arch): Special default architecture string which is handled separately. (riscv_subsets): Remove as it is replaced to a member of the disassembler context. (riscv_dis_arch_context_t): New type to manage RISC-V architecture context for the disassembler. Two instance of this type is defined in this file - "default" and "override". (dis_arch_context_default): New. Architecture context inferred from either an ELF attribute or initial_default_arch. (dis_arch_context_override): New. Architecture context inferred from mapping symbols with ISA string. (dis_arch_context_current): New. A pointer to either dis_arch_context_default or dis_arch_context_override. (riscv_rps_dis): Add summary. Use initial values from dis_arch_context_default. (from_last_map_symbol): Make it file scope to decide whether we should revert the architecture to the default in riscv_get_map_state function. (set_riscv_current_dis_arch_context): New function to update riscv_rps_dis and dis_arch_context_current. (set_riscv_dis_arch_context): New function to update the architecture for the given context. (free_riscv_dis_arch_context): New function to free memory. (update_riscv_dis_xlen): Consider dis_arch_context_current->xlen when guessing correct XLEN. (is_arch_changed): New. Set to true if the architecture is changed. (init_riscv_dis_state_for_arch): New function to track whether the architecture string is changed. (init_riscv_dis_state_for_arch_and_options): Keep track of the architecture string change and update XLEN if it has changed. (update_riscv_dis_arch): New function to set both the architecture and the context. Call initialization functions if needed. (riscv_get_map_state): Add update argument. Keep track of the mapping symbols with ISA string and update the architecture and the context if required. (riscv_search_mapping_symbol): Move from_last_map_symbol to file scope. Call riscv_get_map_state function with architecture and context updates enabled. (riscv_data_length): Call riscv_get_map_state function with architecture and context updates disabled. (riscv_get_disassembler): Add an error handling on Tag_RISCV_arch. Call update_riscv_dis_arch function to update the architecture and the context. (disassemble_free_riscv) Free disassembler context memory. Co-developed-by: Nelson Chu <[email protected]>
1 parent 66d8a9d commit 90452a4

File tree

1 file changed

+165
-23
lines changed

1 file changed

+165
-23
lines changed

opcodes/riscv-dis.c

Lines changed: 165 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,9 @@
3232
#include <stdint.h>
3333
#include <ctype.h>
3434

35+
/* Default architecture string (if not available). */
36+
static const char *const initial_default_arch = "rv64gc";
37+
3538
/* Current XLEN for the disassembler. */
3639
static unsigned xlen = 0;
3740

@@ -48,14 +51,36 @@ static enum riscv_spec_class default_isa_spec = ISA_SPEC_CLASS_DRAFT - 1;
4851
(as specified by the ELF attributes or the `priv-spec' option). */
4952
static enum riscv_spec_class default_priv_spec = PRIV_SPEC_CLASS_NONE;
5053

51-
static riscv_subset_list_t riscv_subsets;
54+
/* RISC-V disassembler architecture context type. */
55+
typedef struct
56+
{
57+
const char *arch_str;
58+
const char *default_arch;
59+
riscv_subset_list_t subsets;
60+
unsigned xlen;
61+
bool no_xlen_if_default;
62+
} riscv_dis_arch_context_t;
63+
64+
/* Context: default (either initial_default_arch or ELF attribute). */
65+
static riscv_dis_arch_context_t dis_arch_context_default
66+
= { NULL, initial_default_arch, {}, 0, true };
67+
68+
/* Context: override (mapping symbols with ISA string). */
69+
static riscv_dis_arch_context_t dis_arch_context_override
70+
= { NULL, NULL, {}, 0, false };
71+
72+
/* Pointer to the current disassembler architecture context. */
73+
static riscv_dis_arch_context_t *dis_arch_context_current
74+
= &dis_arch_context_default;
75+
76+
/* RISC-V ISA string parser structure (current). */
5277
static riscv_parse_subset_t riscv_rps_dis =
5378
{
54-
&riscv_subsets, /* subset_list. */
55-
opcodes_error_handler,/* error_handler. */
56-
&xlen, /* xlen. */
57-
&default_isa_spec, /* isa_spec. */
58-
false, /* check_unknown_prefixed_ext. */
79+
&(dis_arch_context_default.subsets), /* subset_list. */
80+
opcodes_error_handler, /* error_handler. */
81+
&(dis_arch_context_default.xlen), /* xlen. */
82+
&default_isa_spec, /* isa_spec. */
83+
false, /* check_unknown_prefixed_ext. */
5984
};
6085

6186
/* Private data structure for the RISC-V disassembler. */
@@ -74,6 +99,7 @@ static bfd_vma last_stop_offset = 0;
7499
static bfd_vma last_map_symbol_boundary = 0;
75100
static enum riscv_seg_mstate last_map_state = MAP_NONE;
76101
static asection *last_map_section = NULL;
102+
static bool from_last_map_symbol = false;
77103

78104
/* Register names as used by the disassembler. */
79105
static const char * const *riscv_gpr_names;
@@ -86,6 +112,70 @@ static bool no_aliases = false;
86112
static bool is_numeric = false;
87113

88114

115+
/* Set current disassembler context (dis_arch_context_current).
116+
Return true if successfully updated. */
117+
118+
static bool
119+
set_riscv_current_dis_arch_context (riscv_dis_arch_context_t* context)
120+
{
121+
if (context == dis_arch_context_current)
122+
return false;
123+
dis_arch_context_current = context;
124+
riscv_rps_dis.subset_list = &(context->subsets);
125+
riscv_rps_dis.xlen = &(context->xlen);
126+
return true;
127+
}
128+
129+
/* Update riscv_dis_arch_context_t by an ISA string.
130+
Return true if the architecture is updated by arch. */
131+
132+
static bool
133+
set_riscv_dis_arch_context (riscv_dis_arch_context_t *context,
134+
const char *arch)
135+
{
136+
/* Check whether the architecture is changed and
137+
return false if the architecture will not be changed. */
138+
if (context->arch_str)
139+
{
140+
if (context->default_arch && arch == context->default_arch)
141+
return false;
142+
if (strcmp (context->arch_str, arch) == 0)
143+
return false;
144+
}
145+
/* Update architecture string. */
146+
if (context->arch_str != context->default_arch)
147+
free ((void *) context->arch_str);
148+
context->arch_str = (arch != context->default_arch)
149+
? xstrdup (arch)
150+
: context->default_arch;
151+
/* Update other contents (subset list and XLEN). */
152+
riscv_subset_list_t *prev_subsets = riscv_rps_dis.subset_list;
153+
unsigned *prev_xlen = riscv_rps_dis.xlen;
154+
riscv_rps_dis.subset_list = &(context->subsets);
155+
riscv_rps_dis.xlen = &(context->xlen);
156+
context->xlen = 0;
157+
riscv_release_subset_list (&context->subsets);
158+
riscv_parse_subset (&riscv_rps_dis, context->arch_str);
159+
riscv_rps_dis.subset_list = prev_subsets;
160+
riscv_rps_dis.xlen = prev_xlen;
161+
/* Special handling on the default architecture. */
162+
if (context->no_xlen_if_default && arch == context->default_arch)
163+
context->xlen = 0;
164+
return true;
165+
}
166+
167+
/* Free memory allocated for riscv_dis_arch_context_t. */
168+
169+
static void
170+
free_riscv_dis_arch_context (riscv_dis_arch_context_t* context)
171+
{
172+
riscv_release_subset_list (&context->subsets);
173+
if (context->default_arch != context->arch_str)
174+
free ((void *) context->arch_str);
175+
context->arch_str = NULL;
176+
}
177+
178+
89179
/* Guess and update current XLEN. */
90180

91181
static void
@@ -98,9 +188,13 @@ update_riscv_dis_xlen (struct disassemble_info *info)
98188
This is only effective if XLEN-specific BFD machine architecture is
99189
chosen. If XLEN-neutral (like riscv), BFD machine architecture is
100190
ignored on XLEN selection.
101-
2. ELF class in dummy ELF header. */
191+
2. Non-default RISC-V architecture string set by either an ELF
192+
attribute or a mapping symbol with ISA string.
193+
3. ELF class in dummy ELF header. */
102194
if (xlen_by_mach != 0)
103195
xlen = xlen_by_mach;
196+
else if (dis_arch_context_current->xlen != 0)
197+
xlen = dis_arch_context_current->xlen;
104198
else if (xlen_by_elf != 0)
105199
xlen = xlen_by_elf;
106200
else if (info != NULL && info->section != NULL)
@@ -110,18 +204,50 @@ update_riscv_dis_xlen (struct disassemble_info *info)
110204
}
111205
}
112206

207+
/* Initialization (for arch). */
208+
209+
static bool is_arch_changed = false;
210+
211+
static void
212+
init_riscv_dis_state_for_arch (void)
213+
{
214+
is_arch_changed = true;
215+
}
216+
113217
/* Initialization (for arch and options). */
114218

115219
static void
116220
init_riscv_dis_state_for_arch_and_options (void)
117221
{
222+
/* If the architecture string is changed, update XLEN. */
223+
if (is_arch_changed)
224+
update_riscv_dis_xlen (NULL);
118225
/* Set GPR register names to disassemble. */
119226
riscv_gpr_names = is_numeric ? riscv_gpr_names_numeric : riscv_gpr_names_abi;
120227
/* Set FPR register names to disassemble. */
121228
riscv_fpr_names
122229
= !riscv_subset_supports (&riscv_rps_dis, "zfinx")
123230
? (is_numeric ? riscv_fpr_names_numeric : riscv_fpr_names_abi)
124231
: riscv_gpr_names;
232+
/* Save previous options and mark them "unchanged". */
233+
is_arch_changed = false;
234+
}
235+
236+
/* Update architecture for disassembler with its context.
237+
Call initialization functions if either:
238+
- the architecture for current context is changed or
239+
- context is updated to a new one. */
240+
241+
static void
242+
update_riscv_dis_arch (riscv_dis_arch_context_t *context, const char *arch)
243+
{
244+
if ((set_riscv_dis_arch_context (context, arch)
245+
&& dis_arch_context_current == context)
246+
|| set_riscv_current_dis_arch_context (context))
247+
{
248+
init_riscv_dis_state_for_arch ();
249+
init_riscv_dis_state_for_arch_and_options ();
250+
}
125251
}
126252

127253

@@ -897,7 +1023,8 @@ riscv_get_map_state_by_name (const char *name, const char** arch)
8971023
static bool
8981024
riscv_get_map_state (int n,
8991025
enum riscv_seg_mstate *state,
900-
struct disassemble_info *info)
1026+
struct disassemble_info *info,
1027+
bool update)
9011028
{
9021029
const char *name, *arch = NULL;
9031030

@@ -911,12 +1038,25 @@ riscv_get_map_state (int n,
9111038
if (newstate == MAP_NONE)
9121039
return false;
9131040
*state = newstate;
914-
if (arch)
915-
{
916-
riscv_release_subset_list (&riscv_subsets);
917-
riscv_parse_subset (&riscv_rps_dis, arch);
918-
init_riscv_dis_state_for_arch_and_options ();
919-
}
1041+
if (newstate == MAP_INSN && update)
1042+
{
1043+
if (arch)
1044+
{
1045+
/* Override the architecture. */
1046+
update_riscv_dis_arch (&dis_arch_context_override, arch);
1047+
}
1048+
else if (!from_last_map_symbol
1049+
&& set_riscv_current_dis_arch_context (&dis_arch_context_default))
1050+
{
1051+
/* Revert to the default architecture and call init functions if:
1052+
- there's no ISA string in the mapping symbol,
1053+
- mapping symbol is not reused and
1054+
- current disassembler context is changed to the default one.
1055+
This is a shortcut path to avoid full update_riscv_dis_arch. */
1056+
init_riscv_dis_state_for_arch ();
1057+
init_riscv_dis_state_for_arch_and_options ();
1058+
}
1059+
}
9201060
return true;
9211061
}
9221062

@@ -928,7 +1068,6 @@ riscv_search_mapping_symbol (bfd_vma memaddr,
9281068
struct disassemble_info *info)
9291069
{
9301070
enum riscv_seg_mstate mstate;
931-
bool from_last_map_symbol;
9321071
bool found = false;
9331072
int symbol = -1;
9341073
int n;
@@ -976,7 +1115,7 @@ riscv_search_mapping_symbol (bfd_vma memaddr,
9761115
/* We have searched all possible symbols in the range. */
9771116
if (addr > memaddr)
9781117
break;
979-
if (riscv_get_map_state (n, &mstate, info))
1118+
if (riscv_get_map_state (n, &mstate, info, true))
9801119
{
9811120
symbol = n;
9821121
found = true;
@@ -1003,7 +1142,7 @@ riscv_search_mapping_symbol (bfd_vma memaddr,
10031142
if (addr < (info->section ? info->section->vma : 0))
10041143
break;
10051144
/* Stop searching once we find the closed mapping symbol. */
1006-
if (riscv_get_map_state (n, &mstate, info))
1145+
if (riscv_get_map_state (n, &mstate, info, true))
10071146
{
10081147
symbol = n;
10091148
found = true;
@@ -1069,7 +1208,7 @@ riscv_data_length (bfd_vma memaddr,
10691208
{
10701209
bfd_vma addr = bfd_asymbol_value (info->symtab[n]);
10711210
if (addr > memaddr
1072-
&& riscv_get_map_state (n, &m, info))
1211+
&& riscv_get_map_state (n, &m, info, false))
10731212
{
10741213
if (addr - memaddr < length)
10751214
length = addr - memaddr;
@@ -1228,7 +1367,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
12281367
disassembler_ftype
12291368
riscv_get_disassembler (bfd *abfd)
12301369
{
1231-
const char *default_arch = "rv64gc";
1370+
const char *default_arch = initial_default_arch;
12321371

12331372
if (abfd && bfd_get_flavour (abfd) == bfd_target_elf_flavour)
12341373
{
@@ -1244,12 +1383,14 @@ riscv_get_disassembler (bfd *abfd)
12441383
attr[Tag_c].i,
12451384
&default_priv_spec);
12461385
default_arch = attr[Tag_RISCV_arch].s;
1386+
/* For ELF files with (somehow) no architecture string
1387+
in the attributes, use the default value. */
1388+
if (!default_arch)
1389+
default_arch = initial_default_arch;
12471390
}
12481391
}
12491392

1250-
riscv_release_subset_list (&riscv_subsets);
1251-
riscv_parse_subset (&riscv_rps_dis, default_arch);
1252-
init_riscv_dis_state_for_arch_and_options ();
1393+
update_riscv_dis_arch (&dis_arch_context_default, default_arch);
12531394
return print_insn_riscv;
12541395
}
12551396

@@ -1447,5 +1588,6 @@ with the -M switch (multiple options should be separated by commas):\n"));
14471588

14481589
void disassemble_free_riscv (struct disassemble_info *info ATTRIBUTE_UNUSED)
14491590
{
1450-
riscv_release_subset_list (&riscv_subsets);
1591+
free_riscv_dis_arch_context (&dis_arch_context_default);
1592+
free_riscv_dis_arch_context (&dis_arch_context_override);
14511593
}

0 commit comments

Comments
 (0)