Commit 87e0992

feat(backends): add binutils generator for RISC-V opcode tables
Implement a comprehensive binutils generator that creates binutils-compatible opcode table entries from RISC-V UDB instruction definitions. The generator produces C source files and headers following the binutils-gdb format used in opcodes/riscv-opc.c.

Key features:
- Extension-aware instruction class mapping with built-in and custom support
- Operand format conversion from UDB assembly syntax to binutils format
- MATCH/MASK constant generation with proper bit manipulation
- Configurable architecture support (RV32, RV64, BOTH)
- Comprehensive documentation and integration guidelines

Components:
- binutils_generator.py: Main generator with CLI interface
- binutils_parser.py: Core parsing and conversion logic
- extract_riscv_operand_bits.py: Operand bit extraction utilities
- insn_class_config.py: Extension to instruction class mappings
- naming_config.py: User-defined naming and preference configurations
- README.md: Comprehensive documentation and usage examples

Signed-off-by: Afonso Oliveira <[email protected]>
1 parent b5d25b9 commit 87e0992

6 files changed: +1303 -0 lines changed

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
# Binutils RISC-V Generator

This generator creates binutils-compatible opcode table entries from RISC-V UDB instruction definitions, following the format used in `binutils-gdb/opcodes/riscv-opc.c`.

## Generated Files

The generator produces two files for every run:

### 1. Opcode Table (`.c` file)
- **Format**: `{output_name}.c`
- **Purpose**: Contains the `riscv_opcodes[]` array with instruction definitions
- **Structure**: Each entry follows binutils format: `{name, xlen, insn_class, operands, MATCH, MASK, match_func, pinfo}`
- **Example**: `{"add", 0, INSN_CLASS_I, "d,s,t", MATCH_ADD, MASK_ADD, match_opcode, 0}`
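
To make the entry layout concrete, here is a minimal sketch that renders one table row from its fields. `OpcodeEntry` and `format_entry` are hypothetical names used only for illustration, and the rule for deriving the `MATCH_*`/`MASK_*` macro names (upper-case the mnemonic, replace `.` with `_`) is an assumption rather than the generator's documented behavior.

```python
from dataclasses import dataclass


@dataclass
class OpcodeEntry:
    """One riscv_opcodes[] row; fields follow the structure line above."""
    name: str
    xlen: int                          # 0 means no XLEN restriction
    insn_class: str
    operands: str
    match_func: str = "match_opcode"
    pinfo: str = "0"


def format_entry(entry: OpcodeEntry) -> str:
    """Render the entry as a C initializer (hypothetical helper)."""
    # Assumed macro naming: "c.add" would become MATCH_C_ADD / MASK_C_ADD.
    macro = entry.name.upper().replace(".", "_")
    return (
        f'{{"{entry.name}", {entry.xlen}, {entry.insn_class}, '
        f'"{entry.operands}", MATCH_{macro}, MASK_{macro}, '
        f"{entry.match_func}, {entry.pinfo}}}"
    )


line = format_entry(OpcodeEntry("add", 0, "INSN_CLASS_I", "d,s,t"))
print(line)
```

For the `add` instruction this reproduces the example row shown in the list above.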

### 2. Header File (`.h` file)
- **Format**: `{output_name}.h`
- **Purpose**: Contains `#define` constants and custom instruction class definitions
- **Contents**:
  - `MATCH_*` constants for instruction matching
  - `MASK_*` constants for instruction masking
  - Custom `INSN_CLASS_*` definitions (commented out for manual addition to binutils)
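
The relationship between an encoding and its `MATCH_*`/`MASK_*` pair can be sketched as follows. This assumes the encoding is available as a bit-pattern string in which `0` and `1` are fixed bits and `-` marks operand bits. `match_and_mask` and `header_defines` are hypothetical helpers, not functions taken from the generator.

```python
def match_and_mask(pattern: str) -> tuple[int, int]:
    """Derive (MATCH, MASK) from a pattern written most significant bit first.

    MASK has a 1 wherever the encoding fixes a bit; MATCH holds the value
    of those fixed bits and 0 in every operand position.
    """
    match = mask = 0
    for ch in pattern:
        match <<= 1
        mask <<= 1
        if ch in "01":
            mask |= 1
            match |= int(ch)
        elif ch != "-":
            raise ValueError(f"unexpected character {ch!r} in pattern")
    return match, mask


def header_defines(name: str, pattern: str) -> str:
    """Format the two #define lines for one instruction (assumed naming)."""
    match, mask = match_and_mask(pattern)
    macro = name.upper().replace(".", "_")
    return f"#define MATCH_{macro} {match:#x}\n#define MASK_{macro} {mask:#x}"


# R-type `add`: funct7=0000000, rs2, rs1, funct3=000, rd, opcode=0110011
print(header_defines("add", "0000000----------000-----0110011"))
```

An instruction word `w` then matches when `(w & MASK) == MATCH`, which is the test the opcode table relies on.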

## Architecture

### Single Source of Truth
All instruction class mappings are centralized in `insn_class_config.py`:

- **`BUILTIN_CLASSES`**: Maps extensions to existing binutils instruction classes
- **`BUILTIN_COMBINATIONS`**: Maps complex extension combinations to binutils classes
- **`is_builtin_class()`**: Determines if a class already exists in binutils
32+
### Extension Mapping
33+
The `ExtensionMapper` class handles UDB `definedBy` specifications:
34+
35+
- **Simple extensions**: Direct 1:1 mapping (e.g., `Zba``INSN_CLASS_ZBA`)
36+
- **Complex combinations**:
37+
- `anyOf``*_OR_*` classes (e.g., `[Zbb, Zbkb]``INSN_CLASS_ZBB_OR_ZBKB`)
38+
- `allOf``*_AND_*` classes (e.g., `[Zcb, Zba]``INSN_CLASS_ZCB_AND_ZBA`)
39+
- **Custom extensions**: Auto-generates class names (e.g., `Zfoo``INSN_CLASS_ZFOO`)
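
The naming rules above can be sketched in a few lines. This is an illustration only: `insn_class_for` is a hypothetical function, it assumes `definedBy` arrives either as a plain extension name or as a one-key mapping (`anyOf` or `allOf`) to a list of names, and it leaves out the lookup in the built-in tables that `ExtensionMapper` performs.

```python
def insn_class_for(defined_by) -> str:
    """Build an INSN_CLASS_* name from a simplified definedBy value."""
    if isinstance(defined_by, str):
        # Simple or custom extension: direct 1:1 name.
        return f"INSN_CLASS_{defined_by.upper()}"
    if isinstance(defined_by, dict) and len(defined_by) == 1:
        ((key, members),) = defined_by.items()
        joiner = {"anyOf": "_OR_", "allOf": "_AND_"}.get(key)
        if joiner and all(isinstance(m, str) for m in members):
            return "INSN_CLASS_" + joiner.join(m.upper() for m in members)
    raise ValueError(f"unsupported definedBy shape: {defined_by!r}")


print(insn_class_for("Zba"))
print(insn_class_for({"anyOf": ["Zbb", "Zbkb"]}))
print(insn_class_for({"allOf": ["Zcb", "Zba"]}))
```

Nested combinations (for example an `allOf` inside an `anyOf`) are deliberately rejected here; how the real mapper treats them is not covered by this README.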

### Operand Mapping
The `OperandMapper` class converts UDB assembly format to binutils operand strings:

- Maps register operands (e.g., `xd` → `d`, `xs1` → `s`)
- Handles immediate operands (e.g., `imm` → `j`)
- Marks unknown patterns as `NON_DEFINED_*`
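
A minimal sketch of this conversion, assuming the UDB assembly string is a comma-separated operand list such as `xd, xs1, xs2`. The table holds only the mappings named above plus `xs2` → `t`, which is an assumption inferred from the `"d,s,t"` operand string in the `add` example. `to_binutils_operands` and the exact `NON_DEFINED_` suffix are hypothetical.

```python
# Mappings taken from the list above; "xs2" is an assumed addition.
OPERAND_MAP = {"xd": "d", "xs1": "s", "xs2": "t", "imm": "j"}


def to_binutils_operands(assembly: str) -> str:
    """Convert a UDB assembly operand list to a binutils operand string."""
    tokens = [tok.strip() for tok in assembly.split(",") if tok.strip()]
    # Unknown operands are kept visible instead of being silently dropped.
    return ",".join(
        OPERAND_MAP.get(tok, f"NON_DEFINED_{tok.upper()}") for tok in tokens
    )


print(to_binutils_operands("xd, xs1, xs2"))
print(to_binutils_operands("xd, xs1, imm"))
print(to_binutils_operands("vd, xs1"))
```

Keeping the unknown operand in the output makes it easy to search the generated table for entries that still need a manual mapping.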

## Usage

### Basic Usage
```bash
python3 binutils_generator.py --extensions=I,M,Zba,Zbb --output=my_opcodes.c
```

### Command Line Options
- `--inst-dir`: Directory containing instruction YAML files (default: `../../../spec/std/isa/inst/`)
- `--output`: Output C file name (the corresponding `.h` file is generated automatically)
- `--extensions`: Comma-separated list of enabled extensions
- `--arch`: Target architecture (`RV32`, `RV64`, `BOTH`)
- `--include-all` / `-a`: Include all instructions, ignoring extension filtering
- `--verbose` / `-v`: Enable verbose logging
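
For readers who want to see how these options fit together, the following is a rough `argparse` sketch of an equivalent command line. It is not the parser in `binutils_generator.py`: making `--output` required, defaulting `--arch` to `BOTH`, and deriving the header name by swapping the file suffix are all assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Assumed reconstruction of the documented options."""
    parser = argparse.ArgumentParser(
        description="Generate binutils opcode tables from UDB definitions"
    )
    parser.add_argument("--inst-dir", default="../../../spec/std/isa/inst/")
    parser.add_argument("--output", required=True)            # assumption
    parser.add_argument("--extensions", default="")
    parser.add_argument("--arch", choices=["RV32", "RV64", "BOTH"],
                        default="BOTH")                       # assumed default
    parser.add_argument("--include-all", "-a", action="store_true")
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--extensions=I,M,Zba,Zbb", "--output=my_opcodes.c"]
)
enabled = [ext for ext in args.extensions.split(",") if ext]
header_path = Path(args.output).with_suffix(".h")  # my_opcodes.c -> my_opcodes.h
print(enabled, header_path)
```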

### Examples
```bash
# Generate for specific extensions
python3 binutils_generator.py --extensions=I,M,A,F,D --output=rv64_core.c

# Generate all instructions
python3 binutils_generator.py --include-all --output=complete_riscv.c

# Custom extension
python3 binutils_generator.py --extensions=I,MyCustomExt --output=custom.c
```

## Integration with Binutils

### Adding Custom Instruction Classes

1. **Review generated header file**: Check the "Custom instruction class definitions" section
2. **Add to binutils enum**: Edit `binutils-gdb/include/opcode/riscv.h`
   ```c
   enum riscv_insn_class
   {
     // ... existing classes ...
     INSN_CLASS_ZFOO,  // Add your custom classes here
     INSN_CLASS_I_OR_ZILSD,
     // ...
   };
   ```

3. **Add subset support**: Edit `binutils-gdb/bfd/elfxx-riscv.c` to handle extension requirements
   ```c
   /* Not static: this function is also called from gas and opcodes. */
   bool
   riscv_multi_subset_supports (riscv_parse_subset_t *rps,
                                enum riscv_insn_class insn_class)
   {
     switch (insn_class)
       {
       // ... existing cases ...
       case INSN_CLASS_ZFOO:
         return riscv_subset_supports (rps, "zfoo");
       // ...
       }
   }
   ```

### Adding Generated Opcodes

1. **Include header**: Add `#include "my_opcodes.h"` to your opcode file
2. **Merge opcode arrays**:
   - Option A: Replace existing `riscv_opcodes[]` in `opcodes/riscv-opc.c`
   - Option B: Create separate opcode table and modify binutils to use it
   - Option C: Append entries to existing table

### File Locations in Binutils
- **Instruction classes**: `include/opcode/riscv.h` (enum `riscv_insn_class`)
- **Opcode tables**: `opcodes/riscv-opc.c` (`riscv_opcodes[]` array)
- **Extension support**: `bfd/elfxx-riscv.c` (`riscv_multi_subset_supports()`)
- **Operand parsing**: `opcodes/riscv-dis.c` and `gas/config/tc-riscv.c`

## Extending the Generator

### Adding New Extensions
```python
from extension_mapper import ExtensionMapper

mapper = ExtensionMapper()
mapper.add_simple_mapping('Zfoo', 'INSN_CLASS_ZFOO')
mapper.add_complex_mapping('anyOf', ['Zfoo', 'Zbar'], 'INSN_CLASS_ZFOO_OR_ZBAR')
```

### Custom Operand Mappings
Edit `operand_mapper.py` to add support for new operand patterns.

### Configuration
Edit `insn_class_config.py` to modify built-in extension mappings.

## Output Statistics

The generator provides detailed statistics:
- **Total instructions**: Number of instructions processed
- **Successfully processed**: Instructions with complete mappings
- **Non-defined operands**: Instructions with unknown operand patterns
- **Non-defined extensions**: Instructions with unknown extensions (should be 0 with current design)
- **Custom classes**: Number of custom instruction classes generated

## Validation

Use `validate_output.py` to compare generated output against reference binutils opcodes:
```bash
python3 validate_output.py reference.c generated.c
```

The validator provides detailed comparison including instruction class, operands, and MATCH/MASK values.
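
The kind of comparison described here can be sketched as below. This is not the code of `validate_output.py`: `ENTRY_RE`, `parse_entries`, and `compare` are hypothetical, and the regular expression assumes each table row is written in the single-line form shown in the Generated Files section.

```python
import re

# Captures name, xlen, class, operands, MATCH and MASK from one table row.
ENTRY_RE = re.compile(
    r'\{\s*"(?P<name>[^"]+)"\s*,\s*(?P<xlen>\d+)\s*,\s*(?P<insn_class>\w+)\s*,'
    r'\s*"(?P<operands>[^"]*)"\s*,\s*(?P<match>\w+)\s*,\s*(?P<mask>\w+)\s*,'
)


def parse_entries(source: str) -> dict:
    """Map each mnemonic to the set of (class, operands, MATCH, MASK) rows."""
    entries = {}
    for m in ENTRY_RE.finditer(source):
        row = (m["insn_class"], m["operands"], m["match"], m["mask"])
        entries.setdefault(m["name"], set()).add(row)
    return entries


def compare(reference: str, generated: str) -> dict:
    """Return mnemonics whose rows differ, with both sides for inspection."""
    ref, gen = parse_entries(reference), parse_entries(generated)
    return {
        name: (ref.get(name, set()), gen.get(name, set()))
        for name in sorted(ref.keys() | gen.keys())
        if ref.get(name) != gen.get(name)
    }


reference = (
    '{"add", 0, INSN_CLASS_I, "d,s,t", MATCH_ADD, MASK_ADD, match_opcode, 0},\n'
    '{"sh1add", 0, INSN_CLASS_ZBA, "d,s,t", MATCH_SH1ADD, MASK_SH1ADD, match_opcode, 0},\n'
)
generated = reference.replace("INSN_CLASS_ZBA", "INSN_CLASS_ZBB")
diff = compare(reference, generated)
print(list(diff))
```

Grouping rows by mnemonic keeps aliases with several operand forms together, so a mismatch in one form shows up as a difference for that instruction and not as a missing entry.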