| 1 | +# Binutils RISC-V Generator |
| 2 | + |
| 3 | +This generator creates binutils-compatible opcode table entries from RISC-V UDB instruction definitions, following the format used in `binutils-gdb/opcodes/riscv-opc.c`. |
| 4 | + |
| 5 | +## Generated Files |
| 6 | + |
| 7 | +Each run of the generator produces two files: |
| 8 | + |
| 9 | +### 1. Opcode Table (`.c` file) |
| 10 | +- **Format**: `{output_name}.c` |
| 11 | +- **Purpose**: Contains the `riscv_opcodes[]` array with instruction definitions |
| 12 | +- **Structure**: Each entry follows the binutils `struct riscv_opcode` layout: `{name, xlen, insn_class, operands, MATCH, MASK, match_func, pinfo}` |
| 13 | +- **Example**: `{"add", 0, INSN_CLASS_I, "d,s,t", MATCH_ADD, MASK_ADD, match_opcode, 0}` |
| 14 | + |
| 15 | +### 2. Header File (`.h` file) |
| 16 | +- **Format**: `{output_name}.h` |
| 17 | +- **Purpose**: Contains `#define` constants and custom instruction class definitions |
| 18 | +- **Contents**: |
| 19 | + - `MATCH_*` constants for instruction matching |
| 20 | + - `MASK_*` constants for instruction masking |
| 21 | + - Custom `INSN_CLASS_*` definitions (commented out for manual addition to binutils) |
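The `MATCH_*` and `MASK_*` constants can be illustrated with a small sketch. It assumes the instruction encoding is available as a 32-character bit pattern in which `-` marks operand (don't-care) bits; the function names `match_mask` and `header_lines` are hypothetical and are not taken from the generator's source.

```python
def match_mask(pattern: str) -> tuple[int, int]:
    """Derive (match, mask) from a bit pattern such as '0000000-...-0110011'.

    Fixed bits ('0'/'1') set the mask bit and contribute their value to match;
    '-' bits are operand fields and stay zero in both. (Assumed input format.)
    """
    match = mask = 0
    for ch in pattern:
        match <<= 1
        mask <<= 1
        if ch in "01":
            mask |= 1
            match |= int(ch)
        elif ch != "-":
            raise ValueError(f"unexpected character {ch!r} in pattern")
    return match, mask


def header_lines(name: str, pattern: str) -> list[str]:
    """Format the two #define lines for one instruction (hypothetical helper)."""
    match, mask = match_mask(pattern)
    macro = name.upper().replace(".", "_")
    return [f"#define MATCH_{macro} {match:#x}", f"#define MASK_{macro} {mask:#x}"]


# R-type `add`: funct7 = 0000000, funct3 = 000, opcode = 0110011
print("\n".join(header_lines("add", "0000000----------000-----0110011")))
```

An instruction word `w` then decodes as `add` when `(w & MASK_ADD) == MATCH_ADD`, which is the check binutils' `match_opcode` performs.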
| 22 | + |
| 23 | +## Architecture |
| 24 | + |
| 25 | +### Single Source of Truth |
| 26 | +All instruction class mappings are centralized in `insn_class_config.py`: |
| 27 | + |
| 28 | +- **`BUILTIN_CLASSES`**: Maps extensions to existing binutils instruction classes |
| 29 | +- **`BUILTIN_COMBINATIONS`**: Maps complex extension combinations to binutils classes |
| 30 | +- **`is_builtin_class()`**: Determines if a class already exists in binutils |
| 31 | + |
| 32 | +### Extension Mapping |
| 33 | +The `ExtensionMapper` class handles UDB `definedBy` specifications: |
| 34 | + |
| 35 | +- **Simple extensions**: Direct 1:1 mapping (e.g., `Zba` → `INSN_CLASS_ZBA`) |
| 36 | +- **Complex combinations**: |
| 37 | + - `anyOf` → `*_OR_*` classes (e.g., `[Zbb, Zbkb]` → `INSN_CLASS_ZBB_OR_ZBKB`) |
| 38 | + - `allOf` → `*_AND_*` classes (e.g., `[Zcb, Zba]` → `INSN_CLASS_ZCB_AND_ZBA`) |
| 39 | +- **Custom extensions**: Auto-generates class names (e.g., `Zfoo` → `INSN_CLASS_ZFOO`) |
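The naming rules above can be sketched as follows. This is an illustration only: `class_name_for` is a hypothetical function, it assumes `definedBy` is either a plain extension name or a dict with a single `anyOf`/`allOf` list, and it does not consult `BUILTIN_CLASSES` or `BUILTIN_COMBINATIONS` the way `ExtensionMapper` presumably does.

```python
from typing import Union

# Assumed, simplified shape of a UDB `definedBy` value.
DefinedBy = Union[str, dict]


def class_name_for(defined_by: DefinedBy) -> str:
    """Build an INSN_CLASS_* name from a simplified definedBy value."""
    if isinstance(defined_by, str):
        # Simple extension: direct 1:1 mapping, e.g. Zba -> INSN_CLASS_ZBA
        return f"INSN_CLASS_{defined_by.upper()}"
    for key, joiner in (("anyOf", "_OR_"), ("allOf", "_AND_")):
        if key in defined_by:
            names = [ext.upper() for ext in defined_by[key]]
            return "INSN_CLASS_" + joiner.join(names)
    raise ValueError(f"unsupported definedBy value: {defined_by!r}")


print(class_name_for("Zba"))
print(class_name_for({"anyOf": ["Zbb", "Zbkb"]}))
print(class_name_for({"allOf": ["Zcb", "Zba"]}))
```

Names produced this way still need the built-in check described under Single Source of Truth to decide whether they already exist in binutils or must be added by hand.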
| 40 | + |
| 41 | +### Operand Mapping |
| 42 | +The `OperandMapper` class converts UDB assembly format to binutils operand strings: |
| 43 | + |
| 44 | +- Maps register operands (e.g., `xd` → `d`, `xs1` → `s`) |
| 45 | +- Handles immediate operands (e.g., `imm` → `j`) |
| 46 | +- Marks unknown patterns as `NON_DEFINED_*` |
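A minimal sketch of this conversion, assuming the UDB assembly string is a comma-separated operand list. The table and `map_operands` are hypothetical; only `xd`, `xs1`, and `imm` come from the list above, `xs2` → `t` is inferred from the `"d,s,t"` operand string of `add`, and the exact suffix used after `NON_DEFINED_` is an assumption.

```python
# Hypothetical lookup table; the real OperandMapper likely covers many more patterns.
OPERAND_MAP = {
    "xd": "d",   # destination register
    "xs1": "s",  # first source register
    "xs2": "t",  # second source register
    "imm": "j",  # I-type immediate
}


def map_operands(assembly: str) -> str:
    """Convert e.g. 'xd, xs1, xs2' into the binutils operand string 'd,s,t'."""
    tokens = [tok.strip() for tok in assembly.split(",") if tok.strip()]
    # Unknown operands are flagged rather than dropped so they show up in review.
    return ",".join(OPERAND_MAP.get(tok, f"NON_DEFINED_{tok}") for tok in tokens)


print(map_operands("xd, xs1, xs2"))
print(map_operands("xd, xs1, imm"))
```

Flagging unknown operands in the output, instead of raising an error, keeps the run going so that all unmapped patterns can be collected in one pass.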
| 47 | + |
| 48 | +## Usage |
| 49 | + |
| 50 | +### Basic Usage |
| 51 | +```bash |
| 52 | +python3 binutils_generator.py --extensions=I,M,Zba,Zbb --output=my_opcodes.c |
| 53 | +``` |
| 54 | + |
| 55 | +### Command Line Options |
| 56 | +- `--inst-dir`: Directory containing instruction YAML files (default: `../../../spec/std/isa/inst/`) |
| 57 | +- `--output`: Output C file name; the matching `.h` file is generated automatically with the same base name |
| 58 | +- `--extensions`: Comma-separated list of enabled extensions |
| 59 | +- `--arch`: Target architecture (`RV32`, `RV64`, `BOTH`) |
| 60 | +- `--include-all` / `-a`: Include all instructions, ignoring extension filtering |
| 61 | +- `--verbose` / `-v`: Enable verbose logging |
| 62 | + |
| 63 | +### Examples |
| 64 | +```bash |
| 65 | +# Generate for specific extensions |
| 66 | +python3 binutils_generator.py --extensions=I,M,A,F,D --output=rv64_core.c |
| 67 | + |
| 68 | +# Generate all instructions |
| 69 | +python3 binutils_generator.py --include-all --output=complete_riscv.c |
| 70 | + |
| 71 | +# Custom extension |
| 72 | +python3 binutils_generator.py --extensions=I,MyCustomExt --output=custom.c |
| 73 | +``` |
| 74 | + |
| 75 | +## Integration with Binutils |
| 76 | + |
| 77 | +### Adding Custom Instruction Classes |
| 78 | + |
| 79 | +1. **Review generated header file**: Check the "Custom instruction class definitions" section |
| 80 | +2. **Add to binutils enum**: Edit `binutils-gdb/include/opcode/riscv.h` |
| 81 | + ```c |
| 82 | + enum riscv_insn_class |
| 83 | + { |
| 84 | + // ... existing classes ... |
| 85 | + INSN_CLASS_ZFOO, // Add your custom classes here |
| 86 | + INSN_CLASS_I_OR_ZILSD, |
| 87 | + // ... |
| 88 | + }; |
| 89 | + ``` |
| 90 | + |
| 91 | +3. **Add subset support**: Edit `binutils-gdb/bfd/elfxx-riscv.c` to handle extension requirements |
| 92 | + ```c |
| 93 | + bool |
| 94 | + riscv_multi_subset_supports (riscv_parse_subset_t *rps, |
| 95 | + enum riscv_insn_class insn_class) |
| 96 | + { |
| 97 | + switch (insn_class) |
| 98 | + { |
| 99 | + // ... existing cases ... |
| 100 | + case INSN_CLASS_ZFOO: |
| 101 | + return riscv_subset_supports (rps, "zfoo"); |
| 102 | + // ... |
| 103 | + } |
| 104 | + } |
| 105 | + ``` |
| 106 | + |
| 107 | +### Adding Generated Opcodes |
| 108 | + |
| 109 | +1. **Include header**: Add `#include "my_opcodes.h"` to your opcode file |
| 110 | +2. **Merge opcode arrays**: |
| 111 | + - Option A: Replace existing `riscv_opcodes[]` in `opcodes/riscv-opc.c` |
| 112 | + - Option B: Create separate opcode table and modify binutils to use it |
| 113 | + - Option C: Append entries to existing table |
| 114 | + |
| 115 | +### File Locations in Binutils |
| 116 | +- **Instruction classes**: `include/opcode/riscv.h` (enum `riscv_insn_class`) |
| 117 | +- **Opcode tables**: `opcodes/riscv-opc.c` (`riscv_opcodes[]` array) |
| 118 | +- **Extension support**: `bfd/elfxx-riscv.c` (`riscv_multi_subset_supports()`) |
| 119 | +- **Operand parsing**: `opcodes/riscv-dis.c` and `gas/config/tc-riscv.c` |
| 120 | + |
| 121 | +## Extending the Generator |
| 122 | + |
| 123 | +### Adding New Extensions |
| 124 | +```python |
| 125 | +from extension_mapper import ExtensionMapper |
| 126 | + |
| 127 | +mapper = ExtensionMapper() |
| 128 | +mapper.add_simple_mapping('Zfoo', 'INSN_CLASS_ZFOO') |
| 129 | +mapper.add_complex_mapping('anyOf', ['Zfoo', 'Zbar'], 'INSN_CLASS_ZFOO_OR_ZBAR') |
| 130 | +``` |
| 131 | + |
| 132 | +### Custom Operand Mappings |
| 133 | +Edit `operand_mapper.py` to add support for new operand patterns. |
| 134 | + |
| 135 | +### Configuration |
| 136 | +Edit `insn_class_config.py` to modify built-in extension mappings. |
| 137 | + |
| 138 | +## Output Statistics |
| 139 | + |
| 140 | +The generator provides detailed statistics: |
| 141 | +- **Total instructions**: Number of instructions processed |
| 142 | +- **Successfully processed**: Instructions with complete mappings |
| 143 | +- **Non-defined operands**: Instructions with unknown operand patterns |
| 144 | +- **Non-defined extensions**: Instructions with unknown extensions (should be 0 with current design) |
| 145 | +- **Custom classes**: Number of custom instruction classes generated |
| 146 | + |
| 147 | +## Validation |
| 148 | + |
| 149 | +Use `validate_output.py` to compare generated output against reference binutils opcodes: |
| 150 | +```bash |
| 151 | +python3 validate_output.py reference.c generated.c |
| 152 | +``` |
| 153 | + |
| 154 | +The validator reports a detailed, per-instruction comparison of instruction class, operands, and MATCH/MASK values. |
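The kind of comparison involved can be sketched as below. This is not the code of `validate_output.py`; `ENTRY_RE`, `parse_entries`, and `compare` are hypothetical, and the regular expression only handles simple entries whose MATCH and MASK fields are single identifiers.

```python
import re

# Matches the leading fields of an entry such as:
#   {"add", 0, INSN_CLASS_I, "d,s,t", MATCH_ADD, MASK_ADD, match_opcode, 0},
ENTRY_RE = re.compile(
    r'\{\s*"(?P<name>[^"]+)"\s*,\s*(?P<xlen>\d+)\s*,\s*(?P<insn_class>\w+)\s*,'
    r'\s*"(?P<operands>[^"]*)"\s*,\s*(?P<match>\w+)\s*,\s*(?P<mask>\w+)\s*,'
)


def parse_entries(text: str) -> dict:
    """Index entries by (mnemonic, operand string); a mnemonic may appear in several forms."""
    return {(m["name"], m["operands"]): m.groupdict() for m in ENTRY_RE.finditer(text)}


def compare(reference: str, generated: str) -> list:
    """Return (key, field, reference_value, generated_value) for each mismatch."""
    ref, gen = parse_entries(reference), parse_entries(generated)
    mismatches = []
    for key in sorted(ref.keys() & gen.keys()):
        for field in ("insn_class", "match", "mask"):
            if ref[key][field] != gen[key][field]:
                mismatches.append((key, field, ref[key][field], gen[key][field]))
    return mismatches


reference = '{"add", 0, INSN_CLASS_I, "d,s,t", MATCH_ADD, MASK_ADD, match_opcode, 0},'
generated = '{"add", 0, INSN_CLASS_M, "d,s,t", MATCH_ADD, MASK_ADD, match_opcode, 0},'
for mismatch in compare(reference, generated):
    print(mismatch)
```

Because entries are keyed by mnemonic and operand string together, a changed operand string shows up as an entry missing from one side rather than as a field mismatch.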