| 1 | +# `elf2bin`: convert ELF images to binary or hex via their segment view |
| 2 | + |
| 3 | +`elf2bin` is a tool for converting ELF images to binary or hex |
| 4 | +formats. |
| 5 | + |
| 6 | +It does a similar job to `llvm-objcopy` with binary or hex output, but |
| 7 | +unlike `llvm-objcopy`, it exclusively uses the ELF program header |
| 8 | +table, ignoring the sections. So it can cope with ELF images which |
| 9 | +have no section header table at all, or images whose sections don't |
| 10 | +exactly match up to the segments, e.g. because a segment has a gap which |
| 11 | +no section covers. Also, it supports a wider range of binary and hex |
| 12 | +output options. |
| 13 | + |
| 14 | +The feature set of `elf2bin` is similar to the feature set of the |
| 15 | +`fromelf` tool shipped as part of the proprietary Arm Compiler 6 |
| 16 | +toolchain, although the detailed syntax is different. Users migrating |
| 17 | +from that toolchain should find that `elf2bin` will support similar |
| 18 | +use cases. |
| 19 | + |
| 20 | +(However, `elf2bin` is focused on binary and hex output, and does not |
| 21 | +support the other modes of `fromelf`, such as converting one ELF file |
| 22 | +to another, or generating diagnostic dumps and disassembly. For that |
| 23 | +functionality, use other LLVM tools such as `llvm-objcopy`, |
| 24 | +`llvm-objdump`, `llvm-nm` and `llvm-size`.) |
| 25 | + |
| 26 | +## Using `elf2bin` |
| 27 | + |
| 28 | +The general format of an `elf2bin` command involves: |
| 29 | + |
| 30 | +* An output mode option, telling `elf2bin` what kind of binary or hex |
| 31 | + output it's generating. |
| 32 | + |
| 33 | +* One or more input file names, which must be ELF images or dynamic |
| 34 | + libraries. |
| 35 | + |
| 36 | +* Either a single output file name, or a pattern that tells `elf2bin` |
| 37 | + how to name multiple output files. |
| 38 | + |
| 39 | +* Optionally, other options to adjust behavior. |
| 40 | + |
| 41 | +### Output modes |
| 42 | + |
| 43 | +This section lists the available options to set the type of output file. |
| 44 | + |
| 45 | +#### `--ihex`: Intel hex format |
| 46 | + |
| 47 | +The Intel hex format is a record-based format. Each data record states |
| 48 | +the address it is expected to be loaded at. So a single output file |
| 49 | +can specify two or more segments at widely separated addresses without |
| 50 | +having to include all the space in between. |
| 51 | + |
| 52 | +Each line of an Intel hex file begins with a `:`. (This makes it easy |
| 53 | +to tell apart from the Motorola format which starts lines with `S`.) |
| 54 | + |
| 55 | +The Intel hex format allows addresses to be specified in the form of a |
| 56 | +32-bit linear address, or in an 8086-style segment:offset pair. Like |
| 57 | +`fromelf` (and unlike GNU `objcopy`), `elf2bin` always uses the linear |
| 58 | +address option, so that its hex files are as easy as possible to |
| 59 | +interpret. |
| 60 | + |
| 61 | +There is no version of the Intel hex format that supports 64-bit |
| 62 | +addresses. `elf2bin` will give an error if a 64-bit input file |
| 63 | +specifies data to be loaded at an address that does not fit in 32 |
| 64 | +bits. |
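
To make the linear-address form concrete, here is a minimal Python sketch of how one block of data can be written as Intel hex records. It is an illustration of the file format, not `elf2bin`'s implementation: the function names are hypothetical, and it assumes the data fits in one record and does not cross a 64 KiB boundary.

```python
def ihex_record(record_type, offset, payload=b""):
    """Format one record: ':' + count, 16-bit offset, type, data, checksum."""
    body = bytes([len(payload), (offset >> 8) & 0xFF, offset & 0xFF, record_type])
    body += payload
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def ihex_linear(address, data):
    """Emit data at a 32-bit linear address (record types 04 and 00)."""
    if len(data) > 255 or address + len(data) > 1 << 32:
        raise ValueError("data does not fit in one 32-bit addressed record")
    if (address & 0xFFFF) + len(data) > 0x10000:
        raise ValueError("sketch assumes data stays within one 64 KiB page")
    upper = (address >> 16).to_bytes(2, "big")
    return [
        ihex_record(0x04, 0, upper),               # extended linear address
        ihex_record(0x00, address & 0xFFFF, data), # data record
    ]
```

A complete file would end with the end-of-file record, which this sketch produces as `ihex_record(0x01, 0)`.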
| 65 | + |
| 66 | +#### `--srec`: Motorola hex format |
| 67 | + |
| 68 | +The Motorola hex format is similar in concept to the Intel one: each |
| 69 | +data record specifies an address and some data to load at that address. |
| 70 | + |
| 71 | +Each line of a Motorola hex file begins with an `S`. (This makes it |
| 72 | +easy to tell apart from the Intel format which starts lines with `:`.) |
| 73 | + |
| 74 | +In the Motorola format, there are multiple record types which store |
| 75 | +addresses in 16-bit, 24-bit or 32-bit format. `elf2bin` keeps its |
| 76 | +output as simple and consistent as possible, by always using the |
| 77 | +32-bit record types (`S3` and `S7`). |
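
As an illustration of the 32-bit record types, the following Python sketch formats an `S3` data record or an `S7` start-address record. It describes the file format only; the function name is hypothetical and this is not taken from `elf2bin`'s source.

```python
def srec_record(record_type, address, payload=b""):
    """Format an S3 (data) or S7 (start address) record with a 32-bit address."""
    addr = address.to_bytes(4, "big")
    count = len(addr) + len(payload) + 1  # address + data + checksum byte
    body = bytes([count]) + addr + payload
    checksum = ~sum(body) & 0xFF  # ones' complement of the byte sum
    return f"S{record_type}" + (body + bytes([checksum])).hex().upper()
```

Note that the count byte covers the address and checksum as well as the data, unlike the Intel format's count, which covers the data only.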
| 78 | + |
| 79 | +#### `--bin`: one binary file per segment |
| 80 | + |
| 81 | +The `--bin` option writes each loadable segment into a raw binary |
| 82 | +file, containing the bytes of data in the segment and nothing else. |
| 83 | + |
| 84 | +If there is more than one loadable segment, then you must use `-O` to |
| 85 | +specify a pattern for the output file names, instead of `-o` to |
| 86 | +specify a single output file name. |
| 87 | + |
| 88 | +#### `--bincombined`: one single binary file |
| 89 | + |
| 90 | +The `--bincombined` mode writes out a _single_ binary file, which |
| 91 | +contains all the loadable segments in the image, with padding between |
| 92 | +them to put them at the correct relative offsets from each other. |
| 93 | + |
| 94 | +The resulting file is suitable for loading at the base address of the |
| 95 | +lowest-addressed segment. |
| 96 | + |
| 97 | +(You can adjust the base address further downwards with `--base`, |
| 98 | +which adds padding before the first segment.) |
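
The layout described above can be sketched in a few lines of Python. This is a simplified model rather than `elf2bin`'s code: `combine_segments` is a hypothetical name, segments are given as `(address, data)` pairs, and the sketch assumes zero bytes for padding between segments and treats overlapping segments, or a base above the lowest segment, as errors.

```python
def combine_segments(segments, base=None, fill=0x00):
    """Lay out (address, data) pairs in one flat image, padding the gaps."""
    segments = sorted(segments)
    start = segments[0][0] if base is None else base
    if start > segments[0][0]:
        raise ValueError("base is above the lowest segment")
    image = bytearray()
    for address, data in segments:
        gap = address - (start + len(image))
        if gap < 0:
            raise ValueError("segments overlap")
        image += bytes([fill]) * gap + data  # pad up to the segment, then copy it
    return bytes(image)
```

For example, `combine_segments([(0x8000, b"AB"), (0x8004, b"CD")], base=0x7FFE)` yields two padding bytes, `AB`, two more padding bytes, then `CD`.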
| 99 | + |
| 100 | +#### `--vhx` and `--vhxcombined`: Verilog hex format |
| 101 | + |
| 102 | +The Verilog hex format is a translation of a binary file into hex, by |
| 103 | +turning each binary byte into a two-digit hex number on a line by |
| 104 | +itself. |
| 105 | + |
| 106 | +So, unlike the Intel and Motorola hex formats, there is no data inside |
| 107 | +the file that specifies the address to load data at. |
| 108 | + |
| 109 | +`--vhx` behaves similarly to `--bin`: it outputs one hex file per |
| 110 | +loadable segment. `--vhxcombined` behaves similarly to |
| 111 | +`--bincombined`: it outputs a single hex file containing all the |
| 112 | +segments, with padding between them if necessary. |
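
The byte-per-line translation is simple enough to show directly. In this Python sketch the function name is hypothetical, and the use of upper-case hex digits is an assumption, since the text above does not say which case `elf2bin` writes.

```python
def to_vhx(data):
    """Render each byte as a two-digit hex number on its own line."""
    return "".join(f"{byte:02X}\n" for byte in data)
```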
| 113 | + |
| 114 | +### Output file naming |
| 115 | + |
| 116 | +If `elf2bin` is writing a single output file, you can use the `-o` (or |
| 117 | +`--output-file`) option to tell it the name of the file, e.g. |
| 118 | + |
| 119 | +``` |
| 120 | +elf2bin --srec -o output.hex input.elf |
| 121 | +``` |
| 122 | + |
| 123 | +But in many situations `elf2bin` will produce multiple output files: |
| 124 | + |
| 125 | +* because you gave it multiple input files |
| 126 | +* because you used `--bin` or `--vhx` on a file with multiple segments |
| 127 | +* because you used `--banks` to split binary output into interleaved bank files |
| 128 | +* more than one of the above |
| 129 | + |
| 130 | +In that case, using `-o` will produce an error, because `elf2bin` will |
| 131 | +notice that you've asked it to write more than one output file to the |
| 132 | +same location. Instead, you must use the `-O` or `--output-pattern` |
| 133 | +option to provide a _pattern_ for constructing each output file name. |
| 134 | + |
| 135 | +Patterns look a bit like `printf` format strings: they consist of |
| 136 | +literal characters interleaved with formatting directives introduced |
| 137 | +by `%`. The available formatting directives are: |
| 138 | + |
| 139 | +* `%f` expands to the base name of the input file, with directory path |
| 140 | + and file extension removed. For example, if an input file is called |
| 141 | + `foo/bar/baz.elf`, then `%f` will expand to just `baz` when |
| 142 | + generating output from that file. |
| 143 | + |
| 144 | +* `%F` expands to the _full_ name of the input file, with the |
| 145 | + directory path still removed, but the extension left on. For |
| 146 | + example, `foo/bar/baz.elf` will turn into `baz.elf`. |
| 147 | + |
| 148 | +* `%a` and `%A` expand to the base address of a particular ELF |
| 149 | + segment. These are for the `--bin` or `--vhx` modes, where each |
| 150 | + segment is output to a separate file. ELF contains no way to assign |
| 151 | + segments a human-readable name, so the base address is the simplest |
| 152 | + way to distinguish them. The address is generated in hex, with no |
| 153 | + leading zeroes (unless it's actually `0`). `%a` generates hex digits |
| 154 | + `a`-`f` in lower case, and `%A` generates them in upper case. |
| 155 | + |
| 156 | +* `%b` expands to the bank number, if you're using `--banks` to split |
| 157 | + binary (or VHX) output into more than one bank. Banks are numbered |
| 158 | + consecutively upwards from 0, and are written in decimal. |
| 159 | + |
| 160 | +* `%%` expands to a literal `%`, if you need one in the output file |
| 161 | + name. |
| 162 | + |
| 163 | +Some examples: |
| 164 | + |
| 165 | +``` |
| 166 | +elf2bin --ihex -O %f.hex one.elf two.elf # generates one.hex and two.hex |
| 167 | +elf2bin --ihex -O %F.hex one.elf two.elf # generates one.elf.hex and two.elf.hex |
| 168 | +elf2bin --bin -O out-%a.bin input.elf # might generate, say, out-0.bin and out-f000.bin |
| 169 | +elf2bin --bin -O out-%A.bin input.elf # same but you'd get out-F000.bin |
| 170 | +elf2bin --bincombined --banks 1x2 -O out-%b.bin input.elf # out-0.bin and out-1.bin |
| 171 | +elf2bin --srec -O out-%%.hex input.elf # just gives out-%.hex |
| 172 | +``` |
| 173 | + |
| 174 | +In all cases, `elf2bin` will check its set of output files to ensure |
| 175 | +you haven't tried to direct two output files to the same name. |
| 176 | + |
| 177 | +In a complex case, you may need to use more than one of these |
| 178 | +directives. For example, if you're using `--bin` with multiple ELF |
| 179 | +files at once, some of which have multiple segments, _and_ you're |
| 180 | +using bank interleaving, then you'll need to use all of `%f`, `%a` (or |
| 181 | +`%A`) and `%b` to generate a distinct name for each output file: |
| 182 | + |
| 183 | +``` |
| 184 | +elf2bin --bin --banks 2x4 -O %f-%a-%b.bin one.elf two.elf |
| 185 | +``` |
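
To show how the directives combine, here is a rough Python model of pattern expansion. It follows the rules listed above but is not `elf2bin`'s implementation: `expand_pattern` is a hypothetical helper, and raising an error for an unknown directive, or for `%a`/`%b` when no address or bank applies, is an assumption.

```python
import os


def expand_pattern(pattern, input_path, address=None, bank=None):
    """Expand %f, %F, %a, %A, %b and %% in an output file name pattern."""
    name = os.path.basename(input_path)
    values = {
        "f": os.path.splitext(name)[0],  # base name without extension
        "F": name,                       # base name with extension
        "a": None if address is None else format(address, "x"),
        "A": None if address is None else format(address, "X"),
        "b": None if bank is None else str(bank),
        "%": "%",
    }
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] != "%":
            out.append(pattern[i])
            i += 1
            continue
        key = pattern[i + 1 : i + 2]  # empty if the pattern ends with '%'
        if values.get(key) is None:
            raise ValueError(f"cannot expand directive %{key}")
        out.append(values[key])
        i += 2
    return "".join(out)
```

With this model, `expand_pattern("%f-%a-%b.bin", "foo/bar/baz.elf", address=0xF000, bank=1)` gives `baz-f000-1.bin`.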
| 186 | + |
| 187 | +### Other options |
| 188 | + |
| 189 | +#### `--base`: set the base address of a combined output file |
| 190 | + |
| 191 | +If you're using the `--bincombined` or `--vhxcombined` output modes, |
| 192 | +you can use the `--base` option to specify the address you want the |
| 193 | +output file to begin at. |
| 194 | + |
| 195 | +If this is lower than the start address of the lowest segment, `elf2bin` |
| 196 | +will prepend padding to the file. |
| 197 | + |
| 198 | +For example, if `input.elf` has its lowest segment starting at 0x8000, |
| 199 | +then you'll normally get an output file beginning with the data of |
| 200 | +that segment. But adding `--base 0x6000` will give an output file |
| 201 | +beginning with 0x2000 zero bytes, so that you could load the whole |
| 202 | +file beginning at address 0x6000 and all the segments would end up in |
| 203 | +the right places. |
| 204 | + |
| 205 | +#### `--banks`: split the output between banks intended for separate ROMs |
| 206 | + |
| 207 | +In binary and VHX formats, you can use `--banks` to request that the |
| 208 | +output be split up into interleaved banks, for example so that you can |
| 209 | +direct a CPU's 32-bit data bus to four ROMs each with an 8-bit data bus. |
| 210 | + |
| 211 | +The argument to `--banks` consists of two numbers separated by an `x`. |
| 212 | +The first number is the 'width' of each bank: the number of |
| 213 | +consecutive bytes of data that go into each bank file before moving on |
| 214 | +to the next. The second is the number of banks. |
| 215 | + |
| 216 | +For example, `--banks 2x4` generates four banks, each of which |
| 217 | +receives 2 consecutive bytes of the data in turn. That is, the output |
| 218 | +file for bank 0 would get all the bytes intended to end up in memory |
| 219 | +at addresses 0,1 (mod 8), bank 1 would get addresses 2,3 (mod 8), bank |
| 220 | +2 would get 4,5 and bank 3 would get 6,7. |
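
The interleaving can be modelled with a short Python sketch. The function name is hypothetical, and the sketch assumes the data starts at an address that is a multiple of width times bank count, so that the first bytes land in bank 0.

```python
def split_banks(data, width, count):
    """Deal `width`-byte groups of data round-robin into `count` banks."""
    banks = [bytearray() for _ in range(count)]
    for offset in range(0, len(data), width):
        bank = (offset // width) % count
        banks[bank] += data[offset : offset + width]
    return [bytes(b) for b in banks]
```

Calling `split_banks(bytes(range(16)), 2, 4)` models `--banks 2x4`: bank 0 receives bytes 0, 1, 8, 9, bank 1 receives 2, 3, 10, 11, and so on.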
| 221 | + |
| 222 | +#### `--datareclen`: control data record length in hex output formats |
| 223 | + |
| 224 | +In the record-based hex formats `--ihex` and `--srec`, you can use |
| 225 | +`--datareclen` to control the number of bytes of the ELF file that |
| 226 | +appear in each data record. By default this is 16. The upper limit |
| 227 | +differs between the two formats: both store the record length in a single byte, but the Motorola count also includes the address and checksum bytes. |
| 228 | + |
| 229 | +#### `--segments`: control which loadable segments to output |
| 230 | + |
| 231 | +You can use `--segments` to restrict `elf2bin` to writing only a |
| 232 | +subset of the loadable segments in the ELF file. |
| 233 | + |
| 234 | +The argument is a comma-separated list of base addresses. |
| 235 | + |
| 236 | +For example, if you had an input file containing segments at addresses |
| 237 | +0x8000, 0x20000 and 0x10000000, then `--segments 0x8000,0x10000000` |
| 238 | +would skip the middle one. This option applies to all output modes. |
| 239 | + |
| 240 | +#### `--physical` and `--virtual`: choose which segment address field to use |
| 241 | + |
| 242 | +In the ELF program header table, each segment has a 'physical address' |
| 243 | +and 'virtual address' field, called `p_paddr` and `p_vaddr` |
| 244 | +respectively in the ELF specification. Some ELF files set the two |
| 245 | +addresses differently, to indicate that the image is loaded into |
| 246 | +memory in one layout and then remapped (or physically moved) into a |
| 247 | +different layout to be run. |
| 248 | + |
| 249 | +By default `elf2bin` uses the physical address field as the address of |
| 250 | +the segment. You can use `--virtual` to make it use the virtual |
| 251 | +address field instead. |
| 252 | + |
| 253 | +(The `--physical` option is also provided, to explicitly ask for the |
| 254 | +physical address.) |
| 255 | + |
| 256 | +#### `--zi`: include zero-initialized data after each segment |
| 257 | + |
| 258 | +Normally, `elf2bin` treats each segment as containing only the bytes |
| 259 | +actually stored in the ELF file. That is, the segment is treated as |
| 260 | +having length corresponding to its `p_filesz` field, not its |
| 261 | +`p_memsz`. |
| 262 | + |
| 263 | +You can use `--zi`, in any mode, to tell `elf2bin` to include zero |
| 264 | +padding after each segment to bring it up to its `p_memsz` length. |
| 265 | + |
| 266 | +(If the ELF file specifies different physical and virtual addresses |
| 267 | +for each segment, then this option probably makes more sense in |
| 268 | +combination with `--virtual`, since the physical layout might pack all |
| 269 | +the segments tightly together without leaving room for the |
| 270 | +zero-initialized trailer of each one.) |
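
The difference between the two lengths can be sketched as follows. This Python fragment is only a model of the behaviour described above: `segment_bytes` is a hypothetical name, and its parameters mirror the `p_offset`, `p_filesz` and `p_memsz` fields of one program header entry.

```python
def segment_bytes(image, p_offset, p_filesz, p_memsz, zi=False):
    """Return a segment's stored bytes, optionally zero-padded to p_memsz."""
    data = image[p_offset : p_offset + p_filesz]
    if zi and p_memsz > p_filesz:
        data += bytes(p_memsz - p_filesz)  # zero-initialized trailer
    return data
```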