| 1 | ++++ |
| 2 | +title = "Understanding ELFs, part 1" |
| 3 | +date = 2025-01-08 |
| 4 | +authors = ["InnocentZero"] |
| 5 | ++++ |
| 6 | + |
| 7 | +In this post we analyze the header and sections of an ELF binary on disk. |
| 8 | + |
| 9 | +## The header |
| 10 | + |
| 11 | +ELF files begin with a header that can be read with `readelf -h executable`, which gives you quite |
| 12 | +a bit of information about the binary. |
| 13 | + |
| 14 | +``` |
| 15 | +ELF Header: |
| 16 | +Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |
| 17 | +Class: ELF64 |
| 18 | +Data: 2's complement, little endian |
| 19 | +Version: 1 (current) |
| 20 | +OS/ABI: UNIX - System V |
| 21 | +ABI Version: 0 |
| 22 | +Type: DYN (Position-Independent Executable file) |
| 23 | +Machine: Advanced Micro Devices X86-64 |
| 24 | +Version: 0x1 |
| 25 | +Entry point address: 0x1040 |
| 26 | +Start of program headers: 64 (bytes into file) |
| 27 | +Start of section headers: 13520 (bytes into file) |
| 28 | +Flags: 0x0 |
| 29 | +Size of this header: 64 (bytes) |
| 30 | +Size of program headers: 56 (bytes) |
| 31 | +Number of program headers: 13 |
| 32 | +Size of section headers: 64 (bytes) |
| 33 | +Number of section headers: 30 |
| 34 | +Section header string table index: 29 |
| 35 | +``` |
| 36 | + |
| 37 | +Needless to say, a lot of this is just metadata about the binary that is read by the OS to load |
| 38 | +the binary. |
| 39 | + |
| 40 | +## The sections |
| 41 | +ELF sections comprise all the information that is needed to build an executable from an |
| 42 | +object file. They are only needed at link time and not at runtime. However, some of these |
| 43 | +sections may get mapped to segments during runtime. `readelf -S executable` lists the |
| 44 | +sections. |
| 45 | + |
| 46 | +Some of the more important ones are: |
| 47 | + |
| 48 | +- `.text`: The instructions of the binary are contained here. They are executed and `rip` moves |
| 49 | + through this section. |
| 50 | +- `.data/.rodata`: These are the sections that contain initialized global data. _ro_ stands for |
| 51 | + read-only. |
| 52 | +- `.bss`: This is the section for uninitialized global variables. |
| 53 | +- `.interp`: This holds the path to the runtime linker, also known as the _interpreter_ of the program. |
| 54 | +- Some linker scripts may also contain preallocated space for stack and heap, although it's not |
| 55 | + really the job of ELF sections to define them. |
| 56 | + |
| 57 | +For an example _hello world_ binary in C, the following was the output for `readelf -S`: |
| 58 | + |
| 59 | +``` |
| 60 | + There are 30 section headers, starting at offset 0x34d0: |
| 61 | + |
| 62 | +Section Headers: |
| 63 | + [Nr] Name Type Address Offset |
| 64 | + Size EntSize Flags Link Info Align |
| 65 | + [ 0] NULL 0000000000000000 00000000 |
| 66 | + 0000000000000000 0000000000000000 0 0 0 |
| 67 | + [ 1] .interp PROGBITS 0000000000000318 00000318 |
| 68 | + 000000000000001c 0000000000000000 A 0 0 1 |
| 69 | + [ 2] .note.gnu.pr[...] NOTE 0000000000000338 00000338 |
| 70 | + 0000000000000040 0000000000000000 A 0 0 8 |
| 71 | + [ 3] .note.gnu.bu[...] NOTE 0000000000000378 00000378 |
| 72 | + 0000000000000024 0000000000000000 A 0 0 4 |
| 73 | + [ 4] .note.ABI-tag NOTE 000000000000039c 0000039c |
| 74 | + 0000000000000020 0000000000000000 A 0 0 4 |
| 75 | + [ 5] .gnu.hash GNU_HASH 00000000000003c0 000003c0 |
| 76 | + 000000000000001c 0000000000000000 A 6 0 8 |
| 77 | + [ 6] .dynsym DYNSYM 00000000000003e0 000003e0 |
| 78 | + 00000000000000a8 0000000000000018 A 7 1 8 |
| 79 | + [ 7] .dynstr STRTAB 0000000000000488 00000488 |
| 80 | + 000000000000008f 0000000000000000 A 0 0 1 |
| 81 | + [ 8] .gnu.version VERSYM 0000000000000518 00000518 |
| 82 | + 000000000000000e 0000000000000002 A 6 0 2 |
| 83 | + [ 9] .gnu.version_r VERNEED 0000000000000528 00000528 |
| 84 | + 0000000000000030 0000000000000000 A 7 1 8 |
| 85 | + [10] .rela.dyn RELA 0000000000000558 00000558 |
| 86 | + 00000000000000c0 0000000000000018 A 6 0 8 |
| 87 | + [11] .rela.plt RELA 0000000000000618 00000618 |
| 88 | + 0000000000000018 0000000000000018 AI 6 23 8 |
| 89 | + [12] .init PROGBITS 0000000000001000 00001000 |
| 90 | + 000000000000001b 0000000000000000 AX 0 0 4 |
| 91 | + [13] .plt PROGBITS 0000000000001020 00001020 |
| 92 | + 0000000000000020 0000000000000010 AX 0 0 16 |
| 93 | + [14] .text PROGBITS 0000000000001040 00001040 |
| 94 | + 0000000000000141 0000000000000000 AX 0 0 16 |
| 95 | + [15] .fini PROGBITS 0000000000001184 00001184 |
| 96 | + 000000000000000d 0000000000000000 AX 0 0 4 |
| 97 | + [16] .rodata PROGBITS 0000000000002000 00002000 |
| 98 | + 0000000000000015 0000000000000000 A 0 0 4 |
| 99 | + [17] .eh_frame_hdr PROGBITS 0000000000002018 00002018 |
| 100 | + 0000000000000024 0000000000000000 A 0 0 4 |
| 101 | + [18] .eh_frame PROGBITS 0000000000002040 00002040 |
| 102 | + 000000000000007c 0000000000000000 A 0 0 8 |
| 103 | + [19] .init_array INIT_ARRAY 0000000000003dd0 00002dd0 |
| 104 | + 0000000000000008 0000000000000008 WA 0 0 8 |
| 105 | + [20] .fini_array FINI_ARRAY 0000000000003dd8 00002dd8 |
| 106 | + 0000000000000008 0000000000000008 WA 0 0 8 |
| 107 | + [21] .dynamic DYNAMIC 0000000000003de0 00002de0 |
| 108 | + 00000000000001e0 0000000000000010 WA 7 0 8 |
| 109 | + [22] .got PROGBITS 0000000000003fc0 00002fc0 |
| 110 | + 0000000000000028 0000000000000008 WA 0 0 8 |
| 111 | + [23] .got.plt PROGBITS 0000000000003fe8 00002fe8 |
| 112 | + 0000000000000020 0000000000000008 WA 0 0 8 |
| 113 | + [24] .data PROGBITS 0000000000004008 00003008 |
| 114 | + 0000000000000010 0000000000000000 WA 0 0 8 |
| 115 | + [25] .bss NOBITS 0000000000004018 00003018 |
| 116 | + 0000000000000008 0000000000000000 WA 0 0 1 |
| 117 | + [26] .comment PROGBITS 0000000000000000 00003018 |
| 118 | + 0000000000000036 0000000000000001 MS 0 0 1 |
| 119 | + [27] .symtab SYMTAB 0000000000000000 00003050 |
| 120 | + 0000000000000240 0000000000000018 28 6 8 |
| 121 | + [28] .strtab STRTAB 0000000000000000 00003290 |
| 122 | + 000000000000012a 0000000000000000 0 0 1 |
| 123 | + [29] .shstrtab STRTAB 0000000000000000 000033ba |
| 124 | + 0000000000000116 0000000000000000 0 0 1 |
| 125 | +Key to Flags: |
| 126 | + W (write), A (alloc), X (execute), M (merge), S (strings), I (info), |
| 127 | + L (link order), O (extra OS processing required), G (group), T (TLS), |
| 128 | + C (compressed), x (unknown), o (OS specific), E (exclude), |
| 129 | + D (mbind), l (large), p (processor specific) |
| 130 | +``` |
| 131 | + |
| 132 | + |
| 133 | +- `Nr`, `Name`, and `Size` should be obvious. |
| 134 | +- `EntSize` contains the size of each entry if the entries in the section have a fixed size, as |
| 135 | + in symbol tables. It is zero otherwise. |
| 136 | +- `Type` is explained below. `Address` contains the starting address of the section in the |
| 137 | + binary. This depends on the previous sections and the alignment requirements of the section. |
| 138 | +- `Offset` is the position of the section in the file and `Align` is its required alignment, both in bytes. `Flags`, `Link`, and `Info` are explained below. |
| 139 | + |
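The relationship between `Address`, `Size`, and `Align` can be checked with a little arithmetic. The sketch below is only an illustration: `align_up` and `predicted_addresses` are hypothetical helper names, and the rule "previous end, rounded up to the alignment" only holds for neighbouring sections that the linker packs back to back. The linker is free to leave larger gaps, as it does between `.fini` and `.rodata` in the output above.

```python
def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of alignment.

    An alignment of 0 or 1 means the section has no alignment constraint.
    """
    if alignment <= 1:
        return address
    return (address + alignment - 1) // alignment * alignment


# (name, address, size, align) copied from the readelf -S output above.
SECTIONS = [
    (".plt", 0x1020, 0x20, 16),
    (".text", 0x1040, 0x141, 16),
    (".fini", 0x1184, 0x0D, 4),
]


def predicted_addresses(sections):
    """Predict each section's address from the end of the section before it."""
    results = []
    for previous, current in zip(sections, sections[1:]):
        _, prev_addr, prev_size, _ = previous
        name, addr, _, align = current
        results.append((name, align_up(prev_addr + prev_size, align), addr))
    return results


for name, predicted, actual in predicted_addresses(SECTIONS):
    print(f"{name}: predicted {predicted:#x}, readelf reports {actual:#x}")
```

For instance, `.text` ends at `0x1040 + 0x141 = 0x1181`, and rounding that up to the 4-byte alignment of `.fini` gives `0x1184`, which is the address `readelf` reports.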
| 140 | +The types of sections: |
| 141 | + |
| 142 | +- `NULL`: This marks an empty section. It is the first section of the binary for demarcation |
| 143 | + purposes. Acts as a placeholder. |
| 144 | +- `PROGBITS`: These just have program-defined info, like the instructions (`.text`), the global |
| 145 | + data (`.data/.rodata`), the `.interp` section (defines the interpreter). |
| 146 | +- `DYNAMIC`: Holds dynamic linking information. It is a table of tag/value entries that tells |
| 147 | + the runtime linker, among other things, which shared libraries to load. |
| 148 | +- `INIT_ARRAY`: This contains an array of pointers to functions that must be executed before |
| 149 | + `main`. Only for `.init_array`. |
| 150 | +- `FINI_ARRAY`: This contains an array of pointers to functions that must be executed on `exit`. |
| 151 | + Only for `.fini_array`. |
| 152 | +- `GNU_HASH`: This is a sort of hash table for faster symbol lookup used by the dynamic linker. |
| 153 | + Used for `.gnu.hash` |
| 154 | +- `NOBITS`: Used for `.bss`, which is *zeroed out* upon loading. It holds uninitialized global |
| 155 | + variables and occupies no space in the file. |
| 156 | +- `DYNSYM`: Used for `.dynsym` section, contains the dynamic symbol table. |
| 157 | +- `STRTAB`: As the name suggests, it contains a string table. Strings in it are referenced by their byte offset into the table. Used for |
| 158 | + `.strtab` and `.dynstr`, the static and dynamic string tables, and for `.shstrtab`, which holds the section names. |
| 159 | +- `SYMTAB`: Once again, symbol table for `.symtab`. Larger than `.dynsym` as it's more detailed, but |
| 160 | + not required at runtime. |
| 161 | +- `RELA`: Contains relocation tables. These specify how to modify certain addresses in the program |
| 162 | + to account for the layout of shared libraries or changes in addresses during linking. |
| 163 | + |
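The type names printed by `readelf` correspond to numeric values stored in each 64-byte section header entry. As a rough sketch, one entry could be decoded as below. `ELF64_SHDR`, `SectionHeader`, `parse_section_header`, and `type_name` are names invented for this example, the bytes are synthesized from the `.dynsym` row above, and the name field is a placeholder `0` because `readelf -S` does not show the real offset into `.shstrtab`.

```python
import struct
from collections import namedtuple

# One ELF64 section header entry: 64 bytes, little endian.
ELF64_SHDR = struct.Struct("<IIQQQQIIQQ")

# Field names are illustrative; they loosely follow the readelf columns.
SectionHeader = namedtuple(
    "SectionHeader",
    "name type flags addr offset size link info addralign entsize",
)

# Numeric values for the section types that appear in the output above.
SECTION_TYPES = {
    0: "NULL", 1: "PROGBITS", 2: "SYMTAB", 3: "STRTAB", 4: "RELA",
    6: "DYNAMIC", 7: "NOTE", 8: "NOBITS", 11: "DYNSYM",
    14: "INIT_ARRAY", 15: "FINI_ARRAY",
    0x6FFFFFF6: "GNU_HASH", 0x6FFFFFFE: "VERNEED", 0x6FFFFFFF: "VERSYM",
}


def parse_section_header(data: bytes, index: int = 0) -> SectionHeader:
    """Decode the index-th entry of a section header table held in data."""
    return SectionHeader._make(
        ELF64_SHDR.unpack_from(data, index * ELF64_SHDR.size)
    )


def type_name(sh_type: int) -> str:
    """Return the readelf-style name, or the raw hex value if unknown."""
    return SECTION_TYPES.get(sh_type, hex(sh_type))


# Synthetic entry for .dynsym; the leading 0 is a placeholder name offset.
raw = ELF64_SHDR.pack(0, 11, 0x2, 0x3E0, 0x3E0, 0xA8, 7, 1, 8, 0x18)

dynsym = parse_section_header(raw)
entry_count = dynsym.size // dynsym.entsize if dynsym.entsize else 0
print(type_name(dynsym.type), entry_count)
```

Dividing `Size` by `EntSize` gives the number of fixed-size entries, here the number of dynamic symbols. On a real file, the table would be read starting at the "Start of section headers" offset from the ELF header.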
| 164 | +I'm not covering specific sections like `.got` and `.plt` in detail as they require a separate post |
| 165 | +of their own. |
| 166 | + |
| 167 | +The flags for each section are explained under `Key to Flags` at the bottom of the `readelf` output above. |
| 168 | + |
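| 169 | +The `Link` field is the index of another section that this section depends on. For example, `.dynsym` has a `Link` of 7, which is `.dynstr`, its string table. |
| 170 | + |
| 171 | +The `Info` field holds additional information whose meaning depends on the section type. For `.rela.plt` it is 23, |
| 172 | +the index of `.got.plt`, the section its relocations apply to. |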
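The letters in the `Flags` column are a rendering of a bit mask stored in each section header. A minimal sketch of that decoding follows, covering only the generic flags from the `Key to Flags` legend. `SECTION_FLAGS` and `flag_letters` are illustrative names, not anything `readelf` exposes.

```python
# Generic section flag bits and the letter readelf prints for each.
SECTION_FLAGS = [
    (0x1, "W"),    # write
    (0x2, "A"),    # alloc
    (0x4, "X"),    # execute
    (0x10, "M"),   # merge
    (0x20, "S"),   # strings
    (0x40, "I"),   # info
    (0x80, "L"),   # link order
    (0x100, "O"),  # extra OS processing required
    (0x200, "G"),  # group
    (0x400, "T"),  # TLS
    (0x800, "C"),  # compressed
]


def flag_letters(sh_flags: int) -> str:
    """Render a section flag bit mask as letters, lowest bit first."""
    return "".join(letter for bit, letter in SECTION_FLAGS if sh_flags & bit)


# Bit masks matching some rows of the output above.
for name, flags in [(".text", 0x6), (".data", 0x3), (".rela.plt", 0x42), (".comment", 0x30)]:
    print(name, flag_letters(flags))
```

Read this way, `AX` on `.text` means the section is loaded into memory and executable, while `WA` on `.data` means it is loaded and writable.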