Commit 7516508

understanding ELFs part 3
Signed-off-by: innocentzero <md-isfarul-haque@proton.me>
1 parent b111320 commit 7516508

1 file changed

+79
-0
lines changed

@@ -0,0 +1,79 @@
+++
title = "Understanding ELFs, part 3"
date = 2025-01-30
authors = ["InnocentZero"]
+++

## On relocations, loading binaries, and more

The reason we need relocations comes down to a simple fact: shared libraries exist.

Why have shared libraries at all? They avoid repetition of the same pages in memory, which was
critical in older days because memory was scarce. They also keep the library separate from the
binary, so the library can be updated without updating the binary.

Because a shared library's final address is not known when the binary is built, references to it
are dealt with by using _relocation sections_. These contain the information needed to relocate a
symbol within the binary's context. A relocation section usually links to an additional section
where the relocation is going to be applied.
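To make the contents of a relocation section a little more concrete, here is a minimal sketch that decodes one `Elf64_Rela` entry, the record format used on x86-64. The function name and the sample entry are made up for illustration; only the field layout comes from the ELF specification.

```python
import struct

# Elf64_Rela layout: r_offset (u64), r_info (u64), r_addend (i64), little-endian.
RELA_FORMAT = "<QQq"

def decode_rela(entry: bytes) -> dict:
    """Split one 24-byte Elf64_Rela entry into its fields."""
    r_offset, r_info, r_addend = struct.unpack(RELA_FORMAT, entry)
    return {
        "offset": r_offset,               # where the patch is applied
        "symbol_index": r_info >> 32,     # index into the symbol table
        "type": r_info & 0xFFFFFFFF,      # relocation type, e.g. 7 = R_X86_64_JUMP_SLOT
        "addend": r_addend,
    }

# Hypothetical entry: patch address 0x4018 using symbol #3, type 7.
sample = struct.pack(RELA_FORMAT, 0x4018, (3 << 32) | 7, 0)
print(decode_rela(sample))
```

The symbol index and the relocation type share the single `r_info` field, which is why the sketch splits it with a shift and a mask.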

There are two ways in which object files may be linked: statically and dynamically.

Static linking is fairly straightforward: the linker takes in all the object files and archive
files (such as `libc.a`) and creates a single self-contained binary containing all the required
functionality. This is done at the end of compilation itself.

Dynamic linking is a slightly more involved process. It defers the linking part from compile time
to runtime. The binary contains information about its choice of runtime linker (also referred to
as an _interpreter_), the dynamic symbols it needs, and how to obtain them.

## Loading an ELF into memory

The system first executes the file's "interpreter" before handing over execution to the binary.
The interpreter's path is stored in the `.interp` section, which is covered by the `PT_INTERP`
segment. It can be read using `readelf -p .interp example`.

```
$ readelf -p .interp example

String dump of section '.interp':
  [     0]  /lib64/ld-linux-x86-64.so.2
```
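The same lookup can be sketched by hand by walking the program headers until a `PT_INTERP` entry turns up. This is a rough sketch that assumes a little-endian ELF64 file; `read_interp` is a name invented here, and the offsets follow the standard `Elf64_Ehdr` and `Elf64_Phdr` layouts.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path

def read_interp(data: bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)            # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)  # entry size and count
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, base)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            # The path is stored as a NUL-terminated string inside the file.
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None

# Usage, assuming a dynamically linked binary named "example" in the current directory:
#   with open("example", "rb") as f:
#       print(read_interp(f.read()))
```

A statically linked binary has no `PT_INTERP` entry, in which case the sketch returns `None`.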
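
By this point the binary itself is already in memory: in the usual case the kernel maps its
`PT_LOAD` segments during `execve`, and the interpreter only has to load the binary itself when it
is invoked directly on it.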

The interpreter then sets up the environment using the `.dynamic` section of the binary. Its
entries can be listed using `readelf -d executable`.
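As a rough illustration of what the interpreter reads here, the `.dynamic` section is an array of 16-byte `Elf64_Dyn` entries, each a tag and a value, terminated by `DT_NULL`. The sketch below decodes such an array; the function name and the sample bytes are invented, while the tag numbers come from the ELF specification.

```python
import struct

# A few d_tag values from the ELF specification.
DT_NAMES = {
    0: "DT_NULL", 1: "DT_NEEDED", 5: "DT_STRTAB",
    12: "DT_INIT", 13: "DT_FINI", 25: "DT_INIT_ARRAY", 26: "DT_FINI_ARRAY",
}

def decode_dynamic(raw: bytes):
    """Yield (tag_name, value) for each Elf64_Dyn entry up to DT_NULL."""
    for d_tag, d_val in struct.iter_unpack("<qQ", raw):
        if d_tag == 0:  # DT_NULL marks the end of the table
            break
        yield DT_NAMES.get(d_tag, hex(d_tag)), d_val

# Hypothetical table: one NEEDED entry, INIT, INIT_ARRAY, then the terminator.
sample_dynamic = struct.pack("<qQqQqQqQ", 1, 0x10, 12, 0x1000, 25, 0x3DF0, 0, 0)
print(list(decode_dynamic(sample_dynamic)))
```

The value of a `DT_NEEDED` entry is not the library name itself but an offset into the dynamic string table that `DT_STRTAB` points to.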

From here, the interpreter recursively visits all the **NEEDED** dynamic libraries that must be
loaded into memory. For each dependency, the following steps are executed:

- The ELF is mapped into memory.
- Relocations are performed: in the original binary, all the absolute addresses are patched and
  references to other object files are resolved.
- Its dynamic table is parsed and its own dependencies are loaded.
- `_dl_init` is run, which executes all the functions from `INIT` and `INIT_ARRAY` for the
  just-loaded libraries.
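The walk over **NEEDED** entries can be sketched with a toy dependency map. Everything here is hypothetical: the library names, the `NEEDED` dictionary, and both function names. The sketch also assumes a breadth-first walk with dependencies initialized before the libraries that use them, which is how glibc's loader is usually described, not something taken from the steps above.

```python
from collections import deque

# Hypothetical DT_NEEDED lists; real ones come from each object's .dynamic section.
NEEDED = {
    "example": ["libfoo.so", "libc.so.6"],
    "libfoo.so": ["libbar.so", "libc.so.6"],
    "libbar.so": ["libc.so.6"],
    "libc.so.6": [],
}

def load_order(root):
    """Breadth-first walk of DT_NEEDED entries, mapping each object once."""
    loaded, queue = [root], deque([root])
    while queue:
        current = queue.popleft()
        for dep in NEEDED[current]:
            if dep not in loaded:
                loaded.append(dep)
                queue.append(dep)
    return loaded

def init_order(root):
    """Order in which library initializers run: dependencies before dependents."""
    seen, order = set(), []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in NEEDED[name]:
            visit(dep)
        order.append(name)  # appended only after all of its dependencies

    visit(root)
    # The main binary's own initializers are not run at this stage.
    return [name for name in order if name != root]

print(load_order("example"))
print(init_order("example"))
```

Each library is visited only once even when several objects depend on it, which is what keeps a shared dependency such as libc from being mapped twice.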

Control is now handed over to `_start` in the ELF binary, which receives the pointer to `_dl_fini`
in `rdx`. `_start` prepares the stack with a few arguments and calls `__libc_start_main`.

`__libc_start_main` receives function pointers to `main`, `init`, `fini`, and `rtld_fini` (this is
the same as `_dl_fini`).

This function has a bunch of things going on, such as setting up thread-local storage. Here we
only care about the following steps:

- `__cxa_atexit` is called, which registers `_dl_fini` as a destructor to run after the program is
  done.

- `call_init` is called, which runs the constructors in the `INIT` and `INIT_ARRAY` dynamic table
  entries. Note that `_dl_init` was for the entries in the shared libraries themselves, but this
  is for the binary.

- Finally, control is handed over to `main`.

- Immediately after `main` returns, `exit` is called. This only transfers control to
  `__run_exit_handlers`.

- This runs all the functions registered in `__exit_funcs`, which also contains `_dl_fini`.
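The ordering in the list above can be modelled with a toy handler stack. This is only a sketch: the Python names are made up and merely mirror the glibc functions they stand in for. The one behaviour it relies on is that exit handlers run in reverse order of registration.

```python
# Toy model of the list glibc calls __exit_funcs; all Python names are made up.
exit_funcs = []
log = []

def cxa_atexit(func):
    """Register a handler, as __cxa_atexit or atexit would."""
    exit_funcs.append(func)

def run_exit_handlers():
    """Call handlers in reverse order of registration, newest first."""
    while exit_funcs:
        exit_funcs.pop()()

def libc_start_main(main):
    cxa_atexit(lambda: log.append("_dl_fini"))  # registered before main runs
    log.append("constructors (INIT, INIT_ARRAY)")
    main()
    run_exit_handlers()  # what the call to exit() amounts to in this model

def main():
    log.append("main")
    cxa_atexit(lambda: log.append("user atexit handler"))

libc_start_main(main)
print(log)
```

Because `_dl_fini` is registered before `main` starts, it ends up at the bottom of the stack and runs after any handlers the program registers itself.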
