Commit 3222936

Add new documents to explain static and dynamic linking modes
Since the compiler supports both static linking and dynamic linking, this commit adds two new documents that explain the following:

- How to build the statically and dynamically linked versions of shecc.
- Stack frame layout in static and dynamic linking modes.
- Function argument handling and calling convention.
- Runtime execution flow.
- The dynamic sections used in dynamic linking mode.
1 parent 2c05ea5 commit 3222936

2 files changed

+195
-0
lines changed

docs/dynamic-linking.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
1+
# Dynamic Linking
2+
3+
## Build dynamically linked shecc and programs
4+
5+
Build the dynamically linked version of shecc. Note that shecc currently does not support dynamic linking for the RISC-V architecture:
6+
7+
```shell
$ make ARCH=arm DYNLINK=1
```
10+
11+
Next, you can use shecc to build dynamically linked programs by adding the `--dynlink` flag:
12+
13+
```shell
# Use the stage 0 compiler
$ out/shecc --dynlink -o <output> <input.c>
# Use the stage 1 or stage 2 compiler
$ qemu-arm -L <LD_PREFIX> out/shecc-stage2.elf --dynlink -o <output> <input.c>

# Execute the compiled program
$ qemu-arm -L <LD_PREFIX> <output>
```
22+
23+
When executing a dynamically linked program, set the ELF interpreter prefix (`<LD_PREFIX>`) so that `ld.so` can be found and invoked. If you installed the ARM GNU toolchain with `apt`, the prefix is generally `/usr/arm-linux-gnueabihf`. If you installed the toolchain manually, locate and specify the corresponding path yourself.
24+
25+
## Stack frame layout
26+
27+
In dynamic linking mode, the stack frame layout for each function can be illustrated as follows:
28+
29+
```
High Address
+------------------+
| incoming args    |
+------------------+ <- sp + total_size
| saved lr         |
+------------------+ <- sp + total_size - 4
| local variables  |
+------------------+ <- sp + 20
| saved r12 (ip)   |
+------------------+ <- sp + 16
| outgoing args    |
+------------------+ <- sp (MUST be aligned to 8 bytes)
Low Address
```
44+
45+
* `total_size`: includes the size of the following elements:
  * `outgoing args`: a fixed size of 16 bytes
  * `saved r12`: a fixed size of 4 bytes
  * All local variables
  * `saved lr`: a fixed size of 4 bytes
50+
51+
52+
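Currently, shecc supports at most 8 arguments per call, and the first four are passed in registers, so at most four arguments (16 bytes) are passed on the stack. Together with the 4-byte slot for register `r12`, this is why an additional 20 bytes of stack space are allocated in every frame.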
53+
54+
For the Arm architecture, when the callee is an external function, the caller uses the first 16 bytes to push the extra arguments onto the stack to comply with the calling convention.
55+
56+
In addition, because external functions may modify register `r12`, which holds the pointer to the global stack, the caller also saves its value at `[sp + 16]` and restores it after the external function returns.
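
The offsets in the diagram can be derived from the size of the local variables alone. The following is a minimal sketch of that arithmetic, not shecc's actual code: the names `dyn_frame`, `dyn_frame_layout`, and `align_up8` are hypothetical, and rounding `total_size` up to a multiple of 8 is an assumption made here to keep `sp` 8-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-size areas named in the layout description above. */
enum { OUTGOING_ARGS_SIZE = 16, SAVED_R12_SIZE = 4, SAVED_LR_SIZE = 4 };

/* Hypothetical record of byte offsets, all relative to sp. */
typedef struct {
    uint32_t total_size;
    uint32_t outgoing_args_off; /* sp + 0 */
    uint32_t saved_r12_off;     /* sp + 16 */
    uint32_t locals_off;        /* sp + 20 */
    uint32_t saved_lr_off;      /* sp + total_size - 4 */
} dyn_frame;

/* Round n up to the next multiple of 8. */
static uint32_t align_up8(uint32_t n)
{
    return (n + 7u) & ~7u;
}

/* Compute the frame layout for a function with locals_size bytes of locals.
 * Assumption: total_size is padded to a multiple of 8 so that sp stays
 * 8-byte aligned; any padding ends up between the locals and saved lr. */
dyn_frame dyn_frame_layout(uint32_t locals_size)
{
    dyn_frame f;
    assert(locals_size < 0x10000000u); /* keep the arithmetic far from overflow */
    f.total_size = align_up8(OUTGOING_ARGS_SIZE + SAVED_R12_SIZE +
                             locals_size + SAVED_LR_SIZE);
    f.outgoing_args_off = 0;
    f.saved_r12_off = OUTGOING_ARGS_SIZE;
    f.locals_off = OUTGOING_ARGS_SIZE + SAVED_R12_SIZE;
    f.saved_lr_off = f.total_size - SAVED_LR_SIZE;
    return f;
}
```

For example, a function with 12 bytes of locals needs 16 + 4 + 12 + 4 = 36 bytes, which this sketch pads to 40.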
57+
58+
## About function arguments handling
59+
60+
### Arm (32-bit)
61+
62+
If the callee is an internal function, meaning that its implementation is compiled by shecc, the caller puts all arguments directly into registers `r0` - `r7`.
63+
64+
Otherwise, when the callee is an external function, the caller performs the following operations to comply with the Procedure Call Standard for the Arm Architecture (AAPCS):
65+
66+
* The first four arguments are put into `r0` - `r3`.
* Any additional arguments are passed on the stack. They are pushed starting from the last argument, so the fifth argument ends up at the lowest address and the last argument at the highest address.
* The stack pointer is aligned to 8 bytes, because external functions may access 8-byte objects, which require 8-byte alignment.
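
The rules above can be summarized as a mapping from argument index to location. This is an illustrative sketch only: `arg_slot` and `external_arg_slot` are hypothetical names, and it assumes every argument is a single 32-bit word (an `int` or a pointer), so 64-bit and floating-point arguments are out of scope.

```c
#include <assert.h>

typedef enum { ARG_IN_REGISTER, ARG_ON_STACK } arg_kind;

/* Where one argument of an external call lives at the call site. */
typedef struct {
    arg_kind kind;
    int reg;          /* r0..r3 as 0..3, or -1 when on the stack */
    int stack_offset; /* byte offset from sp, or -1 when in a register */
} arg_slot;

/* Map a zero-based argument index (0..7) to its location.
 * Assumption: every argument occupies exactly one 4-byte word. */
arg_slot external_arg_slot(int arg_index)
{
    arg_slot s;
    assert(arg_index >= 0 && arg_index < 8); /* shecc allows at most 8 */
    if (arg_index < 4) {
        s.kind = ARG_IN_REGISTER;
        s.reg = arg_index;
        s.stack_offset = -1;
    } else {
        /* The fifth argument sits at [sp], the sixth at [sp + 4], ... */
        s.kind = ARG_ON_STACK;
        s.reg = -1;
        s.stack_offset = (arg_index - 4) * 4;
    }
    return s;
}
```

With eight arguments, the last one lands at `[sp + 12]`, so the stack-passed arguments fit exactly in the 16-byte outgoing area.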
69+
70+
### RISC-V (32-bit)
71+
72+
(Currently not supported)
73+
74+
## Runtime execution flow
75+
76+
1. The program starts at the ELF entry point.
2. The dynamic linker (`ld.so`) is invoked.
   * For the Arm architecture, the dynamic linker is `/lib/ld-linux-armhf.so.3`.
3. The linker loads shared libraries such as `libc.so`.
4. The linker resolves symbols and fills in the global offset table (GOT).
5. Control transfers to the program.
6. The program first calls `__libc_start_main`.
7. `__libc_start_main` calls the *main wrapper*, which sets up a global stack for all global variables (excluding read-only variables) and initializes them.
8. The *main wrapper* executes.
9. When the *main wrapper* finishes, it places `argc` and `argv` in the appropriate registers, then jumps to the `main` function to continue execution.
10. After the `main` function returns, `__libc_start_main` implicitly calls `exit(3)` to terminate the program.
87+
88+
## Dynamic sections
89+
90+
When using dynamic linking, the following sections are generated for compiled programs:
91+
92+
1. `.interp` - Path to dynamic linker
2. `.dynsym` - Dynamic symbol table
3. `.dynstr` - Dynamic string table
4. `.rel.plt` - PLT relocations
5. `.plt` - Procedure Linkage Table
6. `.got` - Global Offset Table
7. `.dynamic` - Dynamic linking information
99+
100+
### PLT explanation for Arm32
101+
102+
The first entry contains the following instructions, which invoke the resolver to perform relocation.
103+
104+
```
push {lr}                       @ (str lr, [sp, #-4]!)
movw sl, #:lower16:(&GOT[2])
movt sl, #:upper16:(&GOT[2])
mov  lr, sl
ldr  pc, [lr]
```
111+
112+
1. Push register `lr` onto the stack.
2. Set register `sl` to the address of `GOT[2]`.
3. Move the value of `sl` to `lr`.
4. Load the value located at `[lr]` into the program counter (`pc`).
116+
117+
118+
119+
The remaining entries correspond to all external functions, with each entry including the following instructions:
120+
121+
```
movw ip, #:lower16:(&GOT[x])
movt ip, #:upper16:(&GOT[x])
ldr  pc, [ip]
```
126+
127+
1. Set register `ip` to the address of `GOT[x]`.
2. Load the value stored in `GOT[x]` into register `pc`. That is, set `pc` to the address of the callee.
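
To make the `movw`/`movt` pair concrete, the sketch below splits a 32-bit GOT slot address into its lower and upper halves and encodes the three instructions of one PLT entry as A32 machine words. It is an illustration under stated assumptions rather than shecc's code generator: the function names are hypothetical, and the encodings assume the A32 instruction set with the "always" condition.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_IP = 12 }; /* ip is another name for r12 */

/* MOVW and MOVT share one layout: imm16 is split into imm4 (bits 19:16)
 * and imm12 (bits 11:0), and the destination register is in bits 15:12. */
static uint32_t encode_mov16(uint32_t base, unsigned rd, uint16_t imm16)
{
    assert(rd < 16);
    return base | ((uint32_t) (imm16 >> 12) << 16) | ((uint32_t) rd << 12) |
           (imm16 & 0xfffu);
}

/* movw rd, #imm16: write the lower half and clear the upper half. */
uint32_t encode_movw(unsigned rd, uint16_t imm16)
{
    return encode_mov16(0xE3000000u, rd, imm16);
}

/* movt rd, #imm16: write the upper half and keep the lower half. */
uint32_t encode_movt(unsigned rd, uint16_t imm16)
{
    return encode_mov16(0xE3400000u, rd, imm16);
}

/* ldr pc, [rn]: load the word at address rn into pc. */
uint32_t encode_ldr_pc(unsigned rn)
{
    assert(rn < 16);
    return 0xE590F000u | ((uint32_t) rn << 16);
}

/* Emit the three words of one PLT entry for the given GOT slot address. */
void encode_plt_entry(uint32_t got_slot_addr, uint32_t out[3])
{
    out[0] = encode_movw(REG_IP, (uint16_t) (got_slot_addr & 0xffffu));
    out[1] = encode_movt(REG_IP, (uint16_t) (got_slot_addr >> 16));
    out[2] = encode_ldr_pc(REG_IP);
}
```

The `#:lower16:` and `#:upper16:` operators in the listing correspond to the `& 0xffff` and `>> 16` operations here.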
129+

docs/static-linking.md

Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,66 @@
1+
# Static Linking
2+
3+
## Build statically linked shecc and programs
4+
5+
Build the statically linked version of shecc:
6+
7+
```shell
$ make ARCH=<target arch>
```
10+
11+
Next, you can use shecc to generate statically linked programs. The following example uses a shecc build that targets the Arm architecture:
12+
13+
```shell
# Use the stage 0 compiler
$ out/shecc -o <output> <input.c>
# Use the stage 1 or stage 2 compiler
$ qemu-arm out/shecc-stage2.elf -o <output> <input.c>

# Execute the compiled program
$ qemu-arm <output>
```
22+
23+
## Stack frame layout
24+
25+
In static linking mode, the stack frame layout for each function can be illustrated as follows:
26+
27+
```
High Address
+------------------+ <- sp + total_size
| saved lr         |
+------------------+ <- sp + total_size - 4
| local variables  |
+------------------+ <- sp + 20
| (unused space)   |
+------------------+ <- sp (may be aligned to 8 bytes)
Low Address
```
38+
39+
* `total_size`: the total size of all local variables, plus space for preserving register `lr` and an unused area.
  * For Arm32, the code generator aligns the total size to 8 bytes.
  * The unused area is 20 bytes; it is only used in dynamic linking mode.
42+
43+
When a function completes execution, it restores the caller's stack pointer by adding `total_size` to `sp` (the stack grows toward lower addresses), retrieves the return address from `[sp - 4]`, and transfers control back to the caller.
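
The following toy model walks through that sequence on a simulated stack. It is a sketch under assumptions, not the code shecc emits: `toy_cpu`, `toy_prologue`, and `toy_epilogue` are hypothetical names, the stack is a small byte array, and the stack is taken to grow toward lower addresses as in the diagram, so releasing the frame moves `sp` up by `total_size`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 256

/* Toy machine state: a byte-addressed stack plus the sp and lr registers. */
typedef struct {
    uint8_t stack[STACK_BYTES];
    uint32_t sp; /* byte offset into stack[]; the stack grows toward 0 */
    uint32_t lr;
} toy_cpu;

static void store32(toy_cpu *cpu, uint32_t addr, uint32_t val)
{
    assert(addr + 4 <= STACK_BYTES);
    memcpy(&cpu->stack[addr], &val, 4);
}

static uint32_t load32(const toy_cpu *cpu, uint32_t addr)
{
    uint32_t val;
    assert(addr + 4 <= STACK_BYTES);
    memcpy(&val, &cpu->stack[addr], 4);
    return val;
}

/* Prologue: reserve the frame and save lr in its topmost word,
 * which is [sp + total_size - 4] in the diagram. */
void toy_prologue(toy_cpu *cpu, uint32_t total_size)
{
    /* 20 unused bytes + 4 bytes for lr is the smallest possible frame. */
    assert(total_size % 8 == 0 && total_size >= 24);
    assert(cpu->sp >= total_size);
    cpu->sp -= total_size;
    store32(cpu, cpu->sp + total_size - 4, cpu->lr);
}

/* Epilogue: release the frame, then reload lr from [sp - 4]. */
void toy_epilogue(toy_cpu *cpu, uint32_t total_size)
{
    cpu->sp += total_size;
    cpu->lr = load32(cpu, cpu->sp - 4);
}
```

After `toy_epilogue`, `sp` has the value it had on entry and `lr` again holds the return address, even if the function body overwrote `lr` in between.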
44+
45+
## About function arguments handling
46+
47+
In the current implementation, shecc can handle at most 8 arguments per function.
48+
49+
### Arm (32-bit)
50+
51+
Under the Procedure Call Standard for the Arm Architecture (AAPCS), if a function takes more than 4 arguments, only the first four are passed in `r0` - `r3`, and the remaining arguments must be pushed onto the stack. Additionally, the stack must be properly aligned.
52+
53+
However, shecc puts all arguments into registers `r0` - `r7` even if the number of arguments exceeds 4. Since all functions are compiled by shecc in static linking mode, execution still succeeds because callees retrieve their arguments from `r0` - `r7`, even though this does not comply with the AAPCS.
54+
55+
### RISC-V (32-bit)
56+
57+
In the RISC-V architecture, up to 8 arguments can be passed in registers, so shecc also puts all arguments directly into `a0` - `a7`. Therefore, the compiled programs are fully compliant with the RISC-V calling convention as long as the number of arguments does not exceed 8.
58+
59+
If shecc needs to support more than 8 arguments in the future, it should be extended to generate instructions that push the extra arguments onto the stack properly.
60+
61+
## Runtime execution flow
62+
63+
1. The program starts at the ELF entry point.
2. The *main wrapper* executes. It sets up a global stack for all global variables (excluding read-only variables) and initializes them.
3. When the *main wrapper* finishes, it retrieves `argc` and `argv` from the stack, places them in the appropriate registers, and calls the `main` function to continue execution.
4. After the `main` function returns, the program terminates through the `_exit` system call.
