Commit f82cae5

Update 2025-11-30-comptime-c-functions.md
1 parent 58866ec commit f82cae5

1 file changed: 8 additions, 12 deletions

_posts/2025-11-30-comptime-c-functions.md

@@ -20,17 +20,7 @@ macro_version:
 
 The best use-case I can think of for this technique is generating lookup tables at compile-time, as math functions like `sin()` *also* get optimized away.
 
-# Optimization tricks
-
-- `static inline` allows inlining across compilation boundaries.
-- `__attribute__((always_inline))` *strongly* urges compilers to inline functions.
-- `__builtin_unreachable()` is used to teach the optimizer which assumptions it can make about input arguments.
-- Passing `-O3` to the compiler tells it to optimize the code very hard.
-- Passing `-march=native` to the compiler tells it to make optimizations based on your specific CPU.
-- Constant buffer addresses + sizes let the optimizer trace through `memcpy()` calls.
-- All operations become statically analyzable, reducing to constants.
-- `assert()` calls get eliminated when conditions are provably true.
-- [Link-time optimization](https://en.wikipedia.org/wiki/Interprocedural_optimization) with `-flto` should allow Clang and GCC to perform these optimizations even when the code is split across several object files.
+[Link-time optimization](https://en.wikipedia.org/wiki/Interprocedural_optimization) with `-flto` should allow Clang and GCC to perform these optimizations even when the code is split across several object files.
 
 # Generic stack
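
The context paragraph in the hunk above names compile-time lookup tables as the best use-case for the technique. A minimal sketch of that idea follows, under stated assumptions: `TABLE_SIZE`, `fill_squares()` and `lookup_square()` are hypothetical names that do not appear in the post, GCC or Clang is assumed for the attribute syntax, and integer squares stand in for `sin()` so that the sketch builds without linking libm. Whether the table is actually folded away depends on the compiler and flags; the C standard does not guarantee it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8

// Hypothetical helper: fills `table` with the squares 0, 1, 4, 9, ...
// Because it is forced inline and called with a constant size, an
// optimizing compiler is free to unroll the loop into constant stores.
__attribute__((always_inline))
static inline void fill_squares(uint32_t *table, size_t n) {
    for (size_t i = 0; i < n; i++) {
        table[i] = (uint32_t)(i * i);
    }
}

// Builds the table in a fixed-size local buffer and reads one entry.
// When `i` is a constant at the call site, the optimizer can in
// principle reduce the whole call to a constant.
uint32_t lookup_square(size_t i) {
    assert(i < TABLE_SIZE); // bounds check; compiled out when NDEBUG is defined

    uint32_t table[TABLE_SIZE];
    fill_squares(table, TABLE_SIZE);
    return table[i];
}
```

Comparing the generated assembly for a call such as `lookup_square(5)` at `-O0` and at `-O3`, for instance on Compiler Explorer, is one way to check whether the loop and the table survive optimization.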

@@ -40,6 +30,10 @@ I added a `main()` function to the program to prove that it doesn't crash on any
 
 It's important to note that `fn_version()` and `macro_version()` get optimized just as hard, even when you remove the `main()`.
 
+Here are the used optimization flags:
+- Passing `-O3` to the compiler tells it to optimize the code very hard.
+- Passing `-march=native` to the compiler tells it to make optimizations based on your specific CPU.
+
 Copy of the code on [Compiler Explorer](https://godbolt.org/z/fdf5acdcn):
 
 ```c
@@ -65,7 +59,7 @@ typedef struct stack {
     size_t element_size;
 } stack;
 
-__attribute__((always_inline))
+__attribute__((always_inline)) // Strongly urges compilers to inline functions
 static inline void stack_init(stack *s, void *buffer, size_t element_size, size_t capacity) {
     s->data = buffer;
     s->size = 0;
@@ -108,6 +102,7 @@ typedef struct {
 } Pair;
 
 void fn_version(size_t n) {
+    // __builtin_unreachable() tells the optimizer which assumptions it can safely make
     // assert() isn't aggressive enough
     if (n < 2) __builtin_unreachable();
 
@@ -119,6 +114,7 @@ void fn_version(size_t n) {
     Pair p1 = {.a = 10, .b = 20};
     Pair p2 = {.a = 111, .b = sin(222.0)}; // sin() is optimized away!
 
+    // assert()s get optimized away when they are provably true
     assert(stack_push(&s, &p1) == SUCCESS);
     assert(stack_push(&s, &p2) == SUCCESS);
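
The comments added in the last two hunks describe how `__builtin_unreachable()` and `assert()` interact. A minimal sketch of that interaction follows, assuming GCC or Clang, since the builtin is a compiler extension rather than standard C. `sum_of_last_two()` is a hypothetical function that is not part of the post.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical example; requires GCC or Clang for __builtin_unreachable().
size_t sum_of_last_two(size_t n) {
    // Promise the optimizer that n >= 2. Calling this function with
    // n < 2 is undefined behavior, so the promise must really hold.
    if (n < 2) __builtin_unreachable();

    // Given the promise above, this condition is provably true, so an
    // optimizing compiler is free to drop the check entirely.
    assert(n >= 2);

    // No unsigned wraparound here, because n >= 2.
    return (n - 1) + (n - 2);
}
```

In an unoptimized build the `assert()` is typically still evaluated and simply passes for valid inputs; the elimination described in the diff is something to look for in optimized builds only.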
