Commit 2cbb989

Optimize garbage collection with a generational GC
Running jit tests with AtomVM is now 20% faster. Implement BEAM's `fullsweep_after` `spawn_opt/1` option and `process_flag/2` flag. Also fix `process_flag/2` spec.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
1 parent e9f3fe2 commit 2cbb989

22 files changed, +1586 -795 lines changed

doc/src/memory-management.md

Lines changed: 37 additions & 0 deletions
@@ -922,3 +922,40 @@ match binaries, as with the case of refc binaries on the process heap.
 #### Deletion
 
 Once all terms have been copied from the old heap to the new heap, and once the MSO list has been swept for unreachable references, the old heap is simply discarded via the `free` function.
+
+### Generational Garbage Collection
+
+The garbage collection described above is a *full sweep*: every live term is copied from the old heap to the new heap and the entire old heap is freed. While correct, this can be expensive for processes with large heaps, because long-lived data that has already survived previous collections must be copied again each time.
+
+AtomVM implements *generational* (or *minor*) garbage collection to reduce this cost, using the same approach as BEAM. The key observation is that most terms die young: they are allocated, used briefly, and become garbage. Terms that have survived at least one collection are likely to survive many more. Generational GC exploits this by dividing the heap into two generations:
+
+* **Young generation**: recently allocated terms, between the *high water mark* and the current heap pointer.
+* **Old (mature) generation**: terms that have survived at least one minor collection, stored in a separate old heap.
+
+#### High Water Mark
+
+After each garbage collection, the heap pointer position is recorded as the *high water mark*. On the next collection, terms allocated below the high water mark (i.e., terms that existed at the time of the previous collection) are considered mature. Terms allocated above the high water mark are young.
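As a rough illustration of the classification described in the High Water Mark paragraph, the following C sketch tests whether a heap address falls below the recorded mark. The `ToyHeap` struct and the `toy_*` functions are hypothetical names invented for this example; they are not AtomVM's actual data structures, and treating an unset mark as "nothing is mature" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t term;

// Hypothetical, simplified view of a process heap (not AtomVM's real struct).
struct ToyHeap
{
    term *heap_start;      // first word of the heap
    term *high_water_mark; // heap pointer recorded after the previous GC, or NULL
    term *heap_ptr;        // next free word
};

// A term is considered mature when it was allocated below the high water mark.
bool toy_is_mature(const struct ToyHeap *heap, const term *ptr)
{
    assert(ptr >= heap->heap_start && ptr < heap->heap_ptr);
    if (heap->high_water_mark == NULL) {
        // Assumption: without a recorded mark, no term counts as mature.
        return false;
    }
    return ptr < heap->high_water_mark;
}

// Record the current heap pointer as the mark once a collection has finished.
void toy_record_high_water_mark(struct ToyHeap *heap)
{
    heap->high_water_mark = heap->heap_ptr;
}
```

Terms allocated after `toy_record_high_water_mark` runs land above the mark and are therefore classified as young on the next collection.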
+
+#### Minor Collection
+
+During a minor collection:
+
+1. A new young heap is allocated.
+2. Mature terms (below the high water mark) are *promoted*: copied to the old heap rather than the new young heap.
+3. Young terms that are still reachable are copied to the new young heap.
+4. Both the new young heap and the newly promoted old region are scanned for references, since promoted terms may reference young terms and vice versa.
+5. Only the young MSO list is swept; the old MSO list is preserved.
+6. The previous heap is freed, but the old heap persists across minor collections.
+
+Because the old heap is not scanned for garbage during a minor collection, the cost is proportional to the size of the young generation rather than the entire heap.
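The promotion split in steps 2 and 3 can be sketched as a toy model in C. This is a deliberately reduced illustration: cells carry a precomputed `reachable` flag instead of being traced from roots, and reference scanning (step 4) and MSO sweeping (step 5) are left out entirely. All names (`ToyCell`, `ToyMinorResult`, `toy_minor_collect`) are hypothetical and do not correspond to AtomVM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical toy cell: a payload plus a flag standing in for reachability.
struct ToyCell
{
    int value;
    bool reachable;
};

struct ToyMinorResult
{
    size_t promoted;   // cells appended to the old heap
    size_t kept_young; // cells copied to the new young heap
};

// Copy reachable cells out of `heap`: cells below the high water mark are
// promoted to `old_heap`, the others go to `new_young`. Unreachable cells
// are simply not copied.
struct ToyMinorResult toy_minor_collect(const struct ToyCell *heap, size_t used,
    size_t high_water_mark, struct ToyCell *old_heap, size_t *old_used,
    size_t old_capacity, struct ToyCell *new_young)
{
    struct ToyMinorResult result = { 0, 0 };
    assert(high_water_mark <= used);
    for (size_t i = 0; i < used; i++) {
        if (!heap[i].reachable) {
            continue;
        }
        if (i < high_water_mark) {
            // A real collector would fall back to a full sweep here
            // instead of asserting.
            assert(*old_used < old_capacity);
            old_heap[(*old_used)++] = heap[i];
            result.promoted++;
        } else {
            new_young[result.kept_young++] = heap[i];
        }
    }
    return result;
}
```

Cells already stored in `old_heap` before the call are never visited, which is the property that keeps the cost tied to the young generation.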
+
+#### When Full vs. Minor Collection Occurs
+
+AtomVM keeps a counter (`gc_count`) of how many minor collections have occurred since the last full sweep. A full sweep is forced when:
+
+* The process has never been garbage collected (no high water mark exists).
+* `gc_count` reaches the `fullsweep_after` threshold.
+* The old heap does not have enough space to accommodate promoted terms.
+* A `MEMORY_FORCE_SHRINK` request is made (e.g., via `erlang:garbage_collect/0`).
+
+The `fullsweep_after` value can be set per-process via [`spawn_opt`](./programmers-guide.md#spawning-processes) or [`erlang:process_flag/2`](./apidocs/erlang/estdlib/erlang.md#process_flag2). The default value is 65535, meaning full sweeps are infrequent under normal operation. Setting it to `0` disables generational collection entirely, forcing a full sweep on every garbage collection event.
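The four conditions listed above can be expressed as a single predicate. The C sketch below is an assumption-laden illustration, not the code from this commit: `ToyGcState`, `toy_needs_full_sweep` and `toy_after_gc` are hypothetical names, "reaches the threshold" is read as `gc_count >= fullsweep_after`, and the counter is assumed to reset to zero after a full sweep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical per-process GC bookkeeping (names invented for this sketch).
struct ToyGcState
{
    bool has_high_water_mark;     // false until the first collection
    unsigned int gc_count;        // minor collections since the last full sweep
    unsigned int fullsweep_after; // threshold; 0 disables minor collections
    size_t old_heap_free;         // free words left in the old heap
};

// Returns true when any of the four documented conditions holds.
bool toy_needs_full_sweep(const struct ToyGcState *s, size_t words_to_promote, bool force_shrink)
{
    assert(s != NULL);
    if (!s->has_high_water_mark) {
        return true; // never collected before
    }
    if (s->gc_count >= s->fullsweep_after) {
        return true; // threshold reached (always true when it is 0)
    }
    if (words_to_promote > s->old_heap_free) {
        return true; // old heap cannot hold the promoted terms
    }
    return force_shrink; // explicit shrink request
}

// Assumed bookkeeping after a collection completes.
void toy_after_gc(struct ToyGcState *s, bool was_full_sweep)
{
    s->gc_count = was_full_sweep ? 0 : s->gc_count + 1;
    s->has_high_water_mark = true;
}
```

With `fullsweep_after` set to `0`, the second check succeeds on every call, which matches the documented behavior of disabling generational collection.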

doc/src/programmers-guide.md

Lines changed: 1 addition & 0 deletions
@@ -349,6 +349,7 @@ The [options](./apidocs/erlang/estdlib/erlang.md#spawn_option) argument is a pro
 |-----|------------|---------------|-------------|
 | `min_heap_size` | `non_neg_integer()` | none | Minimum heap size of the process. The heap will shrink no smaller than this size. |
 | `max_heap_size` | `non_neg_integer()` | unbounded | Maximum heap size of the process. The heap will grow no larger than this size. |
+| `fullsweep_after` | `non_neg_integer()` | 65535 | Maximum number of [minor garbage collections](./memory-management.md#generational-garbage-collection) before a full sweep is forced. Set to `0` to disable generational garbage collection. |
 | `link` | `boolean()` | `false` | Whether to link the spawned process to the spawning process. |
 | `monitor` | `boolean()` | `false` | Whether the spawning process should monitor the spawned process. |
 | `atomvm_heap_growth` | `bounded_free \| minimum \| fibonacci` | `bounded_free` | [Strategy](./memory-management.md#heap-growth-strategies) to grow the heap of the process. |

libs/estdlib/src/erlang.erl

Lines changed: 4 additions & 1 deletion
@@ -175,6 +175,7 @@
 -type spawn_option() ::
     {min_heap_size, pos_integer()}
     | {max_heap_size, pos_integer()}
+    | {fullsweep_after, non_neg_integer()}
     | {atomvm_heap_growth, atomvm_heap_growth_strategy()}
     | link
     | monitor.
@@ -1293,7 +1294,9 @@ group_leader(_Leader, _Pid) ->
 %%
 %% @end
 %%-----------------------------------------------------------------------------
--spec process_flag(Flag :: trap_exit, Value :: boolean()) -> pid().
+-spec process_flag
+    (trap_exit, boolean()) -> boolean();
+    (fullsweep_after, non_neg_integer()) -> non_neg_integer().
 process_flag(_Flag, _Value) ->
     erlang:nif_error(undefined).
 
libs/jit/src/jit_aarch64.erl

Lines changed: 8 additions & 8 deletions
@@ -163,25 +163,25 @@
     | {maybe_free_aarch64_register(), '&', non_neg_integer(), '!=', integer()}
     | {{free, aarch64_register()}, '==', {free, aarch64_register()}}.
 
-% ctx->e is 0x28
-% ctx->x is 0x30
+% ctx->e is 0x50
+% ctx->x is 0x58
 -define(WORD_SIZE, 8).
 -define(CTX_REG, r0).
 -define(JITSTATE_REG, r1).
 -define(NATIVE_INTERFACE_REG, r2).
--define(Y_REGS, {?CTX_REG, 16#28}).
--define(X_REG(N), {?CTX_REG, 16#30 + (N * ?WORD_SIZE)}).
--define(CP, {?CTX_REG, 16#B8}).
--define(FP_REGS, {?CTX_REG, 16#C0}).
+-define(Y_REGS, {?CTX_REG, 16#50}).
+-define(X_REG(N), {?CTX_REG, 16#58 + (N * ?WORD_SIZE)}).
+-define(CP, {?CTX_REG, 16#E0}).
+-define(FP_REGS, {?CTX_REG, 16#E8}).
 -define(FP_REG_OFFSET(State, F),
     (F *
         case (State)#state.variant band ?JIT_VARIANT_FLOAT32 of
             0 -> 8;
             _ -> 4
         end)
 ).
--define(BS, {?CTX_REG, 16#C8}).
--define(BS_OFFSET, {?CTX_REG, 16#D0}).
+-define(BS, {?CTX_REG, 16#F0}).
+-define(BS_OFFSET, {?CTX_REG, 16#F8}).
 -define(JITSTATE_MODULE, {?JITSTATE_REG, 0}).
 -define(JITSTATE_CONTINUATION, {?JITSTATE_REG, 16#8}).
 -define(JITSTATE_REDUCTIONCOUNT, {?JITSTATE_REG, 16#10}).

libs/jit/src/jit_armv6m.erl

Lines changed: 7 additions & 7 deletions
@@ -165,15 +165,15 @@
     | {{free, armv6m_register()}, '==', {free, armv6m_register()}}.
 
 % ctx->e is 0x28
-% ctx->x is 0x30
+% ctx->x is 0x2C
 -define(CTX_REG, r0).
 -define(NATIVE_INTERFACE_REG, r2).
--define(Y_REGS, {?CTX_REG, 16#14}).
--define(X_REG(N), {?CTX_REG, 16#18 + (N * 4)}).
--define(CP, {?CTX_REG, 16#5C}).
--define(FP_REGS, {?CTX_REG, 16#60}).
--define(BS, {?CTX_REG, 16#64}).
--define(BS_OFFSET, {?CTX_REG, 16#68}).
+-define(Y_REGS, {?CTX_REG, 16#28}).
+-define(X_REG(N), {?CTX_REG, 16#2C + (N * 4)}).
+-define(CP, {?CTX_REG, 16#70}).
+-define(FP_REGS, {?CTX_REG, 16#74}).
+-define(BS, {?CTX_REG, 16#78}).
+-define(BS_OFFSET, {?CTX_REG, 16#7C}).
 % JITSTATE is on stack, accessed via stack offset
 % These macros now expect a register that contains the jit_state pointer
 -define(JITSTATE_MODULE(Reg), {Reg, 0}).

libs/jit/src/jit_riscv32.erl

Lines changed: 8 additions & 8 deletions
@@ -194,16 +194,16 @@
     | {{free, riscv32_register()}, '==', {free, riscv32_register()}}.
 
 % Context offsets (32-bit architecture)
-% ctx->e is 0x14
-% ctx->x is 0x18
+% ctx->e is 0x28
+% ctx->x is 0x2C
 -define(CTX_REG, a0).
 -define(NATIVE_INTERFACE_REG, a2).
--define(Y_REGS, {?CTX_REG, 16#14}).
--define(X_REG(N), {?CTX_REG, 16#18 + (N * 4)}).
--define(CP, {?CTX_REG, 16#5C}).
--define(FP_REGS, {?CTX_REG, 16#60}).
--define(BS, {?CTX_REG, 16#64}).
--define(BS_OFFSET, {?CTX_REG, 16#68}).
+-define(Y_REGS, {?CTX_REG, 16#28}).
+-define(X_REG(N), {?CTX_REG, 16#2C + (N * 4)}).
+-define(CP, {?CTX_REG, 16#70}).
+-define(FP_REGS, {?CTX_REG, 16#74}).
+-define(BS, {?CTX_REG, 16#78}).
+-define(BS_OFFSET, {?CTX_REG, 16#7C}).
 % JITSTATE is in a1 register (no prolog, following aarch64 model)
 -define(JITSTATE_REG, a1).
 % Return address register (like LR in AArch64)

libs/jit/src/jit_x86_64.erl

Lines changed: 12 additions & 12 deletions
@@ -149,28 +149,28 @@
 -define(WORD_SIZE, 8).
 
 % Following offsets are verified with static asserts in jit.c
-% ctx->e is 0x28
-% ctx->x is 0x30
-% ctx->cp is 0xB8
-% ctx->fr is 0xC0
-% ctx->bs is 0xC8
-% ctx->bs_offset is 0xD0
+% ctx->e is 0x50
+% ctx->x is 0x58
+% ctx->cp is 0xE0
+% ctx->fr is 0xE8
+% ctx->bs is 0xF0
+% ctx->bs_offset is 0xF8
 -define(CTX_REG, rdi).
 -define(JITSTATE_REG, rsi).
 -define(NATIVE_INTERFACE_REG, rdx).
--define(Y_REGS, {16#28, ?CTX_REG}).
--define(X_REG(N), {16#30 + (N * ?WORD_SIZE), ?CTX_REG}).
--define(CP, {16#B8, ?CTX_REG}).
--define(FP_REGS, {16#C0, ?CTX_REG}).
+-define(Y_REGS, {16#50, ?CTX_REG}).
+-define(X_REG(N), {16#58 + (N * ?WORD_SIZE), ?CTX_REG}).
+-define(CP, {16#E0, ?CTX_REG}).
+-define(FP_REGS, {16#E8, ?CTX_REG}).
 -define(FP_REG_OFFSET(State, F),
     (F *
         case (State)#state.variant band ?JIT_VARIANT_FLOAT32 of
             0 -> 8;
             _ -> 4
         end)
 ).
--define(BS, {16#C8, ?CTX_REG}).
--define(BS_OFFSET, {16#D0, ?CTX_REG}).
+-define(BS, {16#F0, ?CTX_REG}).
+-define(BS_OFFSET, {16#F8, ?CTX_REG}).
 -define(JITSTATE_MODULE, {0, ?JITSTATE_REG}).
 -define(JITSTATE_CONTINUATION, {16#8, ?JITSTATE_REG}).
 -define(JITSTATE_REMAINING_REDUCTIONS, {16#10, ?JITSTATE_REG}).
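The new offsets in the JIT backends can be cross-checked with simple arithmetic. The C sketch below copies the updated constants from the macros in this commit and mirrors the `X_REG(N)` computation. The helper `toy_x_reg_offset` and the `CTX_*` macro names are hypothetical, and the "17 words between `x` and `cp`" relationship is an observation derived from the numbers, not something the commit states. The static asserts here only check constants against each other; the real checks in `jit.c` are not shown in this diff.

```c
#include <assert.h>
#include <stddef.h>

// Offsets copied from the updated macros in this commit.
#define CTX_E_OFFSET_64 0x50
#define CTX_X_OFFSET_64 0x58
#define CTX_CP_OFFSET_64 0xE0
#define CTX_X_OFFSET_32 0x2C
#define CTX_CP_OFFSET_32 0x70

// Byte offset of x register n, mirroring the X_REG(N) macros.
size_t toy_x_reg_offset(size_t x_base, size_t word_size, size_t n)
{
    assert(word_size == 4 || word_size == 8);
    return x_base + n * word_size;
}

// Observation: in both layouts cp sits 17 words after x[0], and on 64-bit
// targets x starts one word after e.
_Static_assert(CTX_X_OFFSET_64 + 17 * 8 == CTX_CP_OFFSET_64, "64-bit: cp is 17 words after x");
_Static_assert(CTX_X_OFFSET_32 + 17 * 4 == CTX_CP_OFFSET_32, "32-bit: cp is 17 words after x");
_Static_assert(CTX_E_OFFSET_64 + 8 == CTX_X_OFFSET_64, "64-bit: x follows e");
```

Every offset moved by the same amount relative to the old values (0x28 bytes on 64-bit targets, 0x14 bytes on 32-bit targets), which is five machine words in both cases.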

src/libAtomVM/context.c

Lines changed: 10 additions & 0 deletions
@@ -81,6 +81,8 @@ Context *context_new(GlobalContext *glb)
     ctx->min_heap_size = 0;
     ctx->max_heap_size = 0;
     ctx->heap_growth_strategy = BoundedFreeHeapGrowth;
+    ctx->fullsweep_after = 65535;
+    ctx->gc_count = 0;
     ctx->has_min_heap_size = 0;
     ctx->has_max_heap_size = 0;
 
@@ -1209,6 +1211,14 @@ COLD_FUNC void context_dump(Context *ctx)
         ct++;
     }
 
+    fprintf(stderr, "\n\nHeap\n----\n");
+    fprintf(stderr, "young heap: %zu words\n", (size_t) (ctx->heap.heap_end - ctx->heap.heap_start));
+    if (ctx->heap.old_heap_start) {
+        fprintf(stderr, "old heap: %zu words (used: %zu)\n",
+            (size_t) (ctx->heap.old_heap_end - ctx->heap.old_heap_start),
+            (size_t) (ctx->heap.old_heap_ptr - ctx->heap.old_heap_start));
+    }
+
     fprintf(stderr, "\n\nMailbox\n-------\n");
     mailbox_crashdump(ctx);
 
src/libAtomVM/context.h

Lines changed: 2 additions & 0 deletions
@@ -113,6 +113,8 @@ struct Context
     size_t min_heap_size;
     size_t max_heap_size;
     enum HeapGrowthStrategy heap_growth_strategy;
+    unsigned int fullsweep_after;
+    unsigned int gc_count;
 
     // saved state when scheduled out
     Module *saved_module;

src/libAtomVM/defaultatoms.def

Lines changed: 1 addition & 0 deletions
@@ -206,6 +206,7 @@ X(EMU_ATOM, "\x3", "emu")
 X(JIT_ATOM, "\x3", "jit")
 X(EMU_FLAVOR_ATOM, "\xA", "emu_flavor")
 X(CODE_SERVER_ATOM, "\xB", "code_server")
+X(FULLSWEEP_AFTER_ATOM, "\xF", "fullsweep_after")
 X(LOAD_ATOM, "\x4", "load")
 X(JIT_X86_64_ATOM, "\xA", "jit_x86_64")
 X(JIT_AARCH64_ATOM, "\xB", "jit_aarch64")
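In every entry visible in this hunk, the second `X(...)` argument appears to be a one-byte length prefix for the atom name that follows (`"\xF"` is 15, the length of `fullsweep_after`). The small C sketch below checks that relationship. The helper name `toy_atom_prefix_matches` is hypothetical, and the interpretation of the second argument as a length byte is inferred from the entries shown here rather than stated by the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// True when the first byte of `prefix` equals the length of `name`.
bool toy_atom_prefix_matches(const char *prefix, const char *name)
{
    assert(prefix != NULL && name != NULL);
    return (size_t) (unsigned char) prefix[0] == strlen(name);
}
```

A check of this kind would catch a new entry whose prefix was copied from a neighboring line without being updated.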
