FUNCTION FLUSH re-create lua VM, fix flush not gc, fix flush async + load crash (#1826)
The following test exposes two issues:
```
test {FUNCTION - test function flush} {
    for {set i 0} {$i < 10000} {incr i} {
        r function load [get_function_code LUA test_$i test_$i {return 'hello'}]
    }
    set before_flush_memory [s used_memory_vm_functions]
    r function flush sync
    set after_flush_memory [s used_memory_vm_functions]
    puts "flush sync, before_flush_memory: $before_flush_memory, after_flush_memory: $after_flush_memory"
    for {set i 0} {$i < 10000} {incr i} {
        r function load [get_function_code LUA test_$i test_$i {return 'hello'}]
    }
    set before_flush_memory [s used_memory_vm_functions]
    r function flush async
    set after_flush_memory [s used_memory_vm_functions]
    puts "flush async, before_flush_memory: $before_flush_memory, after_flush_memory: $after_flush_memory"
    for {set i 0} {$i < 10000} {incr i} {
        r function load [get_function_code LUA test_$i test_$i {return 'hello'}]
    }
    puts "Test done"
}
```
The first issue shows up in the test output: after executing
FUNCTION FLUSH, used_memory_vm_functions has not changed at all:
```
flush sync, before_flush_memory: 2962432, after_flush_memory: 2962432
flush async, before_flush_memory: 4504576, after_flush_memory: 4504576
```
The second issue is a crash when loading functions during the
async flush:
```
=== VALKEY BUG REPORT START: Cut & paste starting from here ===
# valkey 255.255.255 crashed by signal: 11, si_code: 2
# Accessing address: 0xe0429b7100000a3c
# Crashed running the instruction at: 0x102e0b09c
------ STACK TRACE ------
EIP:
0 valkey-server 0x0000000102e0b09c luaH_getstr + 52
Backtrace:
0 libsystem_platform.dylib 0x000000018b066584 _sigtramp + 56
1 valkey-server 0x0000000102e01054 luaD_precall + 96
2 valkey-server 0x0000000102e01b10 luaD_call + 104
3 valkey-server 0x0000000102e00d1c luaD_rawrunprotected + 76
4 valkey-server 0x0000000102e01e3c luaD_pcall + 60
5 valkey-server 0x0000000102dfc630 lua_pcall + 300
6 valkey-server 0x0000000102f77770 luaEngineCompileCode + 708
7 valkey-server 0x0000000102f71f50 scriptingEngineCallCompileCode + 104
8 valkey-server 0x0000000102f700b0 functionsCreateWithLibraryCtx + 2088
9 valkey-server 0x0000000102f70898 functionLoadCommand + 312
10 valkey-server 0x0000000102e3978c call + 416
11 valkey-server 0x0000000102e3b5b8 processCommand + 3340
12 valkey-server 0x0000000102e563cc processInputBuffer + 520
13 valkey-server 0x0000000102e55808 readQueryFromClient + 92
14 valkey-server 0x0000000102f696e0 connSocketEventHandler + 180
15 valkey-server 0x0000000102e20480 aeProcessEvents + 372
16 valkey-server 0x0000000102e4aad0 main + 26412
17 dyld 0x000000018acab154 start + 2476
------ STACK TRACE DONE ------
```
The reason is that, in the old implementation (introduced in 7.0),
FUNCTION FLUSH uses lua_unref to remove the scripts from the Lua VM.
lua_unref does not trigger the GC, so memory cannot be reclaimed
effectively after FUNCTION FLUSH.
The other issue is that, since we don't re-create the Lua VM in
FUNCTION FLUSH, loading functions during a FUNCTION FLUSH ASYNC
results in a crash, because the Lua engine state is not thread-safe.
The correct solution is to re-create the Lua VM, just like
SCRIPT FLUSH does.
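The difference between the two flush strategies can be illustrated with a
toy model. This is only a sketch under stated assumptions: toy_vm, toy_fn
and every toy_* function are hypothetical names invented for illustration,
and none of this is the Lua C API or the actual engine code. It only
models "dropping references leaves memory in place" versus "swapping in a
fresh VM and releasing the old one".

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a scripting VM. Hypothetical names throughout;
 * this is NOT the Lua C API or the server's engine code. */
typedef struct toy_fn {
    char *code;      /* heap copy of the function body */
    int referenced;  /* 1 while the registry still points at it */
} toy_fn;

typedef struct toy_vm {
    toy_fn *fns;
    size_t count, cap;
    size_t used_bytes; /* bytes the VM still holds for function bodies */
} toy_vm;

static toy_vm *toy_vm_create(void) {
    toy_vm *vm = calloc(1, sizeof(*vm));
    if (vm == NULL) abort();
    return vm;
}

/* Store a copy of `code` in the VM and account for its size. */
static void toy_vm_load(toy_vm *vm, const char *code) {
    if (vm->count == vm->cap) {
        size_t cap = vm->cap ? vm->cap * 2 : 16;
        toy_fn *fns = realloc(vm->fns, cap * sizeof(*fns));
        if (fns == NULL) abort();
        vm->fns = fns;
        vm->cap = cap;
    }
    size_t len = strlen(code) + 1;
    char *copy = malloc(len);
    if (copy == NULL) abort();
    memcpy(copy, code, len);
    vm->fns[vm->count].code = copy;
    vm->fns[vm->count].referenced = 1;
    vm->count++;
    vm->used_bytes += len;
}

/* Old-style flush: drop the references only. Nothing is freed here,
 * so used_bytes stays the same until some later collection. */
static void toy_flush_by_unref(toy_vm *vm) {
    for (size_t i = 0; i < vm->count; i++) vm->fns[i].referenced = 0;
}

/* Release every function body and the VM itself. */
static void toy_vm_destroy(toy_vm *vm) {
    for (size_t i = 0; i < vm->count; i++) free(vm->fns[i].code);
    free(vm->fns);
    free(vm);
}

/* New-style flush: swap in a fresh VM first, then release the old one.
 * After the swap the caller only sees the new VM, so in this model the
 * old one could be freed elsewhere without being shared with new loads. */
static void toy_flush_by_recreate(toy_vm **vmp) {
    toy_vm *old = *vmp;
    *vmp = toy_vm_create();
    toy_vm_destroy(old);
}
```

In this model, calling toy_flush_by_unref after a batch of toy_vm_load
calls leaves vm->used_bytes unchanged, while toy_flush_by_recreate(&vm)
brings it back to 0 and leaves a VM that new loads can use immediately.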
---------
Signed-off-by: Binbin <[email protected]>
Signed-off-by: Ricardo Dias <[email protected]>
Co-authored-by: Ricardo Dias <[email protected]>
(cherry picked from commit b4c93cc)
Signed-off-by: cherukum-amazon <[email protected]>