Description
I discovered this issue while developing a LAN driver using websockets; I assume it affects any Edge driver using lustre WebSockets where the connected device may reboot abruptly.
The lustre WebSocket library has a bug that causes a FATAL crash when the underlying socket connection is abruptly terminated (e.g., when a connected device reboots). The crash occurs in lustre's internal receive loop: _receive_loop passes a nil recv value to _handle_recvs, which then tries to index it.
Steps to Reproduce
- Create a SmartThings Edge driver that uses lustre WebSocket for device communication
- Establish a WebSocket connection to a device using the standard pattern:
local lustre = require("lustre")
local ws = lustre.WebSocket.client(sock, path, config)
ws:connect(host, port)
- While the WebSocket is connected and receiving messages, abruptly reboot the device
- The driver crashes with a FATAL error
Expected Behavior
The lustre library should handle connection losses gracefully, returning an error that the driver can catch and handle (e.g., trigger reconnection logic).
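As a rough illustration of the reconnection logic a driver could run if the failure arrived as an error value instead of a crash, here is a minimal sketch. The names `reconnect_with_backoff`, `connect_fn`, and `sleep_fn` are hypothetical stand-ins, not part of lustre or cosock; the 30-second cap is an arbitrary choice.

```lua
-- Hypothetical helper: retry a connection attempt with capped exponential backoff.
-- connect_fn and sleep_fn are injected stand-ins, not lustre or cosock APIs.
local function reconnect_with_backoff(connect_fn, sleep_fn, max_attempts)
  local delay = 1 -- seconds before the next attempt
  local last_err
  for attempt = 1, max_attempts do
    local ok, err = connect_fn()
    if ok then
      return true, attempt
    end
    last_err = err
    if attempt < max_attempts then
      sleep_fn(delay)
      delay = math.min(delay * 2, 30) -- cap the wait at 30 seconds
    end
  end
  return nil, "gave up after " .. max_attempts .. " attempts: " .. tostring(last_err)
end
```

In a driver, `connect_fn` would wrap whatever creates and connects the WebSocket, and `sleep_fn` would be the runtime's sleep function.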
Actual Behavior
The driver crashes with the following FATAL error:
FATAL Feller Wiser Gateway Driver runtime error: [string "cosock.lua"]:250: [string "lustre/ws.lua"]:299: attempt to index a nil value (local 'recv')
stack traceback:
[string "lustre/ws.lua"]:299: in method '_handle_recvs'
[string "lustre/ws.lua"]:286: in method '_receive_loop'
[string "lustre/ws.lua"]:178: in function <[string "lustre/ws.lua"]:177>
Probable Root Cause
Looking at the lustre source code, the issue is in the _receive_loop function:
local recv, _, err = socket.select(rs, nil, self.config._keep_alive)
if not recv then
  if self:_handle_select_err(loop_state, err) then
    return
  end
end
-- recv can still be nil here; the index error is raised inside _handle_recvs (ws.lua:299)
if self:_handle_recvs(loop_state, recv, 1) then
  break
end
The bug occurs when:
- socket.select() returns recv = nil due to a socket error
- _handle_select_err() is called but returns nil for non-timeout errors
- Execution continues to _handle_recvs() which tries to access recv[1] when recv is nil
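The sequence above can be reproduced in plain Lua without a hub. This is only a sketch of the control flow: `fake_select`, `handle_select_err`, `handle_recvs`, and `receive_step` are hypothetical stand-ins written for this illustration, not lustre's actual implementation.

```lua
-- Stand-in for socket.select() failing: no ready sockets, plus an error string.
local function fake_select()
  return nil, nil, "closed"
end

-- Stand-in for _handle_select_err(), modeled only for the non-timeout case
-- described above, where it returns nil and the caller does not return early.
local function handle_select_err(err)
  return nil
end

-- Stand-in for _handle_recvs(): indexes recv without checking it first.
local function handle_recvs(recv)
  return recv[1] == nil
end

-- One iteration of the receive loop, with the same shape as the snippet above.
local function receive_step()
  local recv, _, err = fake_select()
  if not recv then
    if handle_select_err(err) then
      return "handled"
    end
  end
  -- recv is still nil here, so indexing it inside handle_recvs raises an error
  return handle_recvs(recv)
end

local ok, msg = pcall(receive_step)
print(ok, msg)
```

Here `pcall` reports the failure as an "attempt to index" error on a nil value, the same class of error as in the traceback.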
Proposed Fix
Add proper control flow to prevent calling _handle_recvs when recv is nil:
if not recv then
  if self:_handle_select_err(loop_state, err) then
    return
  end
else -- Only call _handle_recvs if recv is not nil
  if self:_handle_recvs(loop_state, recv, 1) then
    break
  end
end
Environment
- SmartThings Edge Hub 0.56.11
- Built-in lustre library (not vendored)
- Occurs when devices reboot
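As a quick check of the proposed fix outside the hub, the guarded control flow can be exercised with stubs. The names below (`handle_select_err`, `handle_recvs`, `receive_step`) are hypothetical stand-ins rather than lustre's code, and the returned strings only label which branch the real loop would take.

```lua
local handled = 0 -- counts how often the recv handler actually runs

-- Stand-in for _handle_select_err(): returns nil, as assumed for non-timeout errors.
local function handle_select_err(err)
  return nil
end

-- Stand-in for _handle_recvs(): indexes recv, so it must never see nil.
local function handle_recvs(recv)
  handled = handled + 1
  return recv[1] == "stop"
end

-- One loop iteration using the guarded if/else from the proposed fix.
local function receive_step(recv, err)
  if not recv then
    if handle_select_err(err) then
      return "return"
    end
  else
    if handle_recvs(recv) then
      return "break"
    end
  end
  return "continue"
end

local ok, result = pcall(receive_step, nil, "closed")
print(ok, result, handled)
```

With a nil `recv` the step completes without raising and the handler is never called. Note that in this sketch the loop simply continues after an unhandled select error; whether the real loop should exit or surface the error to the driver at that point is a separate decision.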