Fix SMP deadlock: consistent lock ordering in enif_monitor_process #2057
+44
−18
A user reported intermittent but reproducible deadlocks when testing the httpd web server; this fixes the underlying ABBA deadlock.
Entirely by AI:
https://ampcode.com/threads/T-019ba1c2-2a7f-77c7-bd33-ce9f303152a2
Verified as a fix by repeating load testing for multiple hours.
Summary
Fix a lock ordering inversion that causes deadlocks under SMP on ESP32 (and potentially other platforms) when sockets are used under heavy load.
Problem
enif_monitor_process and enif_demonitor_process acquire locks in opposite orders:
- enif_monitor_process: processes_table → monitors
- enif_demonitor_process: monitors → processes_table
- destroy_resource_monitors: monitors → processes_table
This creates an ABBA deadlock when two threads call these functions concurrently: one holds processes_table while waiting for monitors, and the other holds monitors while waiting for processes_table.
The issue is triggered by otp_socket.c, which calls both monitor and demonitor from NIFs, the select thread, and monitor callbacks under load.
With AVM_NO_SMP, synclist_wrlock is a no-op, so no deadlock occurs, which explains why disabling SMP works around the issue.
Fix
Change enif_monitor_process to acquire locks in the same order as the other functions: monitors → processes_table.
Testing
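For readers unfamiliar with the pattern, the ordering rule described above can be illustrated with a minimal, standalone sketch. It is not the code from this PR and not part of its test suite: plain pthread mutexes stand in for the monitors and processes_table locks, and sketch_monitor, sketch_demonitor, and run_sketch are hypothetical names. Both paths take the locks in the same order (monitors, then processes_table), so two threads hammering them concurrently cannot each end up holding the lock the other is waiting for.

```c
#include <pthread.h>

/* Stand-ins for the two locks named in the PR. These are ordinary pthread
 * mutexes, NOT AtomVM's synclist locks; all names here are hypothetical. */
static pthread_mutex_t monitors_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t processes_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shared state, only modified while both locks are held. */
static int monitor_count = 0;

#define ITERATIONS 100000

/* Monitor path: monitors first, then processes_table. */
static void sketch_monitor(void)
{
    pthread_mutex_lock(&monitors_lock);
    pthread_mutex_lock(&processes_table_lock);
    monitor_count++;
    pthread_mutex_unlock(&processes_table_lock);
    pthread_mutex_unlock(&monitors_lock);
}

/* Demonitor path: the SAME order, monitors first, then processes_table.
 * Swapping the two lock calls here would reintroduce the ABBA pattern. */
static void sketch_demonitor(void)
{
    pthread_mutex_lock(&monitors_lock);
    pthread_mutex_lock(&processes_table_lock);
    monitor_count--;
    pthread_mutex_unlock(&processes_table_lock);
    pthread_mutex_unlock(&monitors_lock);
}

static void *monitor_worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < ITERATIONS; i++) {
        sketch_monitor();
    }
    return NULL;
}

static void *demonitor_worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < ITERATIONS; i++) {
        sketch_demonitor();
    }
    return NULL;
}

/* Runs both paths concurrently and returns the final counter value.
 * With consistent ordering both threads always run to completion. */
int run_sketch(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, monitor_worker, NULL);
    pthread_create(&b, NULL, demonitor_worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return monitor_count;
}
```

Calling run_sketch() from a main function and building with -pthread is enough to try it. The equal number of increments and decrements means the counter should end where it started; the point of the sketch is only that the run completes, whereas reversing the lock order in one of the two paths makes it liable to hang.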
These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).
SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later