[BUG] Race condition in Socket_getReadySocket causes crash during poll/WSAPoll call on Windows #1650

@HammerCheng

Describe the bug
A race condition in Socket_getReadySocket() causes a crash inside WSAPoll() (or poll()) when another thread modifies the global mod_s structure while the mutex is temporarily unlocked during the blocking I/O wait.

To Reproduce
The issue is in the Socket_getReadySocket() function in Socket.c. The code releases the mutex before calling poll()/WSAPoll(), so the global mod_s structure is unprotected for the whole duration of the blocking call:
// Socket.c - Problematic code pattern
Paho_thread_unlock_mutex(mutex); // ⚠️ Mutex released
*rc = poll(mod_s.saved.fds_read, mod_s.saved.nfds, timeout_ms); // ⚠️ Using global data without protection!
Paho_thread_lock_mutex(mutex); // Mutex re-acquired

Disassembly evidence (Windows x86):
6EF50CC3 push edi ; mutex
6EF50CC4 call _Paho_thread_unlock_mutex ; unlock
6EF50CCC push dword ptr [ebp-10h] ; timeout
6EF50CCF push dword ptr ds:[52F10398h] ; mod_s.saved.nfds
6EF50CD5 push dword ptr ds:[52F103A0h] ; mod_s.saved.fds_read (pointer!)
6EF50CDB call dword ptr ds:[52E8B0D0h] ; WSAPoll() - CRASH HERE
6EF50CE4 call _Paho_thread_lock_mutex ; lock
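
One possible mitigation, sketched here under assumptions rather than as a tested patch for Paho: copy the descriptor array into memory owned by the polling thread while the mutex is still held, and pass only that private copy to poll(). The struct socket_table type and the poll_on_snapshot() function are hypothetical names for illustration and are not part of Paho's Socket.c. The sketch uses POSIX poll() and pthreads; a Windows build would need WSAPoll() and the platform's mutex wrapper instead.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the shared socket table (not Paho's mod_s). */
struct socket_table {
    pthread_mutex_t lock;
    struct pollfd *fds;   /* owned by the table; other threads may free it */
    nfds_t nfds;
};

/*
 * Snapshot the descriptor array under the lock, then poll on the snapshot.
 * Returns poll()'s result, 0 if the table is empty, or -1 if the copy
 * could not be allocated.
 */
int poll_on_snapshot(struct socket_table *t, int timeout_ms)
{
    struct pollfd *local = NULL;
    nfds_t n;
    int rc;

    assert(t != NULL);

    pthread_mutex_lock(&t->lock);
    n = t->nfds;
    if (n > 0) {
        local = malloc(n * sizeof *local);
        if (local == NULL) {
            pthread_mutex_unlock(&t->lock);
            return -1;
        }
        memcpy(local, t->fds, n * sizeof *local);
    }
    pthread_mutex_unlock(&t->lock);

    if (n == 0)
        return 0;

    /* Only thread-private memory is touched while the lock is released. */
    rc = poll(local, n, timeout_ms);
    free(local);
    return rc;
}
```

This only removes the access to shared memory during the wait. A socket closed by another thread in the meantime can still be reported as invalid in the copy, so the caller would have to re-validate results against the table after re-acquiring the mutex.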

Race Condition Timeline
┌────────────────────────────────────────┬─────────────────────────────┐
│ receiveThread                          │ Other Thread (Application)  │
├────────────────────────────────────────┼─────────────────────────────┤
│ Socket_getReadySocket()                │                             │
│ → memcpy saved.fds_read from fds_read  │                             │
│ → unlock_mutex(socket_mutex)           │                             │
│ → preparing to call poll()...          │ MQTTAsync_disconnect()      │
│                                        │ → MQTTAsync_closeOnly()     │
│                                        │ → Socket_close(socket)      │
│                                        │ → Modifies mod_s.nfds       │
│                                        │ → free(mod_s.fds_read)      │
│ → poll(saved.fds_read, saved.nfds)     │                             │
│ 💥 CRASH: Access freed memory or       │                             │
│    invalid socket descriptor!          │                             │
└────────────────────────────────────────┴─────────────────────────────┘
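
The timeline suggests a second possible approach on the closing side: do not free the array while a poll is in flight. The following is a minimal sketch of that idea, assuming POSIX threads. The guarded_table type, guarded_poll() and guarded_release() are hypothetical names, not Paho functions.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical table with an in-flight counter (illustrative only). */
struct guarded_table {
    pthread_mutex_t lock;
    pthread_cond_t idle;   /* signalled when in_flight drops to zero */
    int in_flight;         /* number of poll() calls using fds */
    struct pollfd *fds;
    nfds_t nfds;
};

/* Poller: mark the array as in use, then release the lock for the wait. */
int guarded_poll(struct guarded_table *t, int timeout_ms)
{
    struct pollfd *fds;
    nfds_t n;
    int rc;

    assert(t != NULL);

    pthread_mutex_lock(&t->lock);
    fds = t->fds;
    n = t->nfds;
    t->in_flight++;
    pthread_mutex_unlock(&t->lock);

    rc = (n > 0) ? poll(fds, n, timeout_ms) : 0;

    pthread_mutex_lock(&t->lock);
    if (--t->in_flight == 0)
        pthread_cond_broadcast(&t->idle);
    pthread_mutex_unlock(&t->lock);
    return rc;
}

/* Closer: wait until no poll() is using the array before freeing it. */
void guarded_release(struct guarded_table *t)
{
    assert(t != NULL);

    pthread_mutex_lock(&t->lock);
    while (t->in_flight > 0)
        pthread_cond_wait(&t->idle, &t->lock);
    free(t->fds);
    t->fds = NULL;
    t->nfds = 0;
    pthread_mutex_unlock(&t->lock);
}
```

The trade-off is that the closing thread can block for up to the poll timeout. A real fix along these lines would probably also need a way to wake the poller early, which this sketch does not attempt.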

Crash Stack Trace
ws2_32.dll!778b3384()
mswsock.dll!72f6a84d()
ws2_32.dll!WSAPoll()
paho-mqtt3as.dll!Socket_getReadySocket() <-- Crash occurs here
paho-mqtt3as.dll!MQTTAsync_cycle()
paho-mqtt3as.dll!MQTTAsync_receiveThread()

Disassembly of the earlier zero-timeout WSAPoll() call in Socket_getReadySocket() (Windows x86):
6EF50B02 push esi
6EF50B03 push ecx
6EF50B04 call _memcpy (6EF5A3A4h)
6EF50B09 mov eax,dword ptr ds:[52F1039Ch]
6EF50B0E add esp,0Ch
6EF50B11 mov ecx,dword ptr ds:[52F10398h]
6EF50B17 test ecx,ecx
6EF50B19 je _Socket_getReadySocket+4F7h (6EF50DA7h)
6EF50B1F push 0
6EF50B21 push ecx
6EF50B22 push eax
6EF50B23 call dword ptr [__imp__WSAPoll@12 (6EF5B0D0h)]
6EF50B29 mov esi,eax
6EF50B2B mov dword ptr [ebp-14h],esi
6EF50B2E test esi,esi
6EF50B30 jle _Socket_getReadySocket+413h (6EF50CC3h)
6EF50B36 mov ecx,dword ptr ds:[52F10384h]
6EF50B3C push 3
6EF50B3E mov dword ptr [ebp-0Ch],0
6EF50B45 push 600h
6EF50B4A mov ecx,dword ptr [ecx]
6EF50B4C push 52E8EA40h
6EF50B51 mov dword ptr [ebp-4],ecx
6EF50B54 call _StackTrace_entry (6EF58C60h)

Expected behavior

Screenshots

Log files
no exception trace log

Environment
• Paho MQTT C Version: 1.3.15
• Operating System: Windows 10/11 (x64)
• Compiler: MSVC (Visual Studio 2022)
• Build Configuration: Release, x86

Additional context
• This issue is more likely to occur under high load or when disconnect is called immediately after connect
• The crash is intermittent and timing-dependent
• Similar issues may exist in both Windows (WSAPoll) and POSIX (poll) code paths
