Description
Unfortunately, our socket close is racy:
- We shut down the socket here, but the TCP FIN message is not sent immediately because that is done by the TCP/IP thread, which is not running at that point.
- Then we remove the firewall endpoint here. This is done immediately.
- By the time the TCP/IP thread gets a chance to send the FIN message, the firewall blocks the outgoing packet.
We may not have detected this earlier because of the type of workloads we were running, but this is super obvious in the server API case I am about to PR.
Notes on how to fix this
Unfortunately, adding a quick trylock on the ipThreadLockState (which is there precisely to wake up the IP thread) is not enough. It only allows the IP thread to send out the FIN, not to ACK the FIN ACK sent by the other party, so we then get a whole lot of retransmissions.
We cannot rely on a FreeRTOS+TCP callback (such as FREERTOS_SO_TCP_CONN_HANDLER) to tackle this, because FreeRTOS+TCP does not do a three-way handshake close when the socket is closed through FreeRTOS_closesocket(). Instead, FreeRTOS+TCP sends a FIN to the peer and destroys the socket immediately. Any further packet from the peer is treated as spurious and answered with a RST. Because the socket is destroyed before the TCP connection has been properly closed (it is only at the "FIN sent" stage, i.e., FIN-WAIT-1), the callback is never called.
The only viable way to fix this seems to be adding a "shutting down list" to the firewall that either lets packets through or replies with a RST, until a timer expires.
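To make the idea concrete, here is a minimal sketch of such a "shutting down list", covering only the "let packets through" variant. It is not the actual firewall code: `Endpoint`, `Firewall`, `open`, `close`, `check`, the tick-based clock, and the fixed table size are all hypothetical names and choices made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical endpoint key; the real firewall's representation may differ.
struct Endpoint
{
    uint32_t remoteAddress;
    uint16_t remotePort;
    uint16_t localPort;
};

inline bool operator==(const Endpoint &a, const Endpoint &b)
{
    return a.remoteAddress == b.remoteAddress &&
           a.remotePort == b.remotePort && a.localPort == b.localPort;
}

enum class Verdict { Allow, Drop };

// Toy firewall table. Closing an endpoint does not delete it immediately;
// the entry moves to a "shutting down" state and stays valid until a
// deadline passes. A linger time of 0 models the current behaviour.
class Firewall
{
    enum class State { Free, Open, ShuttingDown };

    struct Entry
    {
        Endpoint endpoint{};
        State    state  = State::Free;
        uint64_t expiry = 0;
    };

    std::array<Entry, 8> entries{};
    uint64_t             lingerTicks;

    Entry *find(const Endpoint &endpoint)
    {
        for (auto &entry : entries)
        {
            if (entry.state != State::Free && entry.endpoint == endpoint)
            {
                return &entry;
            }
        }
        return nullptr;
    }

  public:
    explicit Firewall(uint64_t lingerTicks) : lingerTicks(lingerTicks) {}

    // Adds an endpoint, reusing its entry if it is still shutting down.
    bool open(const Endpoint &endpoint)
    {
        Entry *entry = find(endpoint);
        for (auto &candidate : entries)
        {
            if (entry == nullptr && candidate.state == State::Free)
            {
                entry = &candidate;
            }
        }
        if (entry == nullptr)
        {
            return false; // table full
        }
        entry->endpoint = endpoint;
        entry->state    = State::Open;
        entry->expiry   = 0;
        return true;
    }

    // Called where the endpoint is removed today: start the timer instead.
    void close(const Endpoint &endpoint, uint64_t now)
    {
        if (Entry *entry = find(endpoint))
        {
            entry->state  = State::ShuttingDown;
            entry->expiry = now + lingerTicks;
        }
    }

    // Packet filter: open and still-lingering endpoints are allowed.
    // Expired entries are reclaimed lazily on lookup.
    Verdict check(const Endpoint &endpoint, uint64_t now)
    {
        Entry *entry = find(endpoint);
        if (entry == nullptr)
        {
            return Verdict::Drop;
        }
        if (entry->state == State::ShuttingDown && now >= entry->expiry)
        {
            entry->state = State::Free;
            return Verdict::Drop;
        }
        return Verdict::Allow;
    }
};
```

With `Firewall(0)`, a FIN checked one tick after `close()` is dropped, which is the race described above. With a non-zero linger time, the same FIN (and the later ACK of the peer's FIN ACK) passes until the deadline. The RST variant would return a third verdict for lingering entries instead of `Allow`.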