Skip to content
This repository was archived by the owner on Sep 6, 2022. It is now read-only.

Commit 030b747

Browse files
mmakaayMaurice Makaay
andauthored
Fix race condition causing 'ack timeout 4' disconnects (esphome#4)
The AsyncClient::send() methods sets a boolean to true after pushing data over the TCP socket successfully using tcp_output(). It also sets a timestamp to remember at what time the data was sent. The AsyncClient::_sent() callback method reacts to ACKs coming from the connected client. This method sets the boolean to false. In the AsyncClient::_poll() method, a check is done to see if the boolean is true ("I'm waiting for an ACK") and if the time at which the data was sent is too long ago (5000 ms). If this is the case, a connection issue with the connected client is assumed and the connection is forcibly closed by the server. The race condition is when these operations get mixed up, because of multithreading behavior. The _sent() method can be called during the execution of the send() method: 1. send() sends out data using tcp_output() 2. _sent() is called because an ACK is processed, sets boolean to false 3. send() continues and sets boolean to true + timestamp to "now" After this, the data exchange with the client was successful. Data were sent and the ACK was seen. However, the boolean ended up as true, making the _poll() method think that an ACK is still to be expected. As a result, 5000 ms later, the connection is dropped. This commit fixes the code by first registering that an ACK is expected, before calling tcp_output(). This way, there is no race condition when the ACK is processed right after that call. Additionally, I changed the boolean to an integer counter value. The server might send multiple messages to the client, resulting in multiple expected ACKs. A boolean does not cover this situation. Co-authored-by: Maurice Makaay <[email protected]>
1 parent cfecaa3 commit 030b747

File tree

2 files changed

+12
-11
lines changed

2 files changed

+12
-11
lines changed

src/AsyncTCP.cpp

Lines changed: 11 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -575,7 +575,7 @@ AsyncClient::AsyncClient(tcp_pcb* pcb)
575575
, _pb_cb_arg(0)
576576
, _timeout_cb(0)
577577
, _timeout_cb_arg(0)
578-
, _pcb_busy(false)
578+
, _pcb_busy(0)
579579
, _pcb_sent_at(0)
580580
, _ack_pcb(true)
581581
, _rx_last_packet(0)
@@ -783,13 +783,14 @@ size_t AsyncClient::add(const char* data, size_t size, uint8_t apiflags) {
783783
}
784784

785785
bool AsyncClient::send(){
786-
int8_t err = ERR_OK;
787-
err = _tcp_output(_pcb, _closed_slot);
788-
if(err == ERR_OK){
789-
_pcb_busy = true;
790-
_pcb_sent_at = millis();
786+
auto pcb_sent_at_backup = _pcb_sent_at;
787+
_pcb_sent_at = millis();
788+
_pcb_busy++;
789+
if (_tcp_output(_pcb, _closed_slot) == ERR_OK) {
791790
return true;
792791
}
792+
_pcb_sent_at = pcb_sent_at_backup;
793+
_pcb_busy--;
793794
return false;
794795
}
795796

@@ -869,7 +870,7 @@ int8_t AsyncClient::_connected(void* pcb, int8_t err){
869870
_pcb = reinterpret_cast<tcp_pcb*>(pcb);
870871
if(_pcb){
871872
_rx_last_packet = millis();
872-
_pcb_busy = false;
873+
_pcb_busy = 0;
873874
// tcp_recv(_pcb, &_tcp_recv);
874875
// tcp_sent(_pcb, &_tcp_sent);
875876
// tcp_poll(_pcb, &_tcp_poll, 1);
@@ -932,7 +933,7 @@ int8_t AsyncClient::_fin(tcp_pcb* pcb, int8_t err) {
932933
int8_t AsyncClient::_sent(tcp_pcb* pcb, uint16_t len) {
933934
_rx_last_packet = millis();
934935
//log_i("%u", len);
935-
_pcb_busy = false;
936+
_pcb_busy--;
936937
if(_sent_cb) {
937938
_sent_cb(_sent_cb_arg, this, len, (millis() - _pcb_sent_at));
938939
}
@@ -977,8 +978,8 @@ int8_t AsyncClient::_poll(tcp_pcb* pcb){
977978
uint32_t now = millis();
978979

979980
// ACK Timeout
980-
if(_pcb_busy && _ack_timeout && (now - _pcb_sent_at) >= _ack_timeout){
981-
_pcb_busy = false;
981+
if(_pcb_busy > 0 && _ack_timeout && (now - _pcb_sent_at) >= _ack_timeout){
982+
_pcb_busy = 0;
982983
log_w("ack timeout %d", pcb->state);
983984
if(_timeout_cb)
984985
_timeout_cb(_timeout_cb_arg, this, (now - _pcb_sent_at));

src/AsyncTCP.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -160,7 +160,7 @@ class AsyncClient {
160160
AcConnectHandler _poll_cb;
161161
void* _poll_cb_arg;
162162

163-
bool _pcb_busy;
163+
uint32_t _pcb_busy;
164164
uint32_t _pcb_sent_at;
165165
bool _ack_pcb;
166166
uint32_t _rx_ack_len;

0 commit comments

Comments
 (0)