Skip to content

Commit c32fde0

Browse files
nealcardwellgregkh
authored andcommitted
tcp: fix tcp_packet_delayed() for tcp_is_non_sack_preventing_reopen() behavior
[ Upstream commit d0fa59897e049e84432600e86df82aab3dce7aa5 ] After the following commit from 2024: commit e37ab7373696 ("tcp: fix to allow timestamp undo if no retransmits were sent") ...there was buggy behavior where TCP connections without SACK support could easily see erroneous undo events at the end of fast recovery or RTO recovery episodes. The erroneous undo events could cause those connections to suffer repeated loss recovery episodes and high retransmit rates. The problem was an interaction between the non-SACK behavior on these connections and the undo logic. The problem is that, for non-SACK connections at the end of a loss recovery episode, if snd_una == high_seq, then tcp_is_non_sack_preventing_reopen() holds steady in CA_Recovery or CA_Loss, but clears tp->retrans_stamp to 0. Then upon the next ACK the "tcp: fix to allow timestamp undo if no retransmits were sent" logic saw the tp->retrans_stamp at 0 and erroneously concluded that no data was retransmitted, and erroneously performed an undo of the cwnd reduction, restoring cwnd immediately to the value it had before loss recovery. This caused an immediate burst of traffic and build-up of queues and likely another immediate loss recovery episode. This commit fixes tcp_packet_delayed() to ignore zero retrans_stamp values for non-SACK connections when snd_una is at or above high_seq, because tcp_is_non_sack_preventing_reopen() clears retrans_stamp in this case, so it's not a valid signal that we can undo. Note that the commit named in the Fixes footer restored long-present behavior from roughly 2005-2019, so apparently this bug was present for a while during that era, and this was simply not caught. Fixes: e37ab7373696 ("tcp: fix to allow timestamp undo if no retransmits were sent") Reported-by: Eric Wheeler <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: Neal Cardwell <[email protected]> Co-developed-by: Yuchung Cheng <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
1 parent ca00f0e commit c32fde0

File tree

1 file changed

+25
-12
lines changed

1 file changed

+25
-12
lines changed

net/ipv4/tcp_input.c

Lines changed: 25 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -2441,20 +2441,33 @@ static inline bool tcp_packet_delayed(const struct tcp_sock *tp)
24412441
{
24422442
const struct sock *sk = (const struct sock *)tp;
24432443

2444-
if (tp->retrans_stamp &&
2445-
tcp_tsopt_ecr_before(tp, tp->retrans_stamp))
2446-
return true; /* got echoed TS before first retransmission */
2447-
2448-
/* Check if nothing was retransmitted (retrans_stamp==0), which may
2449-
* happen in fast recovery due to TSQ. But we ignore zero retrans_stamp
2450-
* in TCP_SYN_SENT, since when we set FLAG_SYN_ACKED we also clear
2451-
* retrans_stamp even if we had retransmitted the SYN.
2444+
/* Received an echoed timestamp before the first retransmission? */
2445+
if (tp->retrans_stamp)
2446+
return tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
2447+
2448+
/* We set tp->retrans_stamp upon the first retransmission of a loss
2449+
* recovery episode, so normally if tp->retrans_stamp is 0 then no
2450+
* retransmission has happened yet (likely due to TSQ, which can cause
2451+
* fast retransmits to be delayed). So if snd_una advanced while
2452+
* (tp->retrans_stamp is 0 then apparently a packet was merely delayed,
2453+
* not lost. But there are exceptions where we retransmit but then
2454+
* clear tp->retrans_stamp, so we check for those exceptions.
24522455
*/
2453-
if (!tp->retrans_stamp && /* no record of a retransmit/SYN? */
2454-
sk->sk_state != TCP_SYN_SENT) /* not the FLAG_SYN_ACKED case? */
2455-
return true; /* nothing was retransmitted */
24562456

2457-
return false;
2457+
/* (1) For non-SACK connections, tcp_is_non_sack_preventing_reopen()
2458+
* clears tp->retrans_stamp when snd_una == high_seq.
2459+
*/
2460+
if (!tcp_is_sack(tp) && !before(tp->snd_una, tp->high_seq))
2461+
return false;
2462+
2463+
/* (2) In TCP_SYN_SENT tcp_clean_rtx_queue() clears tp->retrans_stamp
2464+
* when setting FLAG_SYN_ACKED is set, even if the SYN was
2465+
* retransmitted.
2466+
*/
2467+
if (sk->sk_state == TCP_SYN_SENT)
2468+
return false;
2469+
2470+
return true; /* tp->retrans_stamp is zero; no retransmit yet */
24582471
}
24592472

24602473
/* Undo procedures. */

0 commit comments

Comments
 (0)