The tcpdump log below is copied from a test I ran recently. At first everything went smoothly. Then the client eventually overwhelmed a router, and a lot of packets (up to #6176) were dropped (no ACK was ever seen for them). At packet 6177, a retransmission was triggered because the RTO timer expired.
So here are the questions:
- When a retransmission occurs, what happens to the sender-side congestion window (snd_cwnd)? The OS is Linux kernel 3.4.42. I have read that snd_cwnd is reduced to 1 when there is a retransmission. If that is the case, why can packets 6179 and 6180 still be sent?
- Why were 6179 and 6180 never ACKed? The ACK in 6178 did arrive, which means packets can get through.
6174 2.881075 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6379071 Ack=1 Win=13824 Len=1358 TSval=4294945643 TSecr=2532115493
6175 2.881094 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6380429 Ack=1 Win=13824 Len=1358 TSval=4294945643 TSecr=2532115493
6176 2.881114 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6381787 Ack=1 Win=13824 Len=1358 TSval=4294945643 TSecr=2532115493
6177 3.227347 10.203.85.190 207.198.102.53 TCP 1426 [TCP Retransmission] 58206 > 80 [ACK] Seq=5887475 Ack=1 Win=13824 Len=1358 TSval=4294945685 TSecr=2532115493
6178 3.323055 207.198.102.53 10.203.85.190 TCP 68 http > 58206 [ACK] Seq=1 Ack=5888833 Win=980480 Len=0 TSval=2532115623 TSecr=4294945685
6179 3.326368 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6383145 Ack=1 Win=13824 Len=1358 TSval=4294945694 TSecr=2532115623
6180 3.326454 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6384503 Ack=1 Win=13824 Len=1358 TSval=4294945694 TSecr=2532115623
6181 3.727429 10.203.85.190 207.198.102.53 TCP 1426 [TCP Retransmission] 58206 > 80 [ACK] Seq=5888833 Ack=1 Win=13824 Len=1358 TSval=4294945735 TSecr=2532115623
6182 3.813101 207.198.102.53 10.203.85.190 TCP 68 80 > 58206 [ACK] Seq=1 Ack=5890191 Win=980480 Len=0 TSval=2532115746 TSecr=4294945735
6183 3.813606 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6385861 Ack=1 Win=13824 Len=1358 TSval=4294945743 TSecr=2532115746
6184 3.813822 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6387219 Ack=1 Win=13824 Len=1358 TSval=4294945743 TSecr=2532115746
6185 4.197341 10.203.85.190 207.198.102.53 TCP 1426 [TCP Retransmission] 58206 > 80 [ACK] Seq=5890191 Ack=1 Win=13824 Len=1358 TSval=4294945782 TSecr=2532115746
6186 4.294162 207.198.102.53 10.203.85.190 TCP 68 80 > 58206 [ACK] Seq=1 Ack=5891549 Win=980480 Len=0 TSval=2532115866 TSecr=4294945782
6187 4.297450 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6388577 Ack=1 Win=13824 Len=1358 TSval=4294945792 TSecr=2532115866
6188 4.297675 10.203.85.190 207.198.102.53 TCP 1426 58206 > 80 [ACK] Seq=6389935 Ack=1 Win=13824 Len=1358 TSval=4294945792 TSecr=2532115866
This is related to F-RTO; see RFC 5682.
In the traditional (non-F-RTO) algorithm, recovery works as follows:
When the retransmission timeout occurs, the TCP sender enters the RTO recovery where the congestion window is initialized to one segment and unacknowledged segments are retransmitted using the slow-start algorithm.
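As a rough illustration of that conventional response, here is a minimal sketch in Python. It works in bytes and uses the ssthresh formula from RFC 5681; the function name is made up for this example, and the flight size is an assumption derived from the trace (it treats everything from the first unacknowledged byte up to the end of packet 6176 as still in flight). Note that Linux itself counts snd_cwnd in segments, not bytes.

```python
def conventional_rto_response(flight_size: int, smss: int) -> tuple[int, int]:
    """Return (cwnd, ssthresh) in bytes after a retransmission timeout.

    Hypothetical helper for illustration only, not kernel code.
    """
    ssthresh = max(flight_size // 2, 2 * smss)  # RFC 5681, equation (4)
    cwnd = smss  # restart from one segment and grow again via slow start
    return cwnd, ssthresh


smss = 1358  # Len of every data segment in the trace
# Assumption: nothing between Seq=5887475 (packet 6177) and the end of
# packet 6176 (Seq=6381787 + Len) had been acknowledged at the timeout.
flight_size = (6381787 + smss) - 5887475

print(flight_size // smss)                            # 365 segments outstanding
print(conventional_rto_response(flight_size, smss))   # (1358, 247835)
```

Under this model the sender would only be allowed one segment after the timeout, which is why the two new segments in the trace look surprising at first.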
Here is how F-RTO operates:
When the retransmission timer expires, the F-RTO sender retransmits the first unacknowledged segment as usual. Deviating from the normal operation after a timeout, it then tries to transmit new, previously unsent data (usually two segments, if there are enough data and congestion window allows) for the first acknowledgment that arrives after the timeout, given that the acknowledgment advances the window.
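The behaviour described above can be sketched as a small state machine. This is a simplified reading of the basic F-RTO algorithm in RFC 5682, not the Linux implementation: the class and action names are invented for the example, sequence-number wraparound, SACK, and the receiver's advertised window are ignored, and the check that the ACK covers all of the retransmitted data is left out.

```python
from dataclasses import dataclass

SMSS = 1358  # segment size seen in the trace


@dataclass
class FrtoSender:
    """Hypothetical, simplified model of an F-RTO sender."""
    snd_una: int      # oldest unacknowledged sequence number
    snd_nxt: int      # sequence number of the next new (unsent) byte
    step: int = 0     # 0 = not in F-RTO, 2/3 = waiting for 1st/2nd ACK
    recover: int = 0  # upper edge of data sent before the first ACK arrived

    def on_rto(self):
        # Step 1: retransmit only the first unacknowledged segment.
        self.step = 2
        return [("retransmit", self.snd_una)]

    def on_ack(self, ack):
        advances = ack > self.snd_una
        if advances:
            self.snd_una = ack
        if self.step == 2:
            self.recover = self.snd_nxt
            if not advances or ack >= self.recover:
                # Duplicate ACK, or everything outstanding is acknowledged:
                # fall back to conventional RTO recovery.
                self.step = 0
                return [("conventional_rto_recovery", self.snd_una)]
            # The ACK advances the window: send up to two NEW segments.
            actions = []
            for _ in range(2):
                actions.append(("send_new", self.snd_nxt))
                self.snd_nxt += SMSS
            self.step = 3
            return actions
        if self.step == 3:
            self.step = 0
            if advances:
                # Second ACK covers data that was never retransmitted.
                return [("spurious_timeout", self.snd_una)]
            return [("conventional_rto_recovery", self.snd_una)]
        return []


sender = FrtoSender(snd_una=5887475, snd_nxt=6383145)
print(sender.on_rto())         # [('retransmit', 5887475)]
print(sender.on_ack(5888833))  # [('send_new', 6383145), ('send_new', 6384503)]
```

Fed with the numbers from the trace, the model retransmits Seq=5887475 (packet 6177) and, on the ACK in packet 6178, sends new data at Seq=6383145 and Seq=6384503, which matches packets 6179 and 6180.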
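This explains why 6179 and 6180 were sent: the ACK in 6178 is the first acknowledgment after the timeout and it advances the window, so the sender transmits two new segments. As for why no ACK was received for them, I believe it is a system-level bug that needs further investigation.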
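A quick arithmetic check on the trace shows what the three ACKs actually cover. The sketch below uses only sequence and acknowledgment numbers copied from the log; the variable and function names are made up for the example.

```python
SMSS = 1358  # Len of every data segment in the trace

# (Seq of the retransmission, Ack number of the ACK that followed it)
# taken from packet pairs 6177/6178, 6181/6182 and 6185/6186.
retransmit_then_ack = [
    (5887475, 5888833),
    (5888833, 5890191),
    (5890191, 5891549),
]

# Seq of the first new segment sent after the first ACK (packet 6179).
first_new_seq = 6383145


def bytes_acked(seq: int, ack: int) -> int:
    """How far the ACK moves past the start of the retransmitted segment."""
    return ack - seq


for seq, ack in retransmit_then_ack:
    print(seq, ack, bytes_acked(seq, ack))

highest_ack = max(ack for _, ack in retransmit_then_ack)
print(first_new_seq - highest_ack)  # gap between the highest ACK and the new data
```

Each ACK moves forward by exactly one segment (1358 bytes), that is, only the segment that was just retransmitted, and the highest ACK is still far below the sequence numbers of the new data. So the capture shows no acknowledgment, cumulative or duplicate, triggered by 6179 and 6180.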