[Iccrg] Question on RFC 2988 - TCP Retransmission timer

Sushant Rewaskar rewaskar at email.unc.edu
Wed Sep 19 22:58:42 BST 2007


Hi Everyone,
	I recently conducted a detailed investigation of exactly the question
raised in this thread. The work is due to be presented at ICNP 2007; the
paper is located at
http://www.cs.unc.edu/~rewaskar/publication/ICNP_07.pdf

The objective of this work was to (i) understand the performance of
current TCP deployments, (ii) evaluate different settings for the
thresholds of TCP's loss detection mechanisms (i.e., the RTO and dupack
parameters), and (iii) identify the best parameter settings and
quantify the impact of these choices.

We passively analyzed more than 2.8 million real-world connections using
a very detailed passive analysis tool named TCPdebug (details at
http://www.cs.unc.edu/~rewaskar/publication/Pam_2006.pdf). For each
parameter setting, rather than simply counting unneeded retransmissions
and the savings in detection time, we developed analytical models to
estimate the actual impact in terms of the change in duration of a TCP
connection.

Based on our experiments we identified several candidate changes to TCP:
(i) dynamically adapt the dupack threshold based on flight size, (ii)
reduce the minimum retransmission timeout to 200 ms or lower, and (iii)
make the RTO more aggressive by reducing the contribution of the RTT
variation factor from 4 to 2. A rough sketch of these settings follows
below.
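
To make the three settings concrete, here is a minimal sketch in Python.
The names, and in particular the dupack adaptation rule, are illustrative
assumptions; they are not taken from the paper or from any real TCP stack.

    # Sketch of the three suggested settings.  Names and the dupack
    # scaling rule are illustrative placeholders.

    MIN_RTO = 0.200   # (ii) 200 ms floor instead of RFC 2988's 1 s
    K = 2             # (iii) RTT-variation multiplier reduced from 4

    def rto(srtt, rttvar, k=K, min_rto=MIN_RTO):
        # RFC 2988 formula (RTO = SRTT + K*RTTVAR), clamped to the
        # lower floor, with the more aggressive settings as defaults.
        return max(srtt + k * rttvar, min_rto)

    def dupack_threshold(flight_size):
        # (i) adapt the threshold to the flight size instead of using
        # a fixed 3.  This particular rule (never requiring more
        # dupacks than a single loss in the flight could generate)
        # is a placeholder assumption, not the paper's rule.
        return max(1, min(3, flight_size - 1))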

Please feel free to contact us if you have any questions about the
method or our findings. (I am traveling for the better part of this
month, so please excuse any delay in my responses.)

Take care,
Sushant Rewaskar
-----------------------------
UNC Chapel Hill
www.cs.unc.edu/~rewaskar 
 
-----Original Message-----
From: iccrg-bounces at cs.ucl.ac.uk [mailto:iccrg-bounces at cs.ucl.ac.uk] On
Behalf Of Mark Allman
Sent: Wednesday, September 19, 2007 5:02 PM
To: Ian McDonald
Cc: Iccrg at cs.ucl.ac.uk
Subject: Re: [Iccrg] Question on RFC 2988 - TCP Retransmission timer


[I answered this on the TCPM list where Ian also posted it.  I am just
 now doing a little catchup on the ICCRG list and so I will replay my
 answer here.  --allman]


Ian-

> After some discussion on the Linux networking list I thought I'd ask
> the question here.
> 
> In RFC 2988 Section 2.4 says:
>   (2.4) Whenever RTO is computed, if it is less than 1 second then the
>         RTO SHOULD be rounded up to 1 second.
> 
>         Traditionally, TCP implementations use coarse grain clocks to
>         measure the RTT and trigger the RTO, which imposes a large
>         minimum value on the RTO.  Research suggests that a large
>         minimum RTO is needed to keep TCP conservative and avoid
>         spurious retransmissions [AP99].  Therefore, this
>         specification requires a large minimum RTO as a conservative
>         approach, while at the same time acknowledging that at some
>         future point, research may show that a smaller minimum RTO is
>         acceptable or superior.
> 
> Given that Linux, BSD, etc. use 200 milliseconds, not 1 second, I am
> wondering whether there has in fact been any research done as
> mentioned in the last sentence.

I am not aware of any.

> It seems a very high timeout, especially between two locally connected
> devices.

Well, a couple of things .... 

First, if the performance is driven by the magnitude of the RTO then
that is a more general problem than just the min RTO, I think.  We have
devised much better loss recovery mechanisms than relying on the RTO,
and so one would hope that RTOs are rare enough not to be driving
performance.

Second, while I don't know of any recent research, I think [AP99] shows
that this is all a tradeoff and so one might want to be careful.  The
current algorithm (RFC 2988) does not often send spurious retransmits.
So, it stands to reason that if you simply reduce the min and calculate
the time savings over the current min, things will in fact look better.
I.e., you will not spend as much time waiting on timeouts you need to
take.  However, the flip side of the coin is that by reducing the
minimum you are also increasing the chances of the RTO expiring
needlessly.  I.e., if you had waited longer for the ACK then you would
not have needed the RTO at all.  The impact of this is harder to gauge
than simply computing the "wait time", because in this case the cwnd
will be needlessly collapsed to 1 segment and the connection will have
to build the window back up.
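
For reference, the estimator under discussion is the one specified in
RFC 2988 sections 2.2 through 2.4.  Below is a minimal sketch in Python;
the variable names follow the RFC, while the class name and the explicit
min_rto knob are illustrative, added so that the minimum being debated
shows up as a parameter.

    # Sketch of the RFC 2988 RTO estimator (sections 2.2-2.4).

    class Rfc2988Rto:
        ALPHA = 1 / 8   # SRTT gain (section 2.3)
        BETA = 1 / 4    # RTTVAR gain (section 2.3)
        K = 4           # variance multiplier
        G = 0.0         # clock granularity; assume a fine-grained clock

        def __init__(self, min_rto=1.0):
            # min_rto=1.0 is the SHOULD of section 2.4; lowering it
            # (e.g. to 0.25) is exactly the change discussed above.
            self.min_rto = min_rto
            self.srtt = None
            self.rttvar = None

        def on_rtt_sample(self, r):
            if self.srtt is None:
                # First measurement (section 2.2).
                self.srtt = r
                self.rttvar = r / 2
            else:
                # Later measurements (section 2.3); per the RFC,
                # RTTVAR is updated before SRTT.
                self.rttvar = (1 - self.BETA) * self.rttvar \
                    + self.BETA * abs(self.srtt - r)
                self.srtt = (1 - self.ALPHA) * self.srtt \
                    + self.ALPHA * r
            rto = self.srtt + max(self.G, self.K * self.rttvar)
            return max(rto, self.min_rto)   # section 2.4 round-up

With a steady RTT the RTTVAR term decays geometrically toward zero, so
on a stable low-delay path the computed RTO quickly falls to the floor;
the minimum, not the estimator, then determines how long a timeout
takes, which is why the tradeoff centers on the choice of minimum.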

As an example from the paper: if you simply reduce the min from 1 sec to
250 msec (as close as we come to the 200 msec you cite above), wait time
is reduced by 36%, but at the same time we experience 3.6 times the
number of spurious retransmits.

This is all a tradeoff, it seems to me.  We could not find a magic
'sweet spot' that was a win in terms of both wait time and the number
of spurious timeouts.

All that said, the data I am quoting is old.  Perhaps on modern networks
a smaller minimum would behave as 1 sec did quite a while back.  I have
no idea.  I think it would be nice if someone did a study to find out.

I hope that helps in some way.

allman
