[Iccrg] Answer to MulTFRC reviews

Lachlan Andrew lachlan.andrew at gmail.com
Fri Jan 22 00:33:54 GMT 2010


Greetings Michael / all,

A lot of our disagreement relates to the source of loss.  If the only
loss is self-induced loss by long-lived flows, then I agree with many
of your statements.  However loss is often caused by short-lived flows
slow-starting until they force a loss.  I'll explain how that matters
in a couple of the cases below.

2010/1/22 Michael Welzl <michawe at ifi.uio.no>:
> On Jan 21, 2010, at 1:05 AM, Lachlan Andrew wrote:
>> 2010/1/20 Michael Welzl <michawe at ifi.uio.no>:

>> My point was that standard TCP is   sufficiently   aggressive in a LAN
>> environment.
>
> Not necessarily - this depends on the buffer size:

If the BDP is small, the only way for TCP not to get a high throughput
(after slow start) is for the loss rate to be quite high.  If the
buffer isn't enough to prevent loss of the order of one packet in 100,
then making TCP "the same but more aggressive" just makes life worse
for loss-sensitive applications.
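To put rough numbers on that (my own back-of-envelope sketch, not from the draft): the standard square-root law says Reno's long-run rate scales as 1/(RTT*sqrt(p)), so a loss rate of one packet in 100 still leaves plenty of throughput when the RTT is LAN-scale, while the same loss rate cripples a WAN path.

```python
from math import sqrt

def reno_rate_pkts_per_sec(rtt_s, loss_p):
    """Approximate long-run Reno rate (packets/s) from the
    square-root law: rate ~ sqrt(3/2) / (RTT * sqrt(p))."""
    return sqrt(1.5) / (rtt_s * sqrt(loss_p))

# On a LAN (RTT ~ 1 ms), even p = 1/100 sustains a high rate:
print(round(reno_rate_pkts_per_sec(0.001, 0.01)))   # ~12247 pkt/s
# On a WAN path (RTT ~ 100 ms), the same loss rate is crippling:
print(round(reno_rate_pkts_per_sec(0.1, 0.01)))     # ~122 pkt/s
```

The function name and the example RTTs are illustrative assumptions; the point is only that "high loss on a small-BDP LAN" still leaves TCP fast, so extra aggressiveness buys little there.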

If the buffer is sufficiently large, then the limitation on TCP is
from bursty cross-traffic.  Since it only takes TCP a few RTTs (a few
ms on a LAN) to recover a BDP-sized window, I'd argue that TCP is
"sufficiently aggressive" to handle flows which start occasionally.

>> One example where standard TCP is too aggressive is in highly-buffered
>> ADSL links.  (You could argue that the problem is the size of the
>> buffer rather than the fact that the BDP is low, but if the BDP were
>> higher then that size of buffer would be fine.)  The same is true of
>> basically any loss-based algorithm, although Lawrence Stuart here at
>> Swinburne showed that H-TCP's concave increase actually causes lower
>> average queueing than Reno in these cases.
>
> How do you define "too aggressive"?

Nothing precise.  Perhaps "transmitting so many packets into the
network that the incremental damage we do to other traffic is greater
than the incremental benefit to ourselves".  In this case, we get no
additional benefit from keeping the average (not peak) buffer
occupancy high, but we cause problems for others.

> It causes delay, by letting the queue overflow, which, as you
> rightly say, is true for any loss-based algorithm.

Loss only requires the peak buffer occupancy to be high -- not the average.

> Having been
> a part of it at Swinburne myself, I happen to know Lawrence's
> investigation quite well  :-)

True.  It was some gratuitous advertising of Lawrence's work to others
on the list...

> I would argue that, while this is an interesting study, trying
> to optimize a mechanism for it is to optimize it for a very
> poorly tuned special circumstances

You're right that we shouldn't optimise for it, but we should consider
it.  It is just an example of a case when making TCP more aggressive
is not appropriate.

<soapbox>
This relates to a point I made on e2e recently.  People have said that
the strength of IP is that it runs over a wide range of existing
networks -- it is an *inter*network protocol.  These links are out
there.  If I asked for a show of hands of who on this list has an
ADSL/cable connection with a large buffer, I think it would be the
majority.  Buffers don't exist solely for the benefit of TCP -- they
existed before Tahoe -- and I don't see that we in the layer-4
community can ignore the decisions that are being made at other
layers.  If we didn't have the problem of designing congestion control
to work over the network that is "out there", then we'd do the design
very differently.
</soapbox>

> where the only real
> solution is to throw all non-delay-based schemes away.

As an aside, there's a big difference between "delay-based schemes"
and schemes which detect persistently high queueing delay.  I strongly
believe that new-generation congestion control should try to use all
available information, including both loss and delay.  Occasional
peaks in delay may be spurious, and so purely delay-based schemes are
problematic.  However, if the delay is consistently high and reducing
our window does not reduce our throughput, then I think we have no
excuse not to reduce our window.
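As a hypothetical sketch of the distinction (mine, not any proposed algorithm): a purely delay-based scheme reacts to each delay sample, whereas the rule above only backs off when the queueing delay is high in nearly all recent samples.

```python
def should_back_off(delay_samples_s, base_rtt_s,
                    thresh_s=0.05, frac=0.9):
    """Hypothetical rule: back off only if queueing delay
    (sample minus base RTT) exceeds thresh_s in at least frac
    of recent samples -- i.e. the delay is *persistently*
    high, not just an occasional spurious spike."""
    high = sum(1 for d in delay_samples_s if d - base_rtt_s > thresh_s)
    return high >= frac * len(delay_samples_s)

# A single spurious spike should not trigger back-off:
print(should_back_off([0.01, 0.2, 0.01, 0.01], base_rtt_s=0.01))  # False
# Consistently high delay should:
print(should_back_off([0.2, 0.2, 0.2, 0.2], base_rtt_s=0.01))     # True
```

The threshold and fraction are arbitrary placeholders; the design point is that filtering out occasional delay peaks removes the usual objection to delay signals.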

> Indeed, the buffer size is the problem, and I don't see how
> this fact changes by saying "if the BDP were higher then the
> size of the buffer would be fine" - because it just isn't higher,
> and if it was, the problem would disappear, both with
> MulTFRC and with all other loss-based mechanisms.

The BDP *is* higher on some paths than on others.  The buffer at a
link is always going to be too big for some paths and too small for
others.  We don't need the case to be as extreme as current ADSL links
for that to remain true.

>>> upper limit, which we recommend to have.
>>
>> That limits the scalability.  It may buy us an extra generation of
>> Ethernet (increase aggressiveness by a factor of 10 to match going
>> from GbE to 10GbE), but doesn't address the inherent scalability
>> problem.
>
> I don't get this, most probably it's a misunderstanding.
> A limit of, e.g., N=6 emulated flows, will always give you
> at least 95% link utilization or more, irrespective of the BDP.

We get 95% utilisation if the only loss is self-induced.  However, if
the BDP is large and there is a loss because another flow slow-started
and then disappeared, then it will take a long time to recover.

That was always the fundamental problem with Reno on high BDP links,
wasn't it?  Without cross traffic, even small buffers give 75%
utilisation for arbitrarily large BDPs under Reno, but it takes hours
to recover from a few seconds' cross traffic on some hop.
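A quick worked example of that recovery time (my own arithmetic, assuming the textbook model of one-packet-per-RTT additive increase from W/2 back to W = BDP):

```python
def reno_recovery_time_s(bdp_pkts, rtt_s):
    """Time for Reno to grow from W/2 back to W = BDP after a
    single loss: W/2 additive-increase steps, one per RTT."""
    return (bdp_pkts / 2) * rtt_s

# LAN: BDP ~ 100 pkts, RTT ~ 1 ms -> recovers in ~50 ms
lan = reno_recovery_time_s(100, 0.001)

# 10 Gb/s path, 100 ms RTT, 1500-byte packets -> BDP ~ 83333 pkts
bdp = int(10e9 * 0.1 / (1500 * 8))
wan_minutes = reno_recovery_time_s(bdp, 0.1) / 60  # over an hour
```

So a single loss induced by a few seconds of slow-starting cross traffic costs milliseconds on a LAN but upwards of an hour on a 10 Gb/s WAN path, which is the scalability problem a fixed N cannot fix.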

>> Of course, for an experimental RFC it need not be the very best
>> solution, but receiving that stamp is a strong endorsement.  I'd be
>> more in favour of a rate-based version of one of the new-generation
>> algorithms already before the ICCRG (C-TCP, CUBIC or H-TCP) or LEDBAT.
>> Once simulation/test-bed studies have shown which of the four options
>> seems most promising for "new TFRC", we can set the best one loose on
>> the internet.
>
> Now that really makes no sense to me, as MulTFRC is
> not a TCP variant, and by no means meant to be one.
> Being slowly reactive, yet having a smooth sending rate
> (which most TCP applications wouldn't care about),
> it is just not designed to replace Reno, C-TCP, CUBIC,
> H-TCP etc. You're comparing apples with oranges here.

I agree that MulTFRC is not a TCP variant, but it is a congestion
control algorithm.  I thought its aim was to be a rate-based companion
to MulTCP.

My argument is that MulTCP has not yet (to my knowledge) been assessed
by the ICCRG, but that the other congestion control algorithms that I
mentioned have.  If we are going to let a new rate-based congestion
control algorithm loose, we SHOULD (not MUST) choose one based on a
carefully evaluated window-based algorithm in preference to a less
well-evaluated one.  That is, the draft should make a case why
rate-based "equivalents" of the approaches taken by CUBIC, C-TCP or
H-TCP are not suitable.

I wasn't suggesting that we use the window-based algorithms themselves
in DCCP.  I was suggesting that we do for them what you have done for
MulTCP**, and then compare all of the results by simulation/testbed
before taking the step of endorsing MulTFRC as experimental.

The logic behind TFRC was that the rate-based and window-based flows
should behave similarly in response to network conditions.  Given
that, we should make next-generation rate-based flows behave similarly
to next-generation window-based flows.  Why experiment with MulTFRC if
MulTCP isn't going to be deployed?

**That is, we should say "given this pattern of loss, what would the
window/RTT be?" and set the rate accordingly.
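For concreteness, this is the shape of the existing TFRC calculation (the TCP throughput equation from RFC 3448, Section 3.1); a "MulTCP-style" or "CUBIC-style" TFRC would substitute that algorithm's loss/window relationship into the same rate-setting step. The parameter values below are illustrative, not from any draft.

```python
from math import sqrt

def tfrc_rate_bytes_per_sec(s, rtt, p, t_rto=None, b=1):
    """TCP throughput equation used by TFRC (RFC 3448, Sec. 3.1):
    estimate the rate a Reno-like TCP would achieve at loss event
    rate p, i.e. derive the window and send at window / RTT.
    s = segment size (bytes), b = packets acked per ACK."""
    if t_rto is None:
        t_rto = 4 * rtt  # common simplification from RFC 3448
    denom = (rtt * sqrt(2 * b * p / 3)
             + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denom

# 1500-byte segments, 100 ms RTT, 1% loss event rate:
rate = tfrc_rate_bytes_per_sec(1500, 0.1, 0.01)  # ~168 kB/s, ~1.3 Mb/s
```

The open question in the paragraph above is exactly which window-based algorithm's response function should sit inside this equation for the next generation.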


Note that these comments are just my opinions (and those of the
devil's advocate) to stimulate wider debate.  Everyone is welcome to
tear them to pieces, not just Michael.

Cheers,
Lachlan

-- 
Lachlan Andrew  Centre for Advanced Internet Architectures (CAIA)
Swinburne University of Technology, Melbourne, Australia
<http://caia.swin.edu.au/cv/landrew> <http://netlab.caltech.edu/lachlan>
Ph +61 3 9214 4837
