[Iccrg] Heresy following "TCP: Train-wreck"

Bob Briscoe rbriscoe at jungle.bt.co.uk
Fri Apr 4 14:53:43 BST 2008


Matt,

From one heretic to another...

0) Priority to shortest jobs
In your bullet list you missed a point that I believe is the most 
important one: the arrivals and departures of whole flows. If you 
have a mix of flow activity at a shared bottleneck, some flows 
continuously streaming and some intermittent (as we do on the 
Internet), you can make the intermittent flows go much, much faster 
while hardly prolonging the completion time of the continuous flows. 
It's totally unfair (and very inefficient) for the intermittent flows 
to get the same bit rate as the continuous flows, because their 
behaviour is much more multiplexable. A picture may help: 
<http://www.cs.ucl.ac.uk/staff/B.Briscoe/presents/0801cfp/shortestJobsPriority.png>
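
To put rough numbers on it, here's a back-of-envelope sketch in 
Python (illustrative figures of my own, not the ones behind the 
slide):

  # One continuous 1 GB flow shares a 10 Mb/s bottleneck with 100
  # intermittent 1 MB flows that arrive well spread out in time.
  LINK = 10e6 / 8             # bottleneck capacity, bytes/s
  LONG = 1e9                  # size of the continuous flow, bytes
  SHORT, N = 1e6, 100         # size and number of intermittent flows

  # The link is work-conserving, so whichever schedule we pick, the
  # flow that finishes last (the continuous one) completes at the
  # same time:
  t_long = (LONG + N * SHORT) / LINK       # = 880 s either way

  # Equal per-flow sharing: a short flow only ever gets half the link.
  t_short_fair = SHORT / (LINK / 2)        # = 1.6 s each

  # Priority to the short (more multiplexable) flows: each gets the
  # whole link, and the continuous flow soaks up whatever is left.
  t_short_prio = SHORT / LINK              # = 0.8 s each

  print(f"short flows: {t_short_fair:.1f} s -> {t_short_prio:.1f} s")
  print(f"continuous flow: ~{t_long:.0f} s in both cases")

The intermittent flows halve their completion times, while the 
continuous flow finishes at the same moment under either discipline.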

If you're intrigued by how we'd move to such a world, I recently 
posted a bit more on this here: 
<http://episteme.arstechnica.com/eve/forums?a=tpc&s=50009562&f=174096756&m=703000231931&r=160007441931#160007441931>

Now thoughts on your main points:

1) Fair queuing
Many broadband deployments already do some form of per-flow FQ 
(usually WFQ) at the most likely bottleneck (the broadband remote 
access server, BRAS). I'm less sure what common practice is in 
broadband cellular networks. I believe WFQ is not such an obvious 
choice for radio access because of the unpredictable radio link rate. 
I think there is a range of schemes, some distributed at the base 
station and others centralised at the radio network controller (RNC).
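
For concreteness, here's a minimal sketch of deficit round robin 
(DRR), one common way per-flow fair queuing is realised - my own 
illustration, not any particular vendor's BRAS code:

  from collections import deque

  class DRR:
      """Deficit round robin: each backlogged flow earns `quantum`
      bytes of credit per round, so long-run throughput is shared
      equally per flow regardless of packet sizes."""
      def __init__(self, quantum=1500):
          self.quantum = quantum
          self.queues = {}         # flow -> deque of packet sizes (bytes)
          self.deficit = {}        # flow -> unspent credit (bytes)
          self.active = deque()    # backlogged flows, round-robin order

      def enqueue(self, flow, size):
          if not self.queues.get(flow):
              self.active.append(flow)      # flow becomes backlogged
          self.queues.setdefault(flow, deque()).append(size)
          self.deficit.setdefault(flow, 0)

      def service_round(self):
          """Visit each backlogged flow once; send what credit allows."""
          sent = []
          for _ in range(len(self.active)):
              flow = self.active.popleft()
              q = self.queues[flow]
              self.deficit[flow] += self.quantum
              while q and q[0] <= self.deficit[flow]:
                  self.deficit[flow] -= q[0]
                  sent.append((flow, q.popleft()))
              if q:
                  self.active.append(flow)  # still backlogged
              else:
                  self.deficit[flow] = 0    # drained: credit expires
          return sent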

Incidentally, coming back to the first point, I recently realised 
that most deployed per-flow FQ tends to help short-duration flows, by 
giving a flow higher priority at its start and reducing that priority 
the longer the flow continues. Although this helps a little today, 
it's also a huge potential problem for the future - the same problem 
as TCP: it still converges on the incorrect goal of sharing out 
bit rate equally per flow at the bottleneck - though at least at a 
BRAS it's done separately for each user.

My point about priority to shortest jobs (and your point about huge 
differences between numbers of flows per app) shows that flow rates 
need to be /very/ unequal to be fair. So per-flow FQ embedded in the 
network will be fighting what we really need the transport to do in 
future (though operators would obviously turn off per-flow FQ if they 
were trying to encourage the new world I'm talking about).
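
A toy calculation (invented numbers) shows how skewed equal per-flow 
rates are when viewed per user:

  # Two users behind one 100 Mb/s bottleneck, equal rate per *flow*.
  flows = {"p2p user": 100, "web user": 2}   # simultaneous flows each
  total = sum(flows.values())
  for user, n in flows.items():
      print(f"{user}: {100 * n / total:.1f} Mb/s")
  # p2p user: ~98 Mb/s, web user: ~2 Mb/s - a 49:1 split between
  # users, from a scheduler that is nominally 'fair'.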


2) Edge bottlenecks protecting core
Although it would be nice, we can't mandate that the core be 
protected by bottlenecks at the edge. Technology limits and economics 
determine these things, not the IETF/IRTF:
* At present, technology trends are moving bottlenecks gradually 
closer to the core as access rates increase, because the newest 
access now uses the same technology as the core (optics) and there's 
nothing faster on the horizon.
* However, economics always pushes the bottleneck location the other 
way - outwards. The cost of a bps of logical channel capacity follows 
roughly a square-root (?) law with respect to the bandwidth of the 
physical pipe in which the logical channel sits. That is, where you 
need many physical cables/fibres to cover a dispersed area, the cost 
per bps is much greater than where each bps can be provided within a 
few large pipes (see the toy calculation below).
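
As a toy example of what that exponent implies (taking the square 
root at face value):

  from math import sqrt
  # cost per bps ~ k / sqrt(pipe capacity), for some constant k
  access, core = 100e6, 10e9   # 100 Mb/s access pipe vs 10 Gb/s core
  print(f"an access bps costs ~{sqrt(core / access):.0f}x a core bps")
  # -> ~10x: capacity 100x bigger, cost per bps ~10x smaller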

The net effect of these two opposing forces is currently pushing 
bottlenecks inwards - particularly onto border routers. The message 
of a talk I gave at a recent workshop (on photonics research and 
future Internet design) was that the challenge is to find ways to 
push the complex trust-related functions currently done at border 
routers outwards to the edge, so that we can have dumb all-optical 
interconnection without electronics: 
<http://www.cs.ucl.ac.uk/staff/B.Briscoe/present.html#0709ecoc-fid>

3) RTT fairness
I'd say this is only a small part of the problem, because it's 
relatively easy to solve in the transport alone - e.g. FAST TCP 
[Jin04:FAST_TCP] ensures its dynamics are slower over longer RTTs 
but, even though it takes longer to get there, it ends up at the same 
rate as competing FAST TCP flows with shorter RTTs.
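
To illustrate, here's a toy discrete-time version of the window 
update in [Jin04:FAST_TCP] (the parameters and the simple queue 
model are my own, not the paper's):

  # FAST update: w <- min(2w, (1-g)*w + g*(baseRTT/RTT * w + alpha)).
  # Two flows with a 20x difference in base RTT share one bottleneck;
  # at equilibrium each keeps alpha packets queued, so both converge
  # to the same rate x = alpha/q whatever their propagation delay.
  C, alpha, g = 1000.0, 50.0, 0.5  # link pkt/s, pkts queued/flow, gain
  base = [0.010, 0.200]            # 10 ms vs 200 ms base RTT
  w = [10.0, 10.0]                 # congestion windows, packets

  def queue_delay(w):
      """Bisect for the q >= 0 at which sum_i w_i/(base_i + q) = C."""
      lo, hi = 0.0, 10.0
      for _ in range(60):
          mid = (lo + hi) / 2
          if sum(wi / (b + mid) for wi, b in zip(w, base)) > C:
              lo = mid
          else:
              hi = mid
      return lo

  for _ in range(500):
      q = queue_delay(w)
      w = [min(2 * wi, (1 - g) * wi + g * (b / (b + q) * wi + alpha))
           for wi, b in zip(w, base)]

  q = queue_delay(w)
  for wi, b in zip(w, base):
      print(f"baseRTT {b * 1000:3.0f} ms -> {wi / (b + q):.0f} pkt/s")
  # both flows settle at ~500 pkt/s (= C/2), despite the 20x RTT gap

The longer-RTT flow takes more iterations to get there, but the 
equilibrium rates match - which is the point.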


Bob

[Jin04:FAST_TCP] Cheng Jin, David X. Wei and Steven H. Low, "FAST 
TCP: Motivation, Architecture, Algorithms, Performance", in Proc. 
IEEE Conference on Computer Communications (INFOCOM'04), March 2004.



At 17:35 02/04/2008, Matt Mathis wrote:
>I just attended the "The Future of TCP: Train-wreck or Evolution?" 
>at Stanford last week, and it solidified my thoughts on a subject 
>that is sure to be controversial.
>
>I think it is time to abandon the concept of "TCP-Friendly" and 
>instead expect the network to protect itself and other users from 
>aggressive protocols and applications.  For the moment I am going to 
>assume two mechanisms, although I suspect that there will prove to be more.
>
>1) Deploy some form of Fair Queuing at the edges of the network.
>
>2) Protect the core by bottlenecks at the edges of the Internet.
>
>I observe that both of these mechanisms are already being 
>implemented due to existing market forces, and the natural 
>consequence of their implementation is to make TCP-friendliness a 
>whole lot less important.  I admit that it is not clear at this
>point if these two mechanisms will ultimately prove to be sufficient
>to address fairness in all situations, such as overloaded core
>routers, but I suspect that sufficient mechanisms do exist.
>
>Supporting arguments:
>
>FQ is already being deployed at the edges to solve several existing 
>and growing fairness problems:
>
>* Non-IETF, UDP protocols that are non-responsive.
>
>* P2P and other applications that open huge numbers of connections.
>
>* Stock TCP is egregiously unfair when very short RTT flows compete with wide
>   area flows.  This can be a real killer in a number of settings such as data
>   centers and university campuses.  The symptoms of this problem will become
>   more pronounced as TCP autotuning continues to be rolled out in Vista,
>   Linux, and various BSDs.
>
>* Autotuning will also greatly magnify RFC 2309 [1] problems, since every
>   single TCP flow with sufficient data will cause congestion somewhere in the
>   network.  At the very least this will gradually force the retirement of
>   drop-tail equipment, creating the opportunity for RED and/or FQ.  Since RED
>   by itself is insufficient to solve the other fairness problems, it will not
>   be the first choice replacement.
>
>I should note that "Fair Queuing" is overly specific.  The network
>needs to do something to large flows to prevent them from 
>overwhelming smaller flows and to limit queue occupancy.  FQ is one 
>way, but there are others.
>
>If you have ever shared a drop-tail home router with a teenager, you 
>might have observed some of these issues first hand, as has 
>Comcast.[2] As I understand it, some form of enforced fairness is 
>now part of all commercial broadband services.
>
>The core of the Internet is already mostly protected by bottlenecks 
>at the edges.  This is because ISPs can balance the allocation of 
>revenue from customers between the relatively expensive access link, 
>its own backbone links and interconnections to other ISPs.  Since 
>congestion in the core has proven to cause complaints from 
>commercial customers (and perhaps SLA problems), most providers are 
>careful to keep adequate capacity in the core, and can do so pretty 
>easily, as long as their duty cycle models hold true.
>
>Are these two mechanisms sufficient to make TCP-friendliness 
>completely moot? Probably not.
>
>We still have some work to do:
>
>First, stop whining about non-TCP-friendly protocols.  They are here
>to stay and they can't hear us.  We are wasting our breath and 
>impeding real progress in well designed alternatives to 
>"TCP-friendly".  This concept came from an era when the Internet was 
>a gentleman's club, but now it needs to be retired.
>
>Second, blame the network when the network deserves it.  In
>particular, if there are drop-tail queues without AQM, be very
>suspicious of RFC 2309 problems.  In fact every drop-tail queue
>without AQM should be viewed as a bug waiting to bite
>someone.  Likewise remember that "TCP-friendly" is extremely unfair
>when the RTTs are extremely different.
>
>Third, think about the hard cases: overloaded interconnects,
>failure conditions, etc.  Can FQ be approximated at core
>scales?  Where else are my proposed mechanisms insufficient?  I'm
>sure there are some.
>
>Fourth, start dreaming about what it would take to make Moore's law 
>apply to end-to-end protocol performance, as it does to just about 
>everything else in the computing universe.  I suspect that in some 
>future hindsight, we will come to realize that TCP-friendly was
>actually an untenable position, and has held us back from important
>innovations.
>
>[1] RFC2309 "Recommendations on Queue Management and Congestion 
>Avoidance in the Internet", Bob Braden, et al.
>
>[2] Richard Bennett "New and Improved Traffic Shaping" 
>http://bennett.com/blog/index.php/archives/2008/03/27/new-and-improved-traffic-shaping/
>------------------
>
>It was a very stimulating conference!
>Thanks Nandita and everyone else who made it happen!
>--MM--
>-------------------------------------------
>Matt Mathis     http://staff.psc.edu/mathis
>Work:412.268.3319    Home/Cell:412.654.7529
>-------------------------------------------
>
>
>_______________________________________________
>Iccrg mailing list
>Iccrg at cs.ucl.ac.uk
>http://oakham.cs.ucl.ac.uk/mailman/listinfo/iccrg

____________________________________________________________________________
Bob Briscoe, <bob.briscoe at bt.com>      Networks Research Centre, BT Research
B54/77 Adastral Park,Martlesham Heath,Ipswich,IP5 3RE,UK.    +44 1473 645196 




