Fwd: [Iccrg] Meeting in Tokyo, with Pfldnet, 20 May

Tue Apr 21 21:19:13 BST 2009

Sorry for the long delay...

On Tue, 7 Apr 2009, Bob Briscoe wrote:

> At 21:08 06/04/2009, Matt Mathis wrote:
>> I suggest that we start this conversation on the list...
>> 
>> As I see it, the first question is just trying to inventory the 
>> dimensionality of the transition space.  Some possible items (not in 
>> order):
>> 
>> Getting the IETF to agree on a new standard ECN
>
> I don't see changing ECN as necessary.

Ok, perhaps I should have worded it "agree to RE-ECN".
Some components might be the same, but the *system* is different.

>> In the network:
>>         Turning on ECN marking
>>         [NOT] Deploying rate fairness
>>         Updating marking algorithms to a new operating regime
>
> We've been at great pains to ensure the evolution path keeps forwarding 
> elements with ECN as is. We have enough problems getting ECN into L2 queues 
> as it is without changing the target mid-stream.

Agreed, however I suspect that people will find that more aggressive transport 
protocols are better behaved with more aggressive router responses.   I think 
this is likely to happen naturally later in the evolution.  At common scales 
I suspect that current ECN marking is just fine, but yes it has to be 
evaluated.

> We had worked out an early version of re-ECN in 2003. But until we worked out 
> a way to retrofit it to the Internet without requiring forwarding elements to 
> be changed, we didn't come to the IETF/IRTF. That's why we first appeared in 
> late 2005, when we had worked out how to introduce re-ECN only requiring a 
> change at the sender.

Note that there is a difference between "not requiring forwarding element 
changes" and being optimal.

> If we have to get a new marking behaviour it will never happen (IMHO). The 
> chances of getting r queues on a path all upgraded if each is upgraded with 
> independent probability p is p^r. Which for small p (say 10%) and even 
> reasonably small r (say 5) is vanishingly small (0.001%).

It is only needed at bottlenecks....
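
To put numbers on the two positions above, here is a minimal sketch. It 
assumes each queue is upgraded independently with probability p (the model 
in the quoted paragraph) and, as a simplification, that a path has a single 
bottleneck queue. The function name is illustrative.

```python
def prob_all_upgraded(p: float, r: int) -> float:
    """Probability that r independently upgraded queues are all upgraded."""
    return p ** r

p = 0.10   # per-queue upgrade probability (the 10% from the example above)
r = 5      # queues on the path

all_queues = prob_all_upgraded(p, r)       # every queue on the path
bottleneck_only = prob_all_upgraded(p, 1)  # assumption: one bottleneck queue

print(f"all {r} queues upgraded: {all_queues:.3%}")
print(f"bottleneck only:        {bottleneck_only:.0%}")
```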

> But anyway, what makes you imagine we need to change network ECN behaviour?
>
>> Updating TCP (etc) receiver behavior in current stacks:
>>         From one ECN signal per RTT to counting signals per RTT
>> Updating TCP (etc) sender behavior in current stacks:
>>         Add RE-ECN marking
>>         Transition Congestion control from 1/sqrt(p) to 1/p
>>              How can we manage the co-existence?
>>         Adding a weight parameter to CC and the TCP API
>>         Getting OS vendors to turn ECN on
>
> I see a much easier transition. Already networks allow all sorts of end 
> system behaviours, while generally limiting users who are heavy overall. They 
> usually define heavy as lots of byte volume at peak times.
>
> For me, there is no problem moving from TCP-friendly to some other definition 
> of fairness, because no *network* uses TCP-friendly as a guiding star anyway. 
> That star was only followed in IETF-land and in the research community, so 
> they are the only lands that need to remove TCP from their mind-set. And, as 
> you heard from the hum in the transport area at San Francisco, those in the 
> room at the IETF seem to have abandoned TCP-friendly already :)
>
> Certainly, there are as many reasons for why no-one hummed for TCP-friendly 
> as there are people. And as many views on what direction we should take 
> instead. But IMHO we don't need to worry about introducing something that's 
> 1/p, rather than 1/sqrt(p), per se.
>
> We do have to avoid introducing something that
> a) continually ratchets up relative to competing transports (no stable 
> equilibrium)
> b) reduces the share of existing TCPs to near-nothing under normal operating 
> conditions.
>
> To me, that's a problem of choice of a constant, not the shape of the 
> response curve.

There is no constant W such that W/p is similar to 1/sqrt(p) across any 
significant portion of the global Internet.  If you know E(sqrt(p)) in some 
local region you can set W to that value (which happens to be the reciprocal of 
the expected window size in packets).  As you note this is probably good enough 
for a provider with relatively uniform customers (such as BT), however in the 
global Internet p varies by at least 8 orders of magnitude and the "correct" 
value of W probably spans 4 orders of magnitude.  Picking a 
"one-size-fits-all" value doesn't work.

If you add an adaptive controller to estimate W at run time, then yes indeed 
you might be able to mimic the response of AIMD(1,0.5) congestion control. 
However, since the window adjustments (and losses) are constrained to be 
integral numbers of packets, the effective sampling frequency would drop to be 
the same as AIMD. (This is where the fluid approximation becomes useless). 
The whole point of 1/p congestion control is to get sampling frequency up high 
enough to be useful.  In the region where AIMD congestion control has an 
unreasonable sampling frequency it will be crushed by any loss based 
congestion control that has a reasonable sampling frequency if the network 
sends the same signals (per packet independent) to each.  (Strictly speaking 
it comes into equilibrium, not "ratcheting", but the distinction is likely to 
be lost on the users).
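
The sampling-frequency point can be illustrated with a small sketch. It 
assumes the usual AIMD approximation window = sqrt(3/(2p)) packets and a 
hypothetical 1/p control with window = W/p; the expected number of congestion 
signals per round trip is then p times the window. The value W = 1 and the 
function names are assumptions for illustration only.

```python
import math

def aimd_window(p: float) -> float:
    """Approximate AIMD(1, 0.5) window in packets at signal probability p."""
    return math.sqrt(1.5 / p)

def signals_per_rtt(p: float, window: float) -> float:
    """Expected congestion signals per round trip for a given window."""
    return p * window

W = 1.0  # hypothetical weight for the 1/p control

for p in (1e-2, 1e-4, 1e-6, 1e-8):
    aimd = signals_per_rtt(p, aimd_window(p))
    inv_p = signals_per_rtt(p, W / p)
    print(f"p={p:.0e}  AIMD: one signal every {1 / aimd:,.0f} RTTs  "
          f"1/p control: {inv_p:.1f} signals per RTT")
```

Under these assumptions the AIMD signal rate falls as sqrt(p), while the 1/p 
control sees W signals per round trip at every scale.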

>> Note that several of these should be more finely subdivided, and I am 
>> missing a bunch such as enforcing RE-ECN at borders as well as other 
>> potential inhibitory deployments.
>> 
>> Bob can you extend the list?
>
> I'd rather start a route plan that sets off in a different initial direction 
> to get to the same eventual end point. It starts in the policing direction 
> the industry is already taking. It allows a lot of leeway and slop on the 
> way.
>
> 1/ Get operators to understand the benefits of moving from policing volume to 
> policing congestion-volume, initially by counting congestion locally at 
> congested resources (ie without transparency of policing to the end-user, 
> which they don't care about now anyway). They can make this move 
> unilaterally.
>
> We're starting this in the MIT-based group I mentioned.

This is profoundly important if it can be done entirely locally.  This 
device/function needs a crisp name and to be written up really clearly for 
several different audiences.  Note that deploying it actually entails several 
sub-steps: convincing the operators that it is correct, getting them to put 
it in RFQs with enough weight that the router vendors build the gear, and 
then actually deploying it.

I'd really like to see this algorithm.  Is it patented?  How much state is 
needed?  Can it be implemented at highly aggregated core routers?  How would 
it respond to relentless-TCP?

It is not my intent to prescribe what the network has to do, only that it does 
something to enforce some form of fairness.  I use WFQ etc for illustration 
because they are known by the community and easy to understand.

> 2/ Develop new (1/p) transports with weight (constant of proportionality) 
> set low, to be close to equal rate with a competing TCP flow under 'average' 
> conditions.

Such a constant doesn't exist for the reasons I indicated above.  Average for 
DSL != average for DOCSIS != average for FIOS.  Also average for GB != average 
for USA != average for Korea.
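
The reason no such constant exists can be sketched numerically. This assumes 
the simplified models window = 1/sqrt(p) for AIMD (constant factor dropped) 
and window = W/p for the new control; both models and all names are 
illustrative assumptions. Equating the two gives W = sqrt(p), so eight 
decades of p map to four decades of W.

```python
import math

def matching_weight(p: float) -> float:
    """W for which W/p equals 1/sqrt(p) at this loss/mark probability."""
    return math.sqrt(p)

def window_ratio(p: float, w: float) -> float:
    """(W/p) window divided by (1/sqrt(p)) window; 1.0 means equal share."""
    return (w / p) / (1.0 / math.sqrt(p))

probs = [10.0 ** -k for k in range(1, 10)]  # p from 1e-1 to 1e-9
w_fixed = matching_weight(1e-4)             # tuned for p = 1e-4 only

for p in probs:
    print(f"p={p:.0e}  matching W={matching_weight(p):.1e}  "
          f"ratio with fixed W={window_ratio(p, w_fixed):.1e}")
```

A weight tuned for one region gives the new transport far too little at 
higher p and far too much at lower p.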

> Current volume policing will approximately incentivise these transports not 
> to set their weight too high without some special arrangement (eg. paying 
> more), merely because they will otherwise transfer lots of volume over time 
> and get stopped.

I fail to see how this adaptation will take place.   The end users can't tune 
their systems.  The application developers can't anticipate where their users 
are or what their network looks like.  If you make W adaptive, then the system 
reverts to 1/sqrt(p), with all of the problems that we want to avoid.

> Where there are special arrangements (experiments and real life), these 
> transports can be used for hi-speed by setting their weight to where you 
> would want it set eventually. They will create a lot more loss/ECN, and go a 
> lot faster, than competing TCPs, but that's the intention - only for where 
> there's plenty of capacity that TCP isn't using well.

This I agree with.

> On low capacity paths where such hi-speed isn't appropriate, it's up to 
> operators to limit users, which they have to do today - based on volume or 
> per-user rate.

Right - I'd really like to see an algorithm to do this based on congestion 
volume.   But somehow this feels circular to me.
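
For concreteness, one possible shape for such a policer is a per-user token 
bucket that is drained only by bytes that experienced congestion (dropped or 
ECN-marked). This is a hypothetical sketch, not the algorithm Bob refers to; 
the class name, the fill rate, the bucket depth and the way congestion is 
detected are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class CongestionVolumePolicer:
    """Token bucket drained by congested bytes only (illustrative)."""
    fill_rate: float   # allowed congestion-volume, bytes per second
    depth: float       # maximum stored allowance, bytes
    tokens: float = 0.0
    last_time: float = 0.0

    def __post_init__(self) -> None:
        self.tokens = self.depth  # start with a full allowance

    def allow(self, now: float, size: int, congested: bool) -> bool:
        """Return True while the user is within the allowance."""
        # Refill for the elapsed time, capped at the bucket depth.
        elapsed = now - self.last_time
        self.tokens = min(self.depth, self.tokens + elapsed * self.fill_rate)
        self.last_time = now
        # Uncongested packets cost nothing, whatever their volume.
        if congested:
            self.tokens -= size
        return self.tokens >= 0.0

policer = CongestionVolumePolicer(fill_rate=1000.0, depth=3000.0)
print(policer.allow(0.0, 1500, congested=False))  # volume alone is free
print(policer.allow(0.1, 1500, congested=True))   # congested bytes are charged
```

State in this sketch is two numbers per policed user; whether that is 
workable at highly aggregated routers is exactly the open question.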

> 3/ A missing piece is a signal from policer to transport, so the transport 
> knows when loss is because it has tripped a policing trigger (or close to it) 
> rather than just experiencing congestive loss.
>
> I've recently worked out half an idea for an in-band signal, but it uses a 
> re-ECN codepoint - we'd need something for today first. Preferably in-band, 
> so it just works. I think Dave Oran has ideas in this space too.

So this algorithm is orthogonal to RE-ECN.  RE-ECN permits the policing to be 
done anywhere in the network and allows aggregation, but is independent of the 
design of the congestion controller.  Correct?

> 4/ In parallel, define an experimental extension header for re-ECN in IPv6 
> only.
> Reason: experiments are no longer allowed with IPv4 header fields (RFC4727), 
> even tho the specific field re-ECN needs (the evil bit) isn't mentioned.

I guess that depends on how you interpret RFC 4727.

> 5/ Deploy experimental boundary policers using the congestion-volume info 
> provided by re-ECN (v6)
>
> As operators protect their v6 networks with edge congestion policers, 
> they can remove all the other interim ad hoc policing stuff:
> - volume crossing the trust boundary
> - rate crossing the trust boundary
> - congestion at specific resources
>
> The main difference is that policing moves to being based on a transparent 
> metric that all sides can see and trust (re-ECN marks). And the metric 
> represents marginal cost of usage rather than very poor ad hoc 
> approximations. A competitive market is trying to drive providers to cost, so 
> this will be a natural end-point that the invisible hand of the market will 
> want to drive the Internet towards.

Yes, one would hope that deploying gear to do it right will cause older, less 
correct equipment to be retired.  The critical question is how much does the 
incorrect gear cause barriers to deployment for the newer algorithms?  Can 
the new and the old completely coexist?

> 6/ Apps find they can start increasing the weight of the new transports 
> defined at #2 without tripping off operator alarms (#3). So they do.

This step will be tested continuously, starting yesterday.

> 7/ If the experiments in v6 are successful, one assumes it will become de 
> facto used in v6. Then we need to define re-ECN for IPv4 too.

I thought you said you didn't want to change ECN?   :-)

> 8/ Let new congestion controls evolve however the hell they want within the 
> new incentive framework that encourages responsiveness to congestion and 
> enforces it in extremum. There will be good reason for transports to respond 
> as (1/p), but we don't have to mandate that. Any response (1/p^b) is OK (0 < 
> b <= 1). We want to allow new apps to evolve in the mix. But the IETF/IRTF 
> needs to define best practice and some good generic transport(s).
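
The family of responses named here can be written as rate proportional to 
w / p^b. A minimal sketch, with the weight w and the function name as 
illustrative assumptions, showing how far each choice of b backs off when the 
congestion level doubles:

```python
def response_rate(p: float, b: float, w: float = 1.0) -> float:
    """Relative sending rate w / p**b for congestion level p (0 < b <= 1)."""
    if not 0.0 < b <= 1.0:
        raise ValueError("b must satisfy 0 < b <= 1")
    return w / p ** b

for b in (0.25, 0.5, 1.0):
    backoff = response_rate(0.02, b) / response_rate(0.01, b)
    print(f"b={b}: rate falls to {backoff:.0%} when p doubles")
```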

> As you can see, overall this is a more liberal path, which I believe better 
> matches the way networks and their customers will push each other naturally.

I think your interpretation of it being more liberal is only due to missing 
details in both.

> My working model is that no-one will do what the IETF mandates, but they 
> might use what the IETF provides if it is in their interests for 
> interworking. The IETF only provides the tools for the battle, not the 
> behaviour guidelines.

I would concur with this.

>> At this stage I would add items (and not replace existing items) even if 
>> they are not orthogonal.  It would be better to make the coverage as 
>> complete as possible even if some of the items are redundant.
>> 
>> One task for the meeting would be trying to imagine a partial ordering on the 
>> smaller steps, such that each one is motivated by the preceding steps, and 
>> not excessively inhibited by any negating technology.

> We have different views on the ordering of the big steps to deal with first 
> :)

No, deciding which steps to do first presupposes the outcome.  I would rather 
start from an inventory of all possible first (and other) steps and then 
as a separate process weed out the ones that lead to bad outcomes or are 
infeasible.   This is in the style of brainstorming, where known bad ideas are 
not discarded until later in the process, because sometimes they stimulate 
additional good ideas.

>> Thanks,
>> --MM--
>> -------------------------------------------
>> Matt Mathis     http://staff.psc.edu/mathis
>> Work:412.268.3319    Home/Cell:412.654.7529
>> -------------------------------------------
>> Evil is defined by mortals who think they know
>> "The Truth" and use force to apply it to others.
>> 
>> On Thu, 26 Mar 2009, Bob Briscoe wrote:
>> 
>>> Matt,
>>> 
>>> It would be useful to get a 'design team' together to thrash out the 
>>> roadmap to get from the death of TCP-friendliness to somewhere else.
>>> - You have your path.
>>> - I have mine.
>>> - Perhaps others have theirs.
>>> - They all have common points and differences.
>>> 
>>> A roadmap is about all the places we could visit and which ones are places 
>>> of interest and which ones are shit-holes. It's not a route-plan. We don't 
>>> all have to agree on the destination, or on the path. Just the pits to 
>>> avoid and the high points we don't want to block off. For instance, 
>>> identifying that travelling via place M precludes getting to place Z.
>>> 
>>> Would Tokyo be a good venue? I don't know who will be there. Or should we 
>>> wait until an ICCRG co-located with an IETF? Or should we say Tokyo is 
>>> where the discussion will start, so come if you want to start the journey? 
>>> I would come if we did that. But currently I have no plans to be in Tokyo, 
>>> just because I'm travelling too much.
>>> 
>>> 
>>> Bob
>>> 
>>> 
>>>> From: Michael Welzl <michael.welzl at uibk.ac.at>
>>>> To: iccrg at cs.ucl.ac.uk
>>>> Organization: University of Innsbruck
>>>> Date: Tue, 24 Mar 2009 17:17:43 +0100
>>>> Subject: [Iccrg] Meeting in Tokyo, with Pfldnet, 20 May
>>>> 
>>>> Dear all,
>>>> As announced quite some time ago, and not mentioned again
>>>> for a long time - sorry!! - we will have the next ICCRG
>>>> meeting in Tokyo, co-located with PFLDNet; details:
>>>> http://www.hpcc.jp/pfldnet2009/Top.html
>>>> I will be your host  :)
>>>> Please send suggestions for agenda items to Wes
>>>> and me ASAP so that we can start building it. Thanks!
>>>> Cheers,
>>>> Michael
>>>> 
>>> 
>>> ________________________________________________________________
>>> Bob Briscoe,               Networks Research Centre, BT Research