[Iccrg] CUBIC I-D feedback from an implementor's perspective

Thu Oct 30 16:31:52 GMT 2008

Thanks, Lawrence!

Please see my comments below.
Lisong

Lawrence Stewart wrote:
> Hi All,
> 
> I've been working on and off independently implementing and testing some
> of the proposed congestion control algorithms in FreeBSD as part of our
> NewTCP research project [1]. So far, I've completed a functional HTCP
> and I'm currently working on CUBIC with others planned as future work. I
> have some feedback for the HTCP and CUBIC draft authors based on my
> implementation experiences. Below are my current feedback notes for CUBIC.
> 
> All comments below relate to draft-rhee-tcpm-cubic-01.txt
> 
> 
> Units for variables
> -------------------
> 
> The correct units for variables appear to be ambiguous. (For example, I
> presume cwnd is in pkts, and time t is in seconds - but I couldn't
> find that clearly stated in the I-D.)
> 
> For example in the BSD stack, cwnd is calculated and used in
> bytes, and time is calculated in terms of kernel ticks. However putting
> cwnd in bytes into the CUBIC equations produces results that I believe
> are demonstrably incorrect, whereas using pkts seems to produce expected
> results. Similar problems arise using ticks instead of regular time.
> 
> It might even be worth providing equivalent functions for both pkt and
> byte based cwnd so that the implementor doesn't have to think too hard.
> At the moment, I'm just converting to pkts by dividing cwnd by smss, but
> it's not ideal. And again, similar issues exist using ticks for time.
> 

The unit of Cwnd does not matter for any equation in Section 3. When we 
give specific examples of Cwnd in Section 4, such as in Tables 1 and 2, 
the unit of Cwnd is MSS, as stated in the tables.
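
For a stack that keeps cwnd in bytes and time in kernel ticks, the 
conversion Lawrence describes (dividing cwnd by smss, and turning ticks 
into regular time) could be isolated in small helpers. This is only a 
minimal sketch: the function names and the hz (ticks per second) 
parameter are assumptions for illustration and are not taken from the 
I-D or from any particular stack.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; all names here are illustrative assumptions. */

/* Byte-based cwnd -> whole segments. Integer division truncates, so a
 * partial segment is dropped. */
static uint32_t cwnd_bytes_to_segs(uint32_t cwnd_bytes, uint32_t smss)
{
    assert(smss > 0);
    return cwnd_bytes / smss;
}

/* Segment-based cwnd -> bytes, widened to avoid 32-bit overflow. */
static uint64_t cwnd_segs_to_bytes(uint32_t cwnd_segs, uint32_t smss)
{
    return (uint64_t)cwnd_segs * smss;
}

/* Kernel ticks -> milliseconds, given hz ticks per second. */
static uint64_t ticks_to_ms(uint32_t ticks, uint32_t hz)
{
    assert(hz > 0);
    return ((uint64_t)ticks * 1000u) / hz;
}
```

Because the byte-to-segment conversion truncates, converting to 
segments and back does not in general return the original byte count.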

> 
> Convention for "beta"
> ---------------------
> 
> The I-D does not use the convention that beta represents the
> multiplicative decrease factor (i.e. cwnd = cwnd * beta on congestion).
> It instead uses beta in the following context: cwnd = cwnd * (1-beta) on
> congestion.
> 
> It took me a while to wrap my head around the I-D's use of the term 
> "multiplicative decrease factor" for beta in this context. I would 
> suggest that it would reduce confusion and implementation errors to 
> either refer to beta in the more usual context i.e. make it 0.8 and 
> modify the I-D's equations appropriately, or change the variable in the 
> I-D to be called something else.
> 
> If the community feels the draft's use of beta is appropriate, then
> perhaps we should standardise the meaning of "beta" somewhere for people
> writing/reading congestion control related documents to cite/refer to so 
> as to avoid each draft having its own interpretation.
> 

You are right that maybe we should standardize it. I myself am sometimes 
confused by the definition of this "multiplicative decrease factor" :-)

> 
> Overloaded variables
> --------------------
> 
> There appears to be overloaded meaning given to some variables. For
> example, as I read it, "beta" in equation 2 is supposed to be cubic beta
> (0.2), but equation 3 refers to a generic beta and in equation 4 beta
> represents reno beta (0.5).
> 
> I would strongly suggest that the draft explicitly differentiate the
> references to beta (e.g. use "cubic_beta" and "reno_beta") and the
> values they should take.
> 
Yes, this is a good suggestion.

> 
> Parentheses in equations
> ------------------------
> 
> I'd suggest adding parentheses to equations (e.g. 3 and 4 in particular)
> to completely rule out any possibility of ambiguous interpretation. For
> example with equation 4, is the last term [(3*beta)/(2-beta)] * (t/RTT)
> or is it (3*beta)/[(2-beta) * (t/RTT)]?
> 

    (alpha/2 * (2-beta)/beta * 1/p)^0.5 (Eq. 3)
is
    [(alpha/2) * ((2-beta)/beta) * (1/p)]^0.5 (Eq. 3)

    W_tcp(t) = W_max*(1-beta) + 3*beta/(2-beta)* t/RTT (Eq. 4)
is
    W_tcp(t) = W_max*(1-beta) + [3*beta/(2-beta)]*(t/RTT) (Eq. 4)
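
The fully parenthesized form of Eq. 4, together with the I-D's 
convention that cwnd becomes cwnd * (1-beta) on congestion, can be 
written out in code. This is a minimal floating-point sketch for 
checking values outside the kernel. The function names are assumptions, 
and beta is passed in as a parameter so that the caller decides which 
value (for example a "cubic_beta" or a "reno_beta") applies.

```c
#include <assert.h>

/* Hypothetical names; the draft uses a single symbol "beta". */

/* Window after a congestion event under the I-D's convention,
 * i.e. cwnd * (1 - beta), not cwnd * beta. */
static double cwnd_after_loss(double cwnd, double beta)
{
    assert(beta > 0.0 && beta < 1.0);
    return cwnd * (1.0 - beta);
}

/* Eq. 4 with every grouping explicit:
 * W_tcp(t) = (W_max*(1-beta)) + (((3*beta)/(2-beta)) * (t/RTT)) */
static double w_tcp(double w_max, double beta, double t, double rtt)
{
    assert(rtt > 0.0);
    return (w_max * (1.0 - beta))
         + (((3.0 * beta) / (2.0 - beta)) * (t / rtt));
}
```

At t = 0 the second term vanishes, so w_tcp() returns the same value as 
cwnd_after_loss() applied to W_max.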

> 
> Explicit help for implementors
> ------------------------------
> 
> I'd suggest adding a new (sub)section somewhere in the document
> focusing on implementation related discussion.
> 
> Calculating the cubic root of a number in the kernel is a non trivial
> task because you need to be able to do it using fixed point math.
> 
> I've nutted out a way to do it but I think this is something algorithm
> authors should be helping people with in the I-D. For CUBIC, we could
> standardise some code fragments that do the necessary calculations and
> insert them as examples in the I-D to give implementors a starting point.
> 

Our original CUBIC code used the bisection method to calculate the cubic 
root. Later it was replaced by a Newton-Raphson method with table 
lookups for small values. This results in a more than tenfold 
performance improvement in the cubic root calculation. On average, the 
bisection method costs 1032 clocks while the improved version costs only 
79 clocks.

You can find more information at
* Hemminger, S. Cubic root benchmark code. 
http://lkml.org/lkml/2007/3/13/331
* Tarreau, W. Cubic optimization.
http://git.kernel.org/?p=linux/kernel/git/davem/net2.6.git;a=commit;h=7e58886b45bc4a309aeaa8178ef89ff767daaf7f 
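
As a starting point for implementors, a Newton-Raphson cubic root can be 
written with integer arithmetic only. The sketch below is not the code 
from the links above: it omits the table lookup for small values, and 
the function name and the bit-length initial guess are assumptions made 
for illustration. It returns the floor of the cubic root.

```c
#include <stdint.h>

/* Hypothetical helper: floor(cbrt(a)) using only integer operations. */
static uint32_t icbrt64(uint64_t a)
{
    uint64_t x, y, tmp = a;
    int bits = 0;

    if (a < 2)
        return (uint32_t)a;

    /* Count the significant bits of a, so that a < 2^bits. */
    while (tmp) {
        bits++;
        tmp >>= 1;
    }

    /* Initial guess 2^ceil(bits/3), which is >= cbrt(a). */
    x = (uint64_t)1 << ((bits + 2) / 3);

    /* Newton-Raphson step for f(x) = x^3 - a:
     * x_next = (2*x + a/x^2) / 3.
     * Starting above the root, the iterates decrease until they reach
     * floor(cbrt(a)); stop when the next value no longer decreases. */
    for (;;) {
        y = (2 * x + a / (x * x)) / 3;
        if (y >= x)
            break;
        x = y;
    }
    return (uint32_t)x;
}
```

For a fixed-point result with k fractional bits, one option is to scale 
the argument by 2^(3k) before the call, since the cubic root of 
a * 2^(3k) equals the cubic root of a times 2^k; the caller has to make 
sure the scaled argument still fits in 64 bits.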

> 
> TCP friendly region
> -------------------
> 
> The TCP friendly region (section 2.2) concept is still fuzzy for me.
> During which periods of a congestion epoch would one expect the TCP
> friendly region to actually have any effect?
> 
> I think this concept and its importance needs to be explained more
> clearly in the document.
> 

I see. As described in Section 4.1, the TCP-friendly region has an 
effect when standard TCP performs well, for example in the following two 
types of networks:
       1. networks with a small bandwidth-delay product (BDP).
       2. networks with a short RTT, but not necessarily a small BDP

> 
> Fast convergence
> ----------------
> 
> Why is the fast convergence heuristic so brutal in its attempted
> detection of new flows starting up?
> 
> Having the hard check of "is cwnd < cwnd_prev" in place seems far too
> black and white, particularly in BSD where cwnd is in bytes, so a 1 byte
> difference will trigger the more aggressive backoff.
> 
> Perhaps a "is cwnd < 0.97*prev_cwnd" or similar check would soften the
> edge just a little bit?
> 

Actually, we have tested something like "cwnd < 0.97*prev_cwnd", but 
there was no obvious difference. This is why we are still using the 
original hard check.
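
The two variants of the check can be compared side by side with a 
tolerance parameter. The sketch below is an assumption-laden 
illustration, not code from the I-D: the function name, the per-mille 
parameters, and the form of the extra reduction (W_max*(2-beta)/2 when a 
downward trend is detected) should all be checked against the draft 
before use. A tolerance of 1000 gives the hard "cwnd < prev_cwnd" check, 
and 970 gives the softened "cwnd < 0.97*prev_cwnd" check.

```c
#include <stdint.h>

/* Hypothetical helper. All values are in segments; beta and the
 * tolerance are in per-mille to avoid floating point.
 * ASSUMPTION: the extra reduction is W_max * (2 - beta) / 2. */
static uint32_t fast_convergence_w_max(uint32_t *w_last_max, uint32_t cwnd,
                                       uint32_t beta_permille,
                                       uint32_t tol_permille)
{
    uint32_t w_max = cwnd;

    /* Downward trend: cwnd at this loss is below tol * previous W_max. */
    if ((uint64_t)cwnd * 1000u < (uint64_t)(*w_last_max) * tol_permille) {
        /* Release extra bandwidth by lowering W_max further. */
        w_max = (uint32_t)(((uint64_t)cwnd * (2000u - beta_permille))
                           / 2000u);
    }

    /* Remember the unreduced window for the next congestion event. */
    *w_last_max = cwnd;
    return w_max;
}
```

With prev_cwnd = 100 and cwnd = 99, the hard check triggers the extra 
reduction while the 0.97 check does not, which is the edge case raised 
above.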

Thanks again!

Lisong

-- 
Lisong Xu, Assistant Professor
Computer Science & Engineering
University of Nebraska-Lincoln
http://cse.unl.edu/~xu