[Opensim-dev] Behaviour of adaptive throttles under high load

Mic Bowman cmickeyb at gmail.com
Mon Dec 1 16:46:13 UTC 2014


and, just to be clear...

did you have *both* adaptive and total bw throttles turned on?

the interaction between the two through the hierarchical token bucket is
another place where i was more than a little worried. i tested that with
network emulators under high load & it seemed to do what it was supposed to
do, but i wouldn't be surprised to find a timing issue.
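
To make the interaction concrete, here is a minimal Python sketch of a two-level token bucket, where a per-connection bucket can only send if the shared total bucket also has tokens. It is an illustration only, not the simulator's implementation: the `TokenBucket` class, its method names and the rates used are all assumptions made up for this example.

```python
class TokenBucket:
    """Token bucket that can be chained to a parent bucket.

    Hypothetical illustration; names and structure are not taken from
    the OpenSimulator source.
    """

    def __init__(self, rate, burst, parent=None):
        self.rate = float(rate)      # tokens (bytes) added per second
        self.burst = float(burst)    # maximum tokens the bucket can hold
        self.tokens = float(burst)   # start with a full bucket
        self.parent = parent         # optional shared (total) bucket
        self.last = 0.0              # time of the last refill, in seconds

    def _refill(self, now):
        # Add tokens for the time elapsed since the last refill.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)

    def _can_take(self, amount, now):
        # Every level of the hierarchy must be able to cover the send.
        self._refill(now)
        if self.tokens < amount:
            return False
        return self.parent is None or self.parent._can_take(amount, now)

    def _take(self, amount):
        # Deduct from this bucket and from every ancestor.
        self.tokens -= amount
        if self.parent is not None:
            self.parent._take(amount)

    def try_send(self, amount, now):
        """Return True and consume tokens if `amount` bytes may be sent."""
        if not self._can_take(amount, now):
            return False
        self._take(amount)
        return True
```

With a 1000 byte/s total bucket and two 800 byte/s children, the first child can send 800 bytes at t=0, after which the second child is refused 800 bytes but allowed 200. Tokens are only deducted once every level has agreed, and the timing of refills across the two levels is the kind of place where a subtle issue could hide.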

--mic


On Mon, Dec 1, 2014 at 8:42 AM, Mic Bowman <cmickeyb at gmail.com> wrote:

> one thing that i was concerned about when i put the throttles in place is
> the relationship between congestion control and packet sizes. if you're
> generating a large number of small, reliable packets that are being
> dropped, that could cause the congestion control to kick in more quickly.
> that would suggest an adjustment based on bytes sent rather than time
> (though both are probably appropriate).
>
> my biggest concern is that we start fixing by "stabbing in the dark".
> congestion control is particularly nasty in its interactions, which is why i
> started with a well known & long battle tested algorithm. making random
> changes might fix one problem and introduce a half dozen others.
>
> i'm not in a position to help on the diagnosis until next week if you can
> wait until then.
>
> --mic
>
>
> On Wed, Nov 26, 2014 at 4:04 PM, Justin Clark-Casey <
> jjustincc at googlemail.com> wrote:
>
>> This was actually happening at quite low loads (< 40 connections over all
>> 4 keynotes).  Once adaptive throttles were disabled and other unrelated
>> issues fixed the system had no obvious issues coping with higher loads in
>> both testing and the conference itself (e.g. the 159 peak keynote avatars
>> in the conference).  So I don't think it was a server bandwidth issue.
>>
>> That said, it was somewhat strange behaviour as it affected only maybe
>> 10-20% of connections.  Once it did affect a connection (I saw this
>> happening by logging downward adjustments which one can still do with the
>> console command "debug lludp throttles log 1"), the connection would not
>> recover - at some point a bunch of expires would reduce the throttle
>> again.  Connections seemed to be affected randomly - I experienced the
>> issue myself at one point and I have pretty solid fibre.
>>
>> You're right in that I don't know why this happened or why problematic
>> connections stayed problematic instead of slowly recovering.  Because of
>> time constraints we had to disable adaptive instead of investigating
>> further.  But I don't advocate doing this by default at all because, as you
>> say, it's an important mechanism for congestion control.
>>
>> I do plan to investigate further at some point, but it's time
>> consuming work and I'd really love to get a release out soon-ish.  So for
>> the moment I would like to tune the adaptation mechanism as you've
>> mentioned, which I believe should probably be done anyway.  Because of the
>> nature of the problem, my plan would be not to change the adaptation divisor
>> but rather to adapt downwards only every 2 seconds or so if packets are
>> expiring rather than on every packet expiry.  I believe this should still
>> achieve the adaptation effect without massively penalising the connection if
>> there has been a momentary connection issue or similar.
>>
>> On 26/11/14 02:39, Mic Bowman wrote:
>>
>>> As you mention... cutting the throttle by 50% was modeled on the TCP
>>> congestion control approach. It is very aggressive as a congestion
>>> control mechanism and certainly could be tuned.
>>>
>>> That being said... do you know why the packets were considered un-acked?
>>> If it's because the simulator is having problems (which, given your
>>> description that it happens under load, seems to be the case) then we can
>>> probably do something more intelligent about throttling overall simulator
>>> BW. That is... maybe the problem is the top end of the overall simulator
>>> bw, not the per connection throttles.
>>>
>>> Manual throttles & adaptive throttles are not exclusive. You can use
>>> both. Adaptive manages the top end, but the manual throttles set an
>>> absolute max.
>>>
>>> --mic
>>>
>>>
>>> On Tue, Nov 25, 2014 at 5:15 PM, Justin Clark-Casey <
>>> jjustincc at googlemail.com> wrote:
>>>
>>>     Hi Mic (primarily),
>>>
>>>     Two years ago [1] we had a discussion about the
>>>     enable_adaptive_throttles setting.  Just for background, this is a
>>>     setting that adapts the amount of data sent to the viewer depending
>>>     on whether reliable packets sent from the simulator are acked or
>>>     not.  As such, it looks to make sure that a viewer which sets a
>>>     downstream bandwidth higher than its network connection can cope
>>>     with is not permanently hosed with too much data.  We enabled it on
>>>     an experimental basis [2].
>>>
>>>     As you said at the time, this is modelled on the congestion approach
>>>     used in TCP.  I see that for TCP, the rate is halved on every unacked
>>>     segment.  In OpenSimulator, it's halved on every unacked reliable
>>>     packet.
>>>
>>>     However, under fairly modest load conditions in the conference grid,
>>>     I saw a behaviour where, for some connections, a sequence of packets
>>>     would sometimes expire in a very short time period (< 1 sec).  This
>>>     would halve the throttle many times, in my observations right down to
>>>     the absolute minimum.  This caused the behaviour from the user's
>>>     point of view to degrade considerably for an extended period of
>>>     time.  The throttle takes quite a long period to grow again.
>>>
>>>     I didn't get much further with the diagnostics since a lack of time
>>>     forced us to switch back to manual throttling instead (with 1 mbit
>>>     per viewer and 400 mbit total on the keynotes).  This seemed to work
>>>     okay in testing and in the event itself.  However, this leaves one
>>>     vulnerable to the problem adaptive_throttles looks to tackle in the
>>>     first place.
>>>
>>>     I'm still reading up about this stuff, but it strikes me that
>>>     halving the throttle on every missed packet is much harsher than the
>>>     TCP approach, as with UDP a whole sequence can expire at once rather
>>>     than a single segment that is subsequently retried before another
>>>     segment can be missed.
>>>
>>>     One idea is to ignore all expiries in a certain period (e.g. next 2
>>>     seconds) if an expired packet has already caused the throttle to be
>>>     halved.  Of course, this is a bit more complicated to do but
>>>     hopefully not too much so.  What do you think?  Any other ideas?
>>>
>>>     [1] http://opensimulator.org/pipermail/opensim-dev/2011-October/023017.html
>>>     [2] http://opensimulator.org/pipermail/opensim-dev/2011-October/023063.html
>>>
>>>     Best Regards,
>>>
>>>     --
>>>     Justin Clark-Casey (justincc)
>>>     OSVW Consulting
>>>     http://justincc.org
>>>     http://twitter.com/justincc
>>>     _______________________________________________
>>>     Opensim-dev mailing list
>>>     Opensim-dev at opensimulator.org
>>>     http://opensimulator.org/cgi-bin/mailman/listinfo/opensim-dev
>>>
>>>
>>
>> --
>> Justin Clark-Casey (justincc)
>> OSVW Consulting
>> http://justincc.org
>> http://twitter.com/justincc
>>
>
>
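
For reference, the proposal in the quoted thread (keep halving the throttle, but adapt downwards at most once every 2 seconds or so rather than on every expired packet) can be sketched as below. This is a minimal Python sketch under stated assumptions, not OpenSimulator code: the `AdaptiveThrottle` class, its method name and the example rates are hypothetical, and only the halving and the roughly 2 second hold-off come from the discussion.

```python
class AdaptiveThrottle:
    """Halve the send rate on packet expiry, at most once per hold-off period.

    Hypothetical illustration of the idea discussed in the thread.
    """

    def __init__(self, max_rate, min_rate, holdoff=2.0):
        self.max_rate = max_rate   # manual throttle: absolute ceiling
        self.min_rate = min_rate   # floor below which the rate is never cut
        self.rate = max_rate       # current adaptive rate
        self.holdoff = holdoff     # seconds to ignore expiries after a cut
        self.last_cut = None       # time of the most recent downward cut

    def on_packet_expired(self, now):
        """Handle one expired reliable packet; return True if the rate was cut."""
        if self.last_cut is not None and now - self.last_cut < self.holdoff:
            # A burst of expiries shortly after a cut is treated as the
            # same congestion event and does not halve the rate again.
            return False
        self.rate = max(self.min_rate, self.rate / 2)
        self.last_cut = now
        return True
```

Under this sketch, ten packets expiring within half a second cut a 1,000,000 rate once, to 500,000, instead of driving it straight to the minimum; a further expiry after the hold-off has passed halves it again.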