Yes, which is why I said to discard them when new updates occur.<br><br><div class="gmail_quote">On Mon, Mar 28, 2011 at 12:03 PM, Melanie <span dir="ltr"><<a href="mailto:melanie@t-data.com">melanie@t-data.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">For avatars yes. But prim updates can never be discarded, no matter<br>

how trivial, because they establish new persistent state.<br>

<font color="#888888"><br>

Melanie<br>

</font><div><div></div><div class="h5"><br>

Dahlia Trimble wrote:<br>

> the viewer discards small changes anyway if avatar imposters are enabled<br>

><br>

> On Mon, Mar 28, 2011 at 11:54 AM, Melanie <<a href="mailto:melanie@t-data.com">melanie@t-data.com</a>> wrote:<br>

><br>

>> No, we can't discard small changes. As the avatar comes closer, they<br>

>> would be seen out of place, e.g. someone building in the distance<br>

>> would move prims and then you come closer to look and all prims<br>

>> would be out of place.<br>

>><br>

>> Melanie<br>

>><br>

>> Dahlia Trimble wrote:<br>

>> > a couple thoughts..<br>

>> ><br>

>> > Perhaps resend timeout period could be a function of throttle setting<br>

>> and/or<br>

>> > measured packet acknowledgement time per-client? (provided we measure<br>

>> it).<br>

>> > That may prevent excessive resend processing that may not be necessary.<br>

>> ><br>

>> > On the distance prioritization, could small changes in object<br>

>> translations<br>

>> > be discarded from the prioritization queues/resend buffers for distant<br>

>> > objects when new updates occur for those objects? Small changes may not<br>

>> be<br>

>> > noticeable from the viewer perspective anyway.<br>

>> ><br>

>> ><br>

>> > On Mon, Mar 28, 2011 at 10:48 AM, Teravus Ovares <<a href="mailto:teravus@gmail.com">teravus@gmail.com</a>><br>

>> wrote:<br>

>> ><br>

>> >> Here are a few facts that I've personally discovered while working<br>

>> >> with LLClientView.<br>

>> >><br>

>> >> 1. It has been noted that people with poor connections to the<br>

>> >> simulator do consume more bandwidth, cpu, and have a generally worse<br>

>> >> experience.   This has been tested and profiled extensively.    This<br>

>> >> may seem like a small issue because what it's doing is so basic...<br>

>> >> however the frequency in which this occurs is a real cause of<br>

>> >> performance issues.<br>

>> >><br>

>> >> 2. It's also noted that the CPU used in these cases reduces the CPU<br>

>> >> available to the rest of the simulator resulting in a lower quality of<br>

>> >> service for the rest of the people on the simulator.<br>

>> >> This has been seen in the profiling and has been qualitatively<br>

>> >> observed by a large number of users connected and everything is OK and<br>

>> >> then a 'problem connection' user connecting causing a wide range of<br>

>> >> issues.<br>

>> >><br>

>> >> 3. It's also noted that lowering the outgoing UDP packet throttles<br>

>> >> beyond a certain point results in perpetual queuing and resends.<br>

>> >> This was tested by using a throttle multiplier last year that was<br>

>> >> implemented by justincc.  I'm not sure if the multiplier is still<br>

>> >> there.   It's most easily seen with image packets.   Again, I note<br>

>> >> that the packets are not rebuilt going from the regular outbound queue<br>

>> >> to the resend queue.    The resend queue is /supposed/ to be used to<br>

>> >> quickly get data that is essential to the client after attempting to<br>

>> >> send once already.   The UDP spec declares the maximum resend to be 2<br>

>> >> times, however there has been some considerable debate on whether or<br>

>> >> not OpenSimulator should follow that specific specification item<br>

>> >> leading to a configuration option to enable perpetual resends<br>

>> >> (Implemented by Melanie).  The configuration item was named similar<br>

>> >> to, 'reliable is important' or something like that.   I'm not sure if<br>

>> >> the configuration item survived the many revisions however I suspect<br>

>> >> that it did.<br>

>> >><br>

>> >> 4. It's also noted that raising the packet throttles beyond what the<br>

>> >> connection can support results in resending almost every packet the<br>

>> >> maximum number of times before the limit is reached.<br>

>> >> This is easily reproducible by setting the connection (in the client)<br>

>> >> to the maximum and connecting to a region that you've never been to<br>

>> >> before on a sub par connection.   Before the client adjusts and<br>

>> >> requests a lower throttle setting there's massive data loss and<br>

>> >> massive re-queuing.<br>

>> >><br>

>> >> 5. The client tries to adjust the throttle settings based on network<br>

>> >> conditions.   This can be observed by monitoring the packet that sets<br>

>> >> the throttles and dragging the bar to maximum.   After a certain<br>

>> >> amount of resends, the client will call the set throttle packet with<br>

>> >> reduced settings (some argue that it doesn't do that fast enough).<br>

>> >><br>

>> >> 6. A user who has connected previously to the simulator will use less<br>

>> >> resources than a user who has never connected to the simulator.  (this<br>

>> >> is mostly because of the image cache on the client).    Any client<br>

>> >> that uses CAPS images will use less resources than one that uses<br>

>> >> LLUDP.<br>

>> >><br>

>> >> When working with the packet queues, it's essential to understand<br>

>> >> those 6 observations.   Even though the place where you tend to see<br>

>> >> the issues with queuing is the image queue over LLUDP, the principles<br>

>> >> apply to all of the udp queues.<br>

>> >><br>

>> >> Regards<br>

>> >><br>

>> >> Teravus<br>

>> >><br>

>> >><br>

>> >> On Mon, Mar 28, 2011 at 1:00 PM, Mic Bowman <<a href="mailto:cmickeyb@gmail.com">cmickeyb@gmail.com</a>> wrote:<br>

>> >> > Over the last several weeks, Dan Lake & I have been looking at some of<br>

>> the<br>

>> >> > networking performance issues in opensim. As always, our concerns are<br>

>> >> with<br>

>> >> > the problems caused by very complex scenes with very large numbers of<br>

>> >> > avatars. However, I think some of the issues we have found will<br>

>> generally<br>

>> >> > improve networking with OpenSim. Since the behavior represents a<br>

>> fairly<br>

>> >> > significant change in behavior (though the number of lines of code is<br>

>> not<br>

>> >> > great), I'm going to put this into a separate branch for testing<br>

>> (called<br>

>> >> > queuetest) in the opensim git repository.<br>

>> >> > We've found several problems with the current<br>

>> >> > networking/prioritization code.<br>

>> >> > * Reprioritization is completely broken for SceneObjectParts. On<br>

>> >> > reprioritization, the current code uses the localid stored in the<br>

>> scene<br>

>> >> > Entities list but since the scene does not store the localid for SOPs,<br>

>> >> that<br>

>> >> > attempt always fails. So the original priority of the SOP continues to<br>

>> be<br>

>> >> > used. This could be the cause of some problems since the initial<br>

>> >> > prioritization assumes position 128,128. I don't understand all the<br>

>> >> possible<br>

>> >> > ramifications, but suffice it to say, using the localid is causing<br>

>> >> > problems.<br>

>> >> > Fix: The sceneentity is already stored in the update, just use that<br>

>> >> instead<br>

>> >> > of the localid.<br>

>> >> > * We currently pull (by default) 100 entity updates from the<br>

>> entityupdate<br>

>> >> > queue and convert them into packets. Once converted into packets, they<br>

>> >> are<br>

>> >> > then queued again for transmissions. This is a bad thing. Under any<br>

>> kind<br>

>> >> of<br>

>> >> > load, we've measured the time in the packet queue to be up to many<br>

>> >> > hundreds/thousands of milliseconds (and to be highly variable). When<br>

>> an<br>

>> >> > object changes one property and then doesn't change it again, the time<br>

>> in<br>

>> >> > the packet queue is largely irrelevant. However, if the object is<br>

>> >> > continuously changing (an avatar changing position, a physical object<br>

>> >> > moving, etc) then the conversion from a entity update to a packet<br>

>> >> "freezes"<br>

>> >> > the properties to be sent. If the object is continuously changing,<br>

>> then<br>

>> >> with<br>

>> >> > fairly high probability, the packet contains old data (the properties<br>

>> of<br>

>> >> the<br>

>> >> > entity from the point at which it was converted into a packet).<br>

>> >> > The real problem is that, in theory, to improve the efficiency of the<br>

>> >> > packets (fill up each message) we are grabbing big chunks of updates.<br>

>> >> Under<br>

>> >> > load, that causes queuing at the packet layer which makes updates<br>

>> stale.<br>

>> >> > That is... queuing at the packet layer is BAD.<br>

>> >> > Fix: We implemented an adaptive algorithm for the number of updates to<br>

>> >> grab<br>

>> >> > with each pass. We set a target time of 200ms for each iteration. That<br>

>> >> > means, we are trying to bound the maximum age of any update in the<br>

>> packet<br>

>> >> > queue to 200ms. The adaptive algorithm looks a lot like a TCP slow<br>

>> start:<br>

>> >> > every time we complete an iteration (flush the packet queue) in less<br>

>> than<br>

>> >> > 200ms we increase linearly the number of updates we take in the next<br>

>> >> > iteration (add 5 to the count) and when we don't make it back in<br>

>> 200ms,<br>

>> >> we<br>

>> >> > drop the number we take multiplicatively (cut the number in half). In our<br>

>> >> > experiments with large numbers of moving avatars, this algorithm works<br>

>> >> > *very* well. The number of updates taken per iteration stabilizes very<br>

>> >> > quickly and the response time is dramatically improved (no "snap back"<br>

>> on<br>

>> >> > avatars, for example). One difference from the traditional slow<br>

>> start...<br>

>> >> > since the number of "static" items in the queue is very high when a<br>

>> >> client<br>

>> >> > first enters a region, we start with the number of updates taken at<br>

>> 500.<br>

>> >> > that gets the static items out of the queue quickly (and delay doesn't<br>

>> >> > matter as much) and the number taken is generally stable before the<br>

>> >> > login/teleport screen even goes away.<br>

>> >> > * The current prioritization queue can lead to update starvation. The<br>

>> >> > prioritization algorithm dumps all entity updates into a single<br>

>> ordered<br>

>> >> > queue. Let's say you have several hundred avatars moving around in a<br>

>> >> scene.<br>

>> >> > Since we take a limited number of updates from the queue in each<br>

>> >> iteration,<br>

>> >> > we will take only the updates for the "closest" (highest priority)<br>

>> >> avatars.<br>

>> >> > However, since those avatars continue to move, they are re-inserted<br>

>> into<br>

>> >> the<br>

>> >> > priority queue *ahead* of the updates that were already there. So...<br>

>> >> unless<br>

>> >> > the queue can be completely emptied each iteration or the priority of<br>

>> the<br>

>> >> > "distant" (low priority) avatars changes, those avatars will never be<br>

>> >> > updated.<br>

>> >> > Fix: We converted the single priority queue into multiple priority<br>

>> queues<br>

>> >> > and use fair queuing to retrieve updates from each. Here's how it<br>

>> works<br>

>> >> > (more or less)... the current metrics (all of the current<br>

>> prioritization<br>

>> >> > algorithms use distance at some point for prioritization) compute a<br>

>> >> distance<br>

>> >> > from the avatar/camera to an object. We take the log of that distance<br>

>> and<br>

>> >> > use that as the index for the queue where we place the update. So<br>

>> close<br>

>> >> > things go into the highest priority queue and distant things go into<br>

>> the<br>

>> >> > lowest priority queue. Since the area covered by a priority queue<br>

>> grows<br>

>> >> as<br>

>> >> > the square of the radius, the distant (lowest priority queues) will<br>

>> have<br>

>> >> the<br>

>> >> > most objects while the highest priority queues will have a small<br>

>> number<br>

>> >> of<br>

>> >> > objects. Inside each priority queue, we order the updates by the time<br>

>> in<br>

>> >> > which they entered the queue. Then we pull a fixed number of updates<br>

>> from<br>

>> >> > each priority queue each iteration. The result is that local updates<br>

>> get<br>

>> >> a<br>

>> >> > high fraction of the outgoing bandwidth but distant updates are<br>

>> >> guaranteed<br>

>> >> > to get at least "some" of the bandwidth. No starvation. The current<br>

>> >> > prioritization algorithm we implemented is a modification of the "best<br>

>> >> > avatar responsiveness" and "front back" in that we use root prim<br>

>> location<br>

>> >> > for child prims and the priority of updates "in back" of the avatar is<br>

>> >> lower<br>

>> >> > than updates "in front". Our experiments show that the fair queuing<br>

>> does<br>

>> >> > drain the update queue AND continues to provide a disproportionately<br>

>> high<br>

>> >> > percentage of the bw to "close" updates.<br>

>> >> > One other note on this... we should be able to improve the performance<br>

>> of<br>

>> >> > reprioritization with this approach. If we know the distance an avatar<br>

>> >> has<br>

>> >> > moved, we only have to reprioritize objects that might have changed<br>

>> >> priority<br>

>> >> > queues. Haven't implemented this yet but have some ideas for how to do<br>

>> >> it.<br>

>> >> > * The resend queue is evil. When an update packet is sent (they are<br>

>> >> marked<br>

>> >> > reliable) it is moved to a queue to await acknowledgement. If no<br>

>> >> > acknowledgement is received (in time), the packet is retransmitted and<br>

>> >> the<br>

>> >> > wait time is doubled and so on... What that means is that resent<br>

>> >> packets<br>

>> >> > in a scene that is rapidly changing will often contain updates that<br>

>> are<br>

>> >> > outdated. That is, when we resend the packet, we are just resending<br>

>> old<br>

>> >> data<br>

>> >> > (and if you're having a lot of resends that means you already have a<br>

>> bad<br>

>> >> > connection & now you're filling it up with useless data).<br>

>> >> > Fix: this isn't implemented yet (help would be appreciated)... we<br>

>> think<br>

>> >> that<br>

>> >> > instead of saving packets for resend... a better solution would be to<br>

>> >> keep<br>

>> >> > the entity updates that went into the packet. if we don't receive an<br>

>> ack<br>

>> >> in<br>

>> >> > time, then put the entity updates back into the entity update queue<br>

>> (with<br>

>> >> > entry time from their original enqueuing). That would ensure that we<br>

>> send<br>

>> >> an<br>

>> >> > update for the object & that the data sent is the most recent.<br>

>> >> > * One final note... per client bandwidth throttles seem to work very<br>

>> >> well.<br>

>> >> > however, our experiments with per-simulator throttles were not<br>

>> positive.<br>

>> >> it<br>

>> >> > appeared that a small number of clients was consuming all of the bw<br>

>> >> > available to the simulator and the rest were starved. Haven't looked<br>

>> into<br>

>> >> > this any more.<br>

>> >> ><br>

>> >> > So...<br>

>> >> > Feedback appreciated... there is some logging code (disabled) in the<br>

>> >> branch;<br>

>> >> > real data would be great. And help testing. there are a number of<br>

>> >> > attachments, deletes and so on that I'm not sure work correctly.<br>

>> >> > --mic<br>

>> >> ><br>

>> >> ><br>

>> >> ><br>

>> >> ><br>

>> >> ><br>

>> >> > _______________________________________________<br>

>> >> > Opensim-dev mailing list<br>

>> >> > <a href="mailto:Opensim-dev@lists.berlios.de">Opensim-dev@lists.berlios.de</a><br>

>> >> > <a href="https://lists.berlios.de/mailman/listinfo/opensim-dev" target="_blank">https://lists.berlios.de/mailman/listinfo/opensim-dev</a><br>

>> >> ><br>

>> >> ><br>


>> >><br>

>> ><br>

>> ><br>

>> > ------------------------------------------------------------------------<br>

>> ><br>


>><br>

><br>

><br>

</div></div>> ------------------------------------------------------------------------<br>

<div><div></div><div class="h5">><br>


</div></div></blockquote></div><br>
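To make "discard them when new updates occur" concrete, here is a rough sketch of what I mean. It is Python for brevity and is not OpenSim code; the names EntityUpdate, CoalescingUpdateQueue, enqueue and dequeue are made up for illustration. The assumption is that each pending update is keyed by object id, so a newer update for the same object replaces the older one that has not been sent yet. The latest state is always kept and eventually sent, so no persistent prim state is lost; only the superseded intermediate values are dropped.<br>

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EntityUpdate:
    """Hypothetical stand-in for a pending update of one scene object."""
    object_id: int
    position: Tuple[float, float, float]


class CoalescingUpdateQueue:
    """Keeps at most one pending update per object (illustrative only).

    A newer update for an object overwrites the unsent older one, so the
    queue never holds stale intermediate state for that object.
    """

    def __init__(self) -> None:
        # OrderedDict remembers first-insertion order. Assigning to an
        # existing key replaces the value but keeps the original position,
        # so an object does not lose its place in line when it changes again.
        self._pending: "OrderedDict[int, EntityUpdate]" = OrderedDict()

    def enqueue(self, update: EntityUpdate) -> None:
        self._pending[update.object_id] = update

    def dequeue(self, count: int) -> List[EntityUpdate]:
        """Remove and return up to `count` updates, oldest entry first."""
        batch: List[EntityUpdate] = []
        while self._pending and len(batch) < count:
            _, update = self._pending.popitem(last=False)
            batch.append(update)
        return batch

    def __len__(self) -> int:
        return len(self._pending)


if __name__ == "__main__":
    queue = CoalescingUpdateQueue()
    queue.enqueue(EntityUpdate(1, (10.0, 10.0, 20.0)))
    queue.enqueue(EntityUpdate(2, (50.0, 60.0, 20.0)))
    # Object 1 moves again before anything was sent: the old value is dropped.
    queue.enqueue(EntityUpdate(1, (10.1, 10.0, 20.0)))
    for update in queue.dequeue(10):
        print(update.object_id, update.position)
```

Because the replacement keeps the original queue position, this also fits the suggestion further down the thread of re-queuing entity updates with their original entry time rather than resending old packets.<br>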