<div dir="ltr"><div><div>I agree that the present use of threads is far from optimal, but I believe it is more a matter of a convenience that was overused, partly because the codebase originated from many contributors with diverse goals and a naive, but evolving, initial design. This is probably unavoidable in open source programs of this size.<br>

<br></div>However, threadpool threads are not necessarily as inefficient as some of the messages in this thread might imply. They end up using very few operating system threads, share these among many tasks, and switch between tasks in user space rather than requiring kernel traps. This reduces the normal costs associated with using threads and can provide an efficient means of multitasking within a process if used correctly.<br>
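As a rough illustration of that sharing, here is a minimal sketch in Python using its standard-library pool. It is not the .NET ThreadPool or the SmartThreadPool that OpenSimulator uses, and the handler, the task count and the pool size are made up for the example:<br>

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def handle(task_id):
    # Stand-in for a short packet handler; reports which worker thread ran it.
    time.sleep(0.001)
    return task_id, threading.current_thread().name

# 200 tasks are submitted, but the pool never creates more than 4 threads.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="pool") as pool:
    results = list(pool.map(handle, range(200)))

workers_used = {name for _, name in results}
print(len(results), "tasks ran on", len(workers_used), "threads")
```

Each task in this sketch runs to completion on whichever worker picks it up, so the saving comes from reusing threads rather than from creating one per task.<br>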

<br></div>A discussion of recommended ways of performing asynchronous IO with a threadpool can be found here: <a href="http://msdn.microsoft.com/en-us/library/ms973903.aspx#threadpool_topic6">http://msdn.microsoft.com/en-us/library/ms973903.aspx#threadpool_topic6</a><br>

</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Apr 28, 2014 at 4:49 PM, Matt Lehmann <span dir="ltr"><<a href="mailto:mattlehma@gmail.com" target="_blank">mattlehma@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr">Thanks so much for the explanation Justin. </p>

<p dir="ltr">I agree that networking issues like this are hard to measure. </p>

<p dir="ltr">I guess my concern is that, after reading a lot of the opensim code, there seems to be a prevalent notion that concurrency will solve issues with responsiveness. It's treated like a magic box where you can 'fireandforget.' </p>


<p dir="ltr">If a computation is going to lag your system, then it is going to lag your system whether you do it now or later. What's more, all the spawning of threads will definitely cause its own problems. </p>


<p dir="ltr">I do agree though that a database call might be made in its own thread,  so that the cpu can do other things while the disk is spinning. In this case it might be prudent to implement a more complicated scheme involving cached database management. </p>


<p dir="ltr">Anyways,  thanks so much for the work you all do on opensim.   It's a great system. </p><div class="HOEnZb"><div class="h5">

<div class="gmail_quote">On Apr 28, 2014 4:01 PM, "Justin Clark-Casey" <<a href="mailto:jjustincc@googlemail.com" target="_blank">jjustincc@googlemail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


That's correct.  I added some high level summary to [1].<br>

<br>

My expectation is that in many situations, the async packets are indeed handled pretty quickly with very often only a single thread from the pool used.  I've seen this when analyzing threadpool data from the crude stats recording mechanism [2].<br>


<br>

However, blocking IO can indeed be a problem, often due to heavy load on services.  For instance, it used to be the case that some asset fetches were performed synchronously, so if the asset service was slow, processing triggered by inbound packet handling would be held up.<br>


<br>

If one were running the two tier system that you described, then this would hold up processing of all second tier packets more than the current system.  Perhaps one could forensically identify which handlers could be subject to this problem and handle those differently from others.<br>


<br>

It's a complex topic as OpenSimulator is very much an evolved (and evolving) codebase, where delays can occur in unexpected places and there's a huge variance in network and hardware conditions.  In this case, I can imagine that our async handling is more resilient in cases where only a few requests are slow.<br>


<br>

I believe the key is trying to measure the performance change of a second-tier loop to see if the potential gains are worth the potential problems.<br>

<br>

[1] <a href="http://opensimulator.org/wiki/LLUDP_ClientStack#Inbound_UDP" target="_blank">http://opensimulator.org/wiki/LLUDP_ClientStack#Inbound_UDP</a><br>

[2] <a href="http://opensimulator.org/wiki/Show_stats#stats_record" target="_blank">http://opensimulator.org/wiki/Show_stats#stats_record</a><br>

<br>

On 26/04/14 13:53, Matt Lehmann wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I looked at this a bit more this morning.<br>

<br>

So, as I understand, the handling looks like this--><br>

<br>

--async read cycle which drops packet into a blocking queue<br>

--smart thread which services the blocking queue, and calls the LLClientView method ProcessInPacket<br>

<br>

LLClientView sorts the packets according to whether the handler should be called asynchronously or not.<br>

<br>

If async is needed, LLClientView will create a smart thread for the handler, and start the thread.<br>

...the handlers basically signal the events defined in LLClientView which are listened to by one or more other callbacks.<br>

If async is not needed/desired, then LLClientView will process the packet directly.<br>

<br>

So there is one additional thread being created for each async handler, with the original smart thread running all the<br>

non-async packet handlers.<br>

<br>

The question is /can these async threads be replaced by a second smart thread, which services a queue of async<br>

handlers/?  Do the handlers require some sort of blocking I/O?  Can we rearrange the handlers to operate under these<br>

conditions?<br>

<br>

If the answer is yes, then a great deal of compute cycles can be saved by consolidating all the spawned threads into one<br>

single thread loop.<br>
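To make the idea concrete, here is a minimal sketch of such a two-tier arrangement in Python. It is not OpenSimulator code: the queue names, the `handle` function and the packet type names `SlowA` and `SlowB` are hypothetical, and the sync/async classification is invented for the example:<br>

```python
import queue
import threading

inbound = queue.Queue()      # filled by the socket read cycle
second_tier = queue.Queue()  # handlers that would otherwise get their own thread
handled = []                 # (tier, packet) pairs, kept only for inspection

ASYNC_TYPES = {"SlowA", "SlowB"}  # hypothetical "async" packet types

def handle(tier, packet):
    handled.append((tier, packet))

def first_tier_loop():
    # Services the inbound queue: sync packets are processed directly,
    # async ones are deferred to the second queue instead of a new thread.
    while True:
        packet = inbound.get()
        if packet is None:
            second_tier.put(None)  # pass the shutdown signal along
            return
        if packet in ASYNC_TYPES:
            second_tier.put(packet)
        else:
            handle("first", packet)

def second_tier_loop():
    # One long-lived thread runs every deferred handler in arrival order.
    while True:
        packet = second_tier.get()
        if packet is None:
            return
        handle("second", packet)

threads = [threading.Thread(target=first_tier_loop),
           threading.Thread(target=second_tier_loop)]
for t in threads:
    t.start()
for p in ["AgentUpdate", "SlowA", "PacketAck", "SlowB", None]:
    inbound.put(p)
for t in threads:
    t.join()
print(handled)
```

In this sketch a handler that blocks inside `second_tier_loop` would delay every packet queued behind it, which is the trade-off against one thread per handler.<br>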

<br>

Matt<br>

<br>

<br>

<br>

<br>

On Fri, Apr 25, 2014 at 10:03 PM, Matt Lehmann <<a href="mailto:mattlehma@gmail.com" target="_blank">mattlehma@gmail.com</a> <mailto:<a href="mailto:mattlehma@gmail.com" target="_blank">mattlehma@gmail.com</a>>> wrote:<br>


<br>

    Yes I agree that the udp service is critical and would need extensive testing.<br>

<br>

    I wouldn't expect you all to make any changes.<br>

<br>

    Still it's an interesting topic.  The networking world seems to be moving towards smaller virtualized servers with<br>

    less resources, so I think it's an important discussion.  At my work we are deploying an opensim cluster which is<br>

    why I have become so interested.<br>

<br>

<br>

    Thanks<br>

<br>

    Matt<br>

<br>

<br>

    On Friday, April 25, 2014, Diva Canto <<a href="mailto:diva@metaverseink.com" target="_blank">diva@metaverseink.com</a> <mailto:<a href="mailto:diva@metaverseink.com" target="_blank">diva@metaverseink.com</a>><u></u>> wrote:<br>


<br>

        That is one very specific and unique case, something that happens in the beginning, and that is necessary,<br>

        otherwise clients crash. It's an "exception" wrt the bulk of processing UDP packets. The bulk of them are<br>

        processed as you described in your first message: placed in a queue, consumed by a consumer thread which either<br>

        processes them directly or spawns threads for processing them.<br>

<br>

        In general, my experience is also that limiting the amount of concurrency is a Good Thing. A couple of years ago<br>

        we had way too much concurrency; we've been taming that down.<br>

<br>

        As Dahlia said, the packet handling layer of OpenSim is really critical, and the viewers are sensitive to it, so<br>

        any drastic changes to it need to go through extensive testing. The current async reading is not bad, as it<br>

        empties the socket queue almost immediately. The threads that are spawned from the consumer thread, though, could<br>

        use some rethinking.<br>

<br>

        On 4/25/2014 9:29 PM, Matt Lehmann wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

        One example of what I'm trying to say.<br>

<br>

        In part of the packet handling there is a condition where the server needs to respond to the client, but does<br>

        not yet know the identity of the client. So the server responds to the client and then spawns a thread which<br>

        loops and sleeps until it can identify the client. (I don't really understand what's going on here.)<br>

<br>

        Nevertheless in this case you could do without the new thread if you queued a lambda function which would<br>

        check to see if the client can be identified.  A second event loop could periodically poll this function until<br>

        it completes.<br>
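        A minimal sketch of that polling loop in Python follows. The names `run_poll_loop` and `client_identified` are hypothetical, and the "identified on the third poll" behaviour is only a stand-in for whatever the real check would be:<br>

```python
import collections
import time

def run_poll_loop(pending, interval=0.001, max_rounds=100):
    # pending: deque of zero-argument callables. Each returns True once its
    # work is complete, or False to be polled again on a later round.
    rounds = 0
    while pending and rounds != max_rounds:
        for _ in range(len(pending)):
            check = pending.popleft()
            if not check():
                pending.append(check)  # not ready yet; retry next round
        rounds += 1
        if pending:
            time.sleep(interval)  # yield between rounds instead of spinning
    return rounds

attempts = {"n": 0}

def client_identified():
    # Stand-in check: pretend the client becomes known on the third poll.
    attempts["n"] += 1
    return attempts["n"] == 3

pending = collections.deque([client_identified])
rounds = run_poll_loop(pending)
print("completed after", rounds, "rounds")
```

        The queued check replaces the thread that loops and sleeps; the cost is that every queued check now shares one polling interval.<br>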

<br>

        You could also queue other contexts which would complete the handling of other types of packets.<br>

<br>

        Matt<br>

<br>

        On Friday, April 25, 2014, Dahlia Trimble <<a href="mailto:dahliatrimble@gmail.com" target="_blank">dahliatrimble@gmail.com</a>> wrote:<br>

<br>

            From my experience there are some things that need to happen as soon as possible and others which can be<br>

            delayed. What needs to happen ASAP:<br>

            1) reading the socket and keeping it emptied.<br>

            2) acknowledging any received packets which may require it<br>

            3) processing any acknowledgements sent by the viewer<br>

            4) handling AgentUpdate packets. (these can probably be filtered for uniqueness and mostly discarded if not<br>

            unique).<br>

<br>

            This list is off the top of my head and may not be complete. Most, if not all, other packets could be put<br>

            into queues and processed as resources permit without negatively affecting the quality of the shared state<br>

            of the simulation.<br>
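            The uniqueness filter for AgentUpdate could look something like the following minimal Python sketch. The agent ids and the tuple payload are hypothetical stand-ins; a real AgentUpdate carries more fields than this:<br>

```python
last_update = {}  # agent id -> last AgentUpdate payload seen for that agent

def should_process(agent_id, payload):
    # Returns False when the payload repeats the previous one for this agent,
    # so the caller can discard it before any further dispatch.
    if last_update.get(agent_id) == payload:
        return False
    last_update[agent_id] = payload
    return True

packets = [
    ("agent-1", (128.0, 128.0, 25.0)),
    ("agent-1", (128.0, 128.0, 25.0)),  # unchanged: discarded
    ("agent-2", (10.0, 20.0, 25.0)),
    ("agent-1", (129.0, 128.0, 25.0)),  # changed: kept
]
kept = [p for p in packets if should_process(*p)]
print(len(kept), "of", len(packets), "updates kept")
```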

<br>

            Please be aware that viewers running on high-end machines can constantly send several hundred packets per<br>

            second, and that under extreme conditions there can be several hundred viewers connected to a single<br>

            simulator.  Any improvements in the UDP processing portions of the code base should probably take these<br>

            constraints into consideration.<br>

<br>

<br>

            On Fri, Apr 25, 2014 at 8:17 PM, Matt Lehmann <<a href="mailto:mattlehma@gmail.com" target="_blank">mattlehma@gmail.com</a>> wrote:<br>

<br>

                That makes sense to me.<br>

<br>

                If I recall, the packet handlers will create more threads if they expect delays, such as when waiting<br>

                for a client to finish movement into the sim.<br>

<br>

                Considering that I have 65 threads running on my standalone instance with 4 cores, that leaves roughly<br>

                16 threads competing for each core.  You have to do the work at some point.<br>

<br>

                Matt<br>

<br>

                On Friday, April 25, 2014, Dahlia Trimble <<a href="mailto:dahliatrimble@gmail.com" target="_blank">dahliatrimble@gmail.com</a>> wrote:<br>

<br>

                    Depends on what you mean by "services the packets". Decoding and ACKing could probably work well<br>

                    in a socket read loop but dispatching the packet to the proper part of the simulation could incur<br>

                    many delays which can cause a lot of packet loss in the lower level operating system routines as<br>

                    the buffers are only so large and any excessive data is discarded. Putting them in a queue<br>

</blockquote>

<br>

<br>

<br>

<br>

_______________________________________________<br>

Opensim-dev mailing list<br>

<a href="mailto:Opensim-dev@opensimulator.org" target="_blank">Opensim-dev@opensimulator.org</a><br>

<a href="http://opensimulator.org/cgi-bin/mailman/listinfo/opensim-dev" target="_blank">http://opensimulator.org/cgi-bin/mailman/listinfo/opensim-dev</a><br>

<br>

</blockquote>

<br>

<br>

-- <br>

Justin Clark-Casey (justincc)<br>

OSVW Consulting<br>

<a href="http://justincc.org" target="_blank">http://justincc.org</a><br>

<a href="http://twitter.com/justincc" target="_blank">http://twitter.com/justincc</a><br>


</blockquote></div>

</div></div><br>

<br></blockquote></div><br></div>