[Opensim-dev] Lockless Lists?

Wed Dec 3 16:52:25 UTC 2008

> -----Original Message-----
> From: opensim-dev-bounces at lists.berlios.de
> [mailto:opensim-dev-bounces at lists.berlios.de] On Behalf Of Mariusz Nowostawski
> Sent: Wednesday, December 03, 2008 2:14 AM
> To: opensim-dev at lists.berlios.de
> Subject: Re: [Opensim-dev] Lockless Lists?
> 
> Dr Scofield wrote:
>> Christopher Yeoh wrote:
>> 
>>> On Mon, 24 Nov 2008 20:43:09 +0100
>>> "Homer Horwitz" <homerhorwitz at googlemail.com> wrote:
>>> 
>>>> So, I'm not sure if we really should do that move. If at all, I'm for
>>>> a very slow move to lock-free versions from a rather stable software
>>>> base (which we currently don't have in trunk), so errors that are
>>>> introduced during that move are more easily identifiable, with much
>>>> testing in-between. Even then, I'm absolutely sure we will get a lot
>>>> of Heisenbugs in the process, which will take us weeks to find.
>>>> 
>>>> 
>>> Perhaps what is really needed here is some performance benchmarks
>>> which highlight existing problems? So for individual changes to
>>> lockless versions we better see what improvements we'd get on both
>>> small and large SMP machines and whether it's worth the increase in
>>> complexity.
>>> 
>>> I did some debugging of a deadlock a couple of weeks ago and found it
>>> already pretty complicated. Any suggestions on how other people
>>> approach these problems with OpenSim? I sprinkled lots of console
>>> messages around as mdb doesn't seem to work for me, but in retrospect
>>> it would have been really handy to have been able to just turn on a
>>> lock debugging flag and have debug output when locks are taken and
>>> released.
>>> 
>>  in theory, you can trace specific instructions with mono...in theory,
>> because i haven't succeeded yet in getting this really done without
>> being drowned in console messages (sean might have more experience with
>> this).
>> 
>> i agree console messages are not really the cat's whiskers and the
>> additional problem with console messages is that it will change timing
>> and might get you nowhere..
> 
> 
> Hi,
> 
> I've been following this discussion with interest.
> 
> Would it be possible for someone with a good understanding of opensim
> architecture to provide a schematic description of where the threads are
> being created, for what purpose and how many inter-thread dependencies
> there are, please? In particular, what are the shared datastructures
> and the like. Digging it all from the code is possible but if someone has
> it in her/his head it would speed up the analysis.

OpenSim.Region.Environment.dll has something like 350 lock statements by itself, and there are heavy cross-dependencies between many modules. A schematic of the entire threading model might actually be less readable than digging through the source code.

> 
> We've done some testing of different software packages on multicore
> systems, and generally for the best performance one needs to plan the
> threading model, thread pooling and thread syncing explicitly, without
> leaving much to the runtime and operating systems. What I mean here is
> that multi-threaded applications with "normal" threads and mutexes can
> achieve better performance than single-threaded ones in many cases -
> however, applications that plan and work with custom threading usually
> outperform the latter. We are investigating some areas of improving
> the code, and here is a list of some "hints" that may trigger some
> further discussions.
> 
> - mutexes should be removed and lockless datastructures used, but,
> only where it makes sense. In some cases having a lock is unavoidable.
> The suggested replacement of normal blocking locks with spin-locks will not
> help with performance at all - to the contrary - on a 2-core machine one
> core will be busy spin-locking waiting for the lock to become
> available instead of doing something useful in the meantime
> 
> - many threads usually decrease performance. Having more than 2-4
> threads on a 2-core machine will decrease overall performance. Given
> that most opensim deployments are on 2- or 4-core chips, having 10s or
> more threads is not helping in terms of performance. Rather to the contrary.
> 
> - creation and destruction of threads is quite costly. Having thread
> pools instead would help.

Creation and destruction of System.Threading.Thread objects is very costly, but on the other hand IOCP threads are very fast and lightweight. .NET 3.5 SP1 improves on these even further, and some IOCP threads are run as microthreads at the CLR level. That's why there is no 1:1 mapping of .NET threads to system threads. .NET is highly optimized for the asynchronous model; using Begin* and End* calls wherever possible and spending minimal time in asynchronous callbacks (offloading heavy or long-running tasks to System.Threading.ThreadPool worker threads) will go a long way. Of course, this is all moot if you deadlock somewhere or bottleneck on a slow lock.
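
To make the "short callback, heavy work on a pooled worker" split concrete, here is a minimal sketch. It is written in Python only to keep it short and runnable; asyncio and ThreadPoolExecutor stand in for the .NET Begin*/End* calls and System.Threading.ThreadPool, and the names expensive_work, handle_packet, process_all and run_demo are made up for illustration. This is not OpenSim code.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor


def expensive_work(payload: bytes) -> str:
    # Hypothetical stand-in for a heavy or long-running task.
    return hashlib.sha256(payload).hexdigest()


async def handle_packet(pool: ThreadPoolExecutor, payload: bytes) -> str:
    # The handler stays short: it hands the payload to a pooled worker
    # thread and yields to the event loop until the result is ready.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, expensive_work, payload)


async def process_all(payloads):
    # One small, reused pool instead of a new thread per packet.
    with ThreadPoolExecutor(max_workers=4) as pool:
        tasks = [handle_packet(pool, p) for p in payloads]
        return await asyncio.gather(*tasks)


def run_demo(payloads):
    # Results come back in the same order as the input payloads.
    return asyncio.run(process_all(payloads))
```

The pool size of 4 is an arbitrary assumption; the point is only that threads are reused and the handler itself never blocks on the heavy work.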

> 
> - scheduling events and packet processing subject to priorities would
> definitely help, and I'm quite surprised that this has not been
> investigated more thoroughly. Servers need to deal with an increasing
> number of events and network traffic. Having a well-designed priority
> mechanism - not on the thread level, but rather on the event type and
> packet type level would help in managing responsiveness and general
> perceived performance of the server.
> 
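
A rough sketch of what prioritizing by event or packet type (rather than by thread) could look like. It is in Python for brevity, and the event type names and their priorities are invented for the example, not taken from OpenSim. As written it is not thread-safe; a real server would need to guard the queue or use a concurrent one.

```python
import heapq
import itertools

# Hypothetical priorities per event/packet type; lower number runs first.
PRIORITY = {"agent_update": 0, "chat": 1, "texture_request": 2}


class EventScheduler:
    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker, so events of equal
        # priority are handled in arrival (FIFO) order.
        self._seq = itertools.count()

    def submit(self, event_type, payload):
        # Unknown event types are placed behind all known ones.
        prio = PRIORITY.get(event_type, max(PRIORITY.values()) + 1)
        heapq.heappush(self._heap, (prio, next(self._seq), event_type, payload))

    def drain(self):
        # Yield the most urgent events first.
        while self._heap:
            _, _, event_type, payload = heapq.heappop(self._heap)
            yield event_type, payload
```

A single dispatcher loop (or a small worker pool) would pull from drain(), so responsiveness is decided by event type instead of by which thread happened to get scheduled.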
> 
> Note - I'm not as familiar with certain opensim internals as many of
> you are and I would appreciate some guidance.
> 
> best regards
> Mariusz