[Opensim-dev] Issues with the Simulator under high load
Akira Sonoda
akira.sonoda.1 at gmail.com
Tue Apr 24 10:36:06 UTC 2012
:-) Just to be clear, I am talking about git hash 007711 (opensim 0.7.4,
recommended in the OSgrid Twitter feed on the first of March 2012). I've seen
that a newer recommended version just came out, and I'll give that one a try.
Thank you for supporting my thought about a deadlock! I'll have a look at
the 700 MB of data I now have and try to go deeper into the analysis. I
expect the YourKit profiler to be helpful in this regard.
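
To illustrate what looking for a deadlock in a thread dump amounts to, here is a minimal Python sketch. It is not OpenSimulator code and not output from any profiler: the thread names, the lock names and the two dictionaries are hypothetical, and it assumes the dump has already been reduced by hand to "which lock is each thread waiting for" and "which thread holds each lock". A deadlock then shows up as a cycle in that wait-for relation.

```python
from typing import Dict, List, Optional


def find_deadlock_cycle(
    waiting_for: Dict[str, str],
    held_by: Dict[str, str],
) -> Optional[List[str]]:
    """Return the threads forming a wait cycle, or None if there is none.

    waiting_for maps a blocked thread to the lock it is waiting on.
    held_by maps a lock to the thread that currently owns it.
    Both mappings are assumed to be extracted from a thread dump.
    """
    for start in waiting_for:
        path: List[str] = []
        seen = set()
        thread: Optional[str] = start
        # Follow thread -> lock -> owning thread until the chain ends
        # or we come back to a thread we have already visited.
        while thread is not None and thread not in seen:
            seen.add(thread)
            path.append(thread)
            lock = waiting_for.get(thread)
            thread = held_by.get(lock) if lock is not None else None
        if thread is not None:
            # Revisited a thread: the cycle starts at its first occurrence.
            return path[path.index(thread):]
    return None


# Hypothetical data: two threads wait on each other's lock,
# a third thread is merely blocked behind them.
waiting_for = {"http-1": "lockA", "http-2": "lockB", "udp-1": "lockA"}
held_by = {"lockA": "http-2", "lockB": "http-1"}
print(find_deadlock_cycle(waiting_for, held_by))
```

Threads that are only queued behind the cycle (like the hypothetical "udp-1" here) are not reported, which helps separate the actual culprits from the many threads that simply pile up in a wait call.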
Regards,
Akira
On 24 April 2012 at 02:45, Justin Clark-Casey <jjustincc at googlemail.com> wrote:
> Those kinds of traces (where lots of threads are waiting for
> WaitHandle.WaitOne() or similar) are usually indicative of deadlock
> somewhere in the system.
>
> When this happens, I find that the best thing to do is take a VM thread
> dump and inspect the code to find out where the deadlock is happening. I
> don't know how to do this on Windows but I'm sure it's possible. Windows
> may well even provide some nice tools to identify where the deadlock is,
> rather than eyeballing the stacks.
>
> But to be honest, if that's opensim 0.7.1.1 then the code has changed a
> lot since then. There's a very good chance any problems you find will not
> apply to OpenSim 0.7.3 or even 0.7.2. I suggest you upgrade before doing
> any other analysis. Even OpenSim 0.7.3.1 will quickly become out of date -
> if you really want to winkle out bugs then mainline is often the place to
> be (though you could also try the opensim-0.7.3-extended git branch, which
> is 0.7.3.1 + selected things from git master - should be stable but not
> absolutely guaranteed).
>
> Just to be clear, I'm happy to eyeball problems on recent releases or
> master but I lack the time to do thorough analysis. However, I'm happy to
> add more stats/diagnostic commands to OpenSimulator if useful stuff can be
> identified and it's not super-difficult to do.
>
> Best,
>
> Justin
>
>
>
> On 22/04/12 23:43, Akira Sonoda wrote:
>
>> Hi Justin,
>>
>> I am still investigating the slow GetTexture problem and the resulting
>> instability under high load.
>>
>> What I did so far:
>>
>> 1. I'm still using opensim-0007711 (I didn't have the time to upgrade;
>>    the first upgrade attempt went badly due to an error on my side, and
>>    since then I have not looked into it much).
>> 2. I've created a Windows instance in the Amazon cloud in order to be
>>    able to connect some profiling tools.
>> 3. I've run the last two Friday parties from there. The first party was
>>    quite okay (MaxPoolThreads=90 in the SmartThreadPool settings, but I
>>    saw more threads at peaks, which is strange).
>> 4. The second party, on 20 April, went crazy after 3 hours. Here's a
>>    picture:
>>
>>
>> http://farm9.staticflickr.com/8017/6957771466_4412ee83c4_b_d.jpg
>>
>> Most of the threads had a stack trace like this:
>>
>> http://farm9.staticflickr.com/8155/6957875232_0203631ed0_b_d.jpg
>>
>> I am wondering why this increase started after approximately 3 hours. We
>> had at most about 18 avatars on the region/sim, using various different
>> viewers. Because I did not attach this Amazon cloud instance to my Splunk
>> server, I have no statistics about the viewers during the party. I
>> should probably do that in future.
>>
>> I will upgrade to a more recent version next week ...
>>
>> Thanks a lot!
>>
>> Akira
>>
>>
>> On 19 March 2012 at 01:57, Justin Clark-Casey <jjustincc at googlemail.com> wrote:
>>
>>
>> It's quite possible the 3rd party HTTP server doesn't use the
>> threadpool though I've never looked in detail.
>>
>> You could supply any other http address in the GetTexture cap (e.g.
>> the asset service directly with a suitable
>> handler). However, I'm not sure that asset serving is such a
>> bottleneck at the moment compared with scripting and
>> physics issues.
>>
>>
>> On 17/03/12 21:10, Dahlia Trimble wrote:
>>
>> I've done a bit of tracing through the code and I can't seem to find
>> where the http server in OpenSimulator uses threadpool threads. I did
>> find them used in the LLUDP server and in asynchronous requests from the
>> asset service, but I have yet to find any other uses. Is it possible
>> that the http server is still using system threads, bypassing the
>> threadpool? I'm rather curious as I use the built-in http server in a
>> few personal applications and I'm concerned about performance.
>>
>> On another note, I believe part of the impetus behind LL designing the
>> texture fetch capability was to allow a service separate from the
>> simulator to supply assets to viewers, thereby reducing load on the
>> individual simulator processes. Perhaps this is something OpenSimulator
>> can take advantage of? Some kind of asset proxy cache could probably do
>> a much better job of serving textures and other assets to viewers than
>> the existing monolithic process. I believe it could even be moved to a
>> separate server with a different IP address.
>>
>>
>> On Fri, Mar 16, 2012 at 8:36 PM, Justin Clark-Casey <jjustincc at googlemail.com> wrote:
>>
>> Hi Akira. I have now updated the "show threads" method to show
>> threadpool statistics for the main threadpool. Please note that each
>> XEngine script engine will also have its own threadpool (which can be
>> seen using the "xengine status" command). Might need to improve this
>> further.
>>
>> "show threads" will also show all in-use threads. However, at
>> least on mono 2.6.7 this isn't reported by the VM so
>> won't be shown.
>>
>> I'm not sure whether this will help you or not in tracking down
>> performance issues. In some situations it could help (e.g. if threads
>> are encountering deadlock the number of 'in use' threads will leap up,
>> though you've probably already noticed deadlock by the long-running
>> threads reporting monitoring failures and the sim locking up).
>>
>> So I'd be happy to hear suggestions for additional data and I'll
>> implement them if I can, since I think this is going to be a growing
>> area of concern. Unfortunately pinning down performance issues with a
>> system as complex as OpenSimulator (with massive numbers of threads and
>> user generated scripts) is likely to remain a significant challenge for
>> the foreseeable future.
>>
>>
>> On 11/03/12 19:15, Akira Sonoda wrote:
>>
>> On 10 March 2012 at 03:25, Justin Clark-Casey <jjustincc at googlemail.com> wrote:
>>
>>
>>
>> I'm sorry to say that you'll have to take the ThreadPool numbers with a
>> very very very large pinch of salt. I believe they only refer to the
>> built-in mono thread pool and not the SmartThreadPool which is the one
>> actually used (and beyond that the core simulator and xengine use
>> separate pools). I will try and improve this situation soon.
>>
>>
>> Thank you Justin!
>>
>> Would be nice to have some meaningful statistics for all those
>> ThreadPools! Maybe there is a possibility to write those statistics to
>> the log from time to time (e.g. every 30 seconds). Together with some
>> documented "best practices" from those who operate sims with lots of
>> avatars on them (I'm thinking mainly that the OSgrid Plazas are good
>> references), this could be highly valuable information for those who
>> operate sims for similar purposes (meeting points, parties, concerts,
>> etc.).
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Opensim-dev mailing list
>> Opensim-dev at lists.berlios.de
>> https://lists.berlios.de/mailman/listinfo/opensim-dev
>>
>>
>>
>>
>> --
>> Justin Clark-Casey (justincc)
>> http://justincc.org/blog
>> http://twitter.com/justincc
>>
>>
>>
>>
>> --
>> Justin Clark-Casey (justincc)
>> http://justincc.org/blog
>> http://twitter.com/justincc
>>
>>
>>
>>
>>
>
>
> --
> Justin Clark-Casey (justincc)
> http://justincc.org/blog
> http://twitter.com/justincc
>