Mantis Bug Tracker

ID: 0005929
Project: opensim
Category: [REGION] OpenSim Core
View Status: public
Date Submitted: 2012-03-12 14:27
Last Update: 2012-03-16 20:46
Reporter: justincc
Assigned To: justincc
Priority: high
Severity: crash
Reproducibility: sometimes
Status: closed
Resolution: fixed
Platform:
OS:
OS Version:
Product Version: master (dev code)
Target Version: master (dev code)
Fixed in Version:
Summary: 0005929: OpenSimulator freezes under heavy object rez/deletion loads
Description: OpenSimulator freezes under a heavy object rez/deletion load. This also occurs on 0.7.2-post-fixes, so the issue has likely been around for some time.
Steps To Reproduce: Create a region with many objects that rez/delete other objects containing scripts on touch. Use pCampbot to launch 10 bots on that sim with the grabbing behaviour.

After some period, either

1) scripts will freeze. The console command "xengine status" will show that the number of 'in use' threads is constantly increasing. On SIGQUIT the dump will contain a large number of the following

"STP SmartThreadPool Thread 0000158" tid=0x0x7ff214013700 this=0x0x7ff1fad76ca8 thread handle 0x75e state : interrupted state owns ()
  at OpenSim.Region.Framework.Scenes.Scene.DeleteSceneObject (OpenSim.Region.Framework.Scenes.SceneObjectGroup,bool) [0x000b0] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/Framework/Scenes/Scene.cs:2036
  at OpenSim.Region.Framework.Scenes.Scene.DeleteSceneObject (OpenSim.Region.Framework.Scenes.SceneObjectGroup,bool) [0x0000d] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/Framework/Scenes/Scene.cs:2004
  at OpenSim.Region.ScriptEngine.Shared.Instance.ScriptInstance.EventProcessor () [0x0033b] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/ScriptEngine/Shared/Instance/ScriptInstance.cs:813
  at OpenSim.Region.ScriptEngine.XEngine.XEngine.ProcessEventHandler (object) [0x0001d] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/ScriptEngine/XEngine/XEngine.cs:1281
  at Amib.Threading.Internal.WorkItem.ExecuteWorkItem () [0x00038] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/ThirdParty/SmartThreadPool/WorkItem.cs:347
  at Amib.Threading.Internal.WorkItem.Execute () [0x00026] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/ThirdParty/SmartThreadPool/WorkItem.cs:298
  at Amib.Threading.SmartThreadPool.ExecuteWorkItem (Amib.Threading.Internal.WorkItem) [0x00011] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/ThirdParty/SmartThreadPool/SmartThreadPool.cs:690
  at Amib.Threading.SmartThreadPool.ProcessQueuedItems () [0x00119] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/ThirdParty/SmartThreadPool/SmartThreadPool.cs:612
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <IL 0x0001c, 0x00051>

This is an illogical place to freeze (the end of a method).

2) scripts and the incoming packet handler both stop. As before, the number of 'in use' threads by xengine will keep increasing. The "show threads" command will show that the incoming packet handler has stopped reporting. A SIGQUIT dump will show entries like

2012-03-12 20:38:41,205 ERROR - OpenSim.Region.ClientStack.LindenUDP.LLUDPServer Recursive read lock can only be aquired in SupportsRecursion mode
System.Threading.LockRecursionException: Recursive read lock can only be aquired in SupportsRecursion mode
  at System.Threading.ReaderWriterLockSlim.TryEnterReadLock (Int32 millisecondsTimeout) [0x00000] in <filename unknown>:0
  at System.Threading.ReaderWriterLockSlim.EnterReadLock () [0x00000] in <filename unknown>:0
  at OpenMetaverse.DoubleDictionary`3[OpenMetaverse.UUID,System.UInt32,OpenSim.Region.Framework.Scenes.EntityBase].TryGetValue (UInt32 key, OpenSim.Region.Framework.Scenes.EntityBase& value) [0x00000] in <filename unknown>:0
  at OpenSim.Region.Framework.Scenes.EntityManager.TryGetValue (UInt32 key, OpenSim.Region.Framework.Scenes.EntityBase& obj) [0x00000] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/Framework/Scenes/EntityManager.cs:145
  at OpenSim.Region.Framework.Scenes.SceneGraph.GetGroupByPrim (UInt32 localID) [0x00000] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/Framework/Scenes/SceneGraph.cs:851
  at OpenSim.Region.Framework.Scenes.SceneGraph.GetSceneObjectPart (UInt32 localID) [0x00000] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/Framework/Scenes/SceneGraph.cs:1014
  at OpenSim.Region.Framework.Scenes.Scene.GetSceneObjectPart (UInt32 localID) [0x00000] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/Framework/Scenes/Scene.cs:4294
  at OpenSim.Region.Framework.Scenes.Scene.ProcessObjectGrab (UInt32 localID, Vector3 offsetPos, IClientAPI remoteClient, System.Collections.Generic.List`1 surfaceArgs) [0x00000] in <filename unknown>:0
  at OpenSim.Region.ClientStack.LindenUDP.LLClientView.HandleObjectGrab (IClientAPI sender, OpenMetaverse.Packets.Packet Pack) [0x000ef] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/ClientStack/Linden/UDP/LLClientView.cs:6966
  at OpenSim.Region.ClientStack.LindenUDP.LLClientView.ProcessPacketMethod (OpenMetaverse.Packets.Packet packet) [0x0004e] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/ClientStack/Linden/UDP/LLClientView.cs:654
  at OpenSim.Region.ClientStack.LindenUDP.LLClientView.ProcessInPacket (OpenMetaverse.Packets.Packet packet) [0x00107] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/ClientStack/Linden/UDP/LLClientView.cs:11733
  at OpenSim.Region.ClientStack.LindenUDP.LLUDPServer.ProcessInPacket (System.Object state) [0x0004a] in /home/justincc/jc/it/v/virtual-environments/specific/second-life/servers/opensim/src/opensim-git/OpenSim/Region/ClientStack/Linden/UDP/LLUDPServer.cs:1336

Here, all threads are waiting for the ReaderWriterLockSlim but no thread actually holds it. (Ignore the "Recursive read lock..." message; it may well be related to the problem, but I think it is chiefly an artifact of the Mono thread dump.)
Tags: No tags attached.
Git Revision or version number: 25592bb
Run Mode: Standalone (1 Region)
Physics Engine: ODE
Environment: Mono / Linux64
Mono Version: 2.6.3
Viewer:
Attached Files:
OpenMetaverseTypes.dll (105,984 bytes) 2012-03-12 14:28
jc.no-stop-thread-abort.patch (622 bytes) 2012-03-13 17:35

- Relationships

-  Notes
(0021083)
justincc (administrator)
2012-03-12 14:30

The attached rebuilt OpenMetaverseTypes.dll may help. It allows threads to lock ReaderWriterLockSlim recursively. In principle, this should make no difference, yet it hasn't failed in tests so far, though I expect it to.

This DLL also unlocks ReaderWriterLockSlim in exception catch blocks as well as in finally blocks. However, this has already been shown to have no impact by itself (the failures still occur).
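
For readers unfamiliar with the distinction, the difference between a non-recursive and a recursive lock can be sketched in a few lines. This is only an illustration in Python, not the C# code in OpenMetaverseTypes.dll: `GuardedDict` and its methods are hypothetical names, `threading.Lock` stands in for a lock that does not support recursion, and `threading.RLock` stands in for one that does. A short timeout replaces the exception that ReaderWriterLockSlim raises, so the sketch cannot hang.

```python
import threading


class GuardedDict:
    """Hypothetical lock-protected lookup table (illustration only)."""

    def __init__(self, lock):
        self._lock = lock
        self._items = {}

    def add(self, key, value):
        self._lock.acquire()
        try:
            self._items[key] = value
        finally:
            # Release in finally so an exception cannot leave the lock held.
            self._lock.release()

    def try_get(self, key):
        # Use a short timeout instead of blocking forever.
        if not self._lock.acquire(timeout=0.1):
            raise RuntimeError("lock is already held")
        try:
            return self._items.get(key)
        finally:
            self._lock.release()

    def try_get_nested(self, key):
        # Takes the lock, then calls try_get, which takes it again
        # on the same thread (a recursive acquisition).
        self._lock.acquire()
        try:
            return self.try_get(key)
        finally:
            self._lock.release()
```

With `GuardedDict(threading.Lock())`, `try_get_nested` raises because the same thread cannot take the lock twice; with `GuardedDict(threading.RLock())` the nested call succeeds. In both cases the `finally` blocks leave the lock free afterwards.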
(0021093)
justincc (administrator)
2012-03-13 17:37

Further testing reveals that the updated OpenMetaverseTypes.dll does nothing.

The jc.no-stop-thread-abort.patch temporarily comments out thread aborting where scripts do not stop in a timely fashion and there is an event waiting or in progress. This is certainly not a permanent solution but does resolve the current problem.

The current problem is most probably connected with having multiple scripts in the dying linkset.
(0021121)
justincc (administrator)
2012-03-16 20:46

Somewhat resolved by git master 2f81e53, though llDie() could still cause problems in linksets containing multiple scripts where those other scripts have long running events (more than 1 second of execution time).

So ultimately there has to be a better solution. Maybe even an improvement to Mono itself, since the root cause of this bug appears to be ThreadAbortExceptions leaving locks unreleased due to premature exit from finally sections, which should not occur according to the .NET spec. This has been observed on Mono 2.6.7 and in the 2.10 series.
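
The suspected failure mode can be imitated in a small sketch. This is Python rather than C#, and it does not reproduce the Mono behaviour: the skipped release is simulated with a flag (`skip_release`), and all function names here are hypothetical. It only shows the consequence described in this report, where a lock that was never released leaves every later thread waiting even though the thread that took it has already exited.

```python
import threading


def worker(lock, skip_release):
    """Take the lock and exit.

    skip_release=True simulates a finally section that was exited
    before it could release the lock (an assumption for illustration).
    """
    lock.acquire()
    try:
        pass  # critical section would go here
    finally:
        if not skip_release:
            lock.release()


def count_blocked_waiters(lock, waiters=3, timeout=0.2):
    """Start several threads that need the lock; count those that time out."""
    blocked = []

    def wait():
        if lock.acquire(timeout=timeout):
            lock.release()
        else:
            blocked.append(threading.current_thread().name)

    threads = [threading.Thread(target=wait) for _ in range(waiters)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(blocked)


def run(skip_release):
    lock = threading.Lock()
    owner = threading.Thread(target=worker, args=(lock, skip_release))
    owner.start()
    owner.join()  # the owning thread is gone by the time waiters start
    return count_blocked_waiters(lock)
```

Calling `run(False)` leaves no waiter blocked, while `run(True)` leaves all of them timing out on a lock that no live thread holds.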

- Issue History
Date Modified Username Field Change
2012-03-12 14:27 justincc New Issue
2012-03-12 14:27 justincc Status new => assigned
2012-03-12 14:27 justincc Assigned To => justincc
2012-03-12 14:28 justincc File Added: OpenMetaverseTypes.dll
2012-03-12 14:30 justincc Note Added: 0021083
2012-03-13 17:35 justincc File Added: jc.no-stop-thread-abort.patch
2012-03-13 17:37 justincc Note Added: 0021093
2012-03-16 20:46 justincc Note Added: 0021121
2012-03-16 20:46 justincc Status assigned => closed
2012-03-16 20:46 justincc Resolution open => fixed

