Mantis Bug Tracker

ID: 0007888
Project: opensim
Category: [REGION] Specific OpenSim Module
View Status: public
Date Submitted: 2016-04-22 13:18
Last Update: 2017-07-31 16:41
Reporter: colosi
Assigned To: colosi
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: MacOS
OS: OS X Yosemite
OS Version: 10.10.5
Product Version: master (dev code)
Target Version:
Fixed in Version:
Summary: 0007888: Mesh objects not rendering due to thread safety issue in SimulatorFeaturesHelper.ShouldSend(AgentID) function
Description: Multiple requests to retrieve the simulator features can be initiated at the same time on multiple threads. Modules appending to the features are instructed to use the SimulatorFeaturesHelper.ShouldSend() function to determine when they should act. This function uses a static cache which every thread accesses. When multiple threads access the cache at the same time, it can delay processing or cause an infinite loop which never finishes. It is unclear exactly when this impacts mesh rendering, but it appears that the existence of mesh objects creates many additional calls to OnSimulatorFeaturesRequest at login.
Steps To Reproduce: This symptom was seen by a user testing the GloebitMoneyModule (GMM) beta with mesh on his regions. It is unclear why this combination revealed the underlying issue, which had not been noticed previously. To reproduce, install the GMM and enable it. Place mesh objects on your region. Log into that region and the mesh objects will not be visible. If you check the logs, you will see the slow-handling alerts quoted under Additional Information.

This was originally reported by someone running a public grid on 0.8.2.1 release (497f575b9ba041f271ab2738344d016e5dad0c3e) and it has been reproduced on master dev, commit 7d8b783d31b3eeab70ce6f29228a0c842cf4c909.

I have seen this using FireStorm. It was also reported as occurring in Singularity.
Additional Information: While a single OnSimulatorFeaturesRequest trigger runs on a single thread and, to the best of my knowledge, sequentially, multiple calls to this event at the same time will run in parallel on separate threads. The SimulatorFeaturesHelper class uses a static cache (AgentID: (RegionID: bool)) which every thread hits. When multiple threads attempt to access this cache at the same time (most likely when two threads attempt to add to it at the same time), processing of the event is delayed and, I believe, can end up in an infinite loop.
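
The following is a minimal sketch of the kind of guard that would avoid two threads adding to the shared cache at once. It is written in Python rather than OpenSim's C# for brevity. The class and function names, and the "True only on the first request" semantics, are assumptions for illustration and are not the actual SimulatorFeaturesHelper code; the only detail taken from this report is the nested (AgentID: (RegionID: bool)) shape of the cache.

```python
import threading


class ShouldSendCache:
    """Hypothetical stand-in for a shared cache shaped AgentID -> RegionID -> bool."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = {}  # agent_id -> {region_id: bool}

    def should_send(self, agent_id, region_id):
        """Return True only for the first request seen for (agent_id, region_id).

        Assumed semantics, for illustration only. The lookup and the insert
        happen under one lock, so two threads cannot both add the same entry.
        """
        with self._lock:
            regions = self._seen.setdefault(agent_id, {})
            if region_id in regions:
                return False
            regions[region_id] = True
            return True


def count_first_hits(cache, agent_id, region_id, workers=8):
    """Call should_send from several threads and count how many got True."""
    results = []
    results_lock = threading.Lock()

    def worker():
        hit = cache.should_send(agent_id, region_id)
        with results_lock:
            results.append(hit)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True)
```

In C# the same idea would be a lock statement around the lookup-and-add, or a concurrent collection in place of the plain one.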

Occasionally we get a crash specifically at line 134 in SFH, at the call rsendlist.Add(rsend). I have seen this in logs from the 0.8.2.1 release, but have not yet reproduced it in dev master. When it occurs, it produces the following stack trace:
2016-04-21 06:34:11,354 ERROR (Threadpool worker) - OpenSim.Framework.Servers.HttpServer.BaseHttpServer [BASE HTTP SERVER]: HandleRequest() threw exception
System.IndexOutOfRangeException: Index was outside the bounds of the array.
  at System.Collections.Generic.List`1[T].Add (System.Collections.Generic.T item) <0x4699e0e0 + 0x0004f> in <filename unknown>:0
  at OpenSim.Region.OptionalModules.ViewerSupport.SimulatorFeaturesHelper.ShouldSend (UUID agentID) <0x46920520 + 0x003d6> in <filename unknown>:0
  at Gloebit.GloebitMoneyModule.GloebitMoneyModule+<RegionLoaded>c__AnonStorey0.<>m__0 (UUID agentID, OpenMetaverse.StructuredData.OSDMap& features) <0x469202e0 + 0x0003b> in <filename unknown>:0
  at (wrapper delegate-invoke) <Module>:invoke_void_UUID_OSDMap& (OpenMetaverse.UUID,OpenMetaverse.StructuredData.OSDMap&)
  at OpenSim.Region.ClientStack.Linden.SimulatorFeaturesModule.HandleSimulatorFeaturesRequest (System.Collections.Hashtable mDhttpMethod, UUID agentID) <0x4691fa40 + 0x001d9> in <filename unknown>:0
  at OpenSim.Region.ClientStack.Linden.SimulatorFeaturesModule+<RegisterCaps>c__AnonStorey4.<>m__7 (System.Collections.Hashtable x) <0x4691f9e0 + 0x0003f> in <filename unknown>:0
  at OpenSim.Framework.Servers.HttpServer.RestHTTPHandler.Handle (System.String path, System.Collections.Hashtable request) <0x4691f960 + 0x00063> in <filename unknown>:0
  at OpenSim.Framework.Servers.HttpServer.BaseHttpServer.HandleRequest (OpenSim.Framework.Servers.HttpServer.OSHttpRequest request, OpenSim.Framework.Servers.HttpServer.OSHttpResponse response) <0x46851730 + 0x009b8> in <filename unknown>:0

We have seen the following log alerts every time: 18:51:19 - [LOGHTTP]: Slow handling of 33 GET /CAPS/9830684a-9ab1-46a9-990a-74b23da91f8b SimulatorFeatures 4a9c0fc4-00da-4b6f-812d-db21f241a639 from 10.7.105.6:64885 took 10840ms

Sometimes, upon exiting the process, we will see the following errors in the log: 19:26:27 - [WATCHDOG]: Timeout detected for thread "MapBlockSendThread (NormalTown)". ThreadState=Stopped. Last tick was 552281ms ago.

Tags: No tags attached.
Git Revision or version number: 7d8b783d31b3eeab70ce6f29228a0c842cf4c909
Run Mode: Standalone (Multiple Regions)
Physics Engine: BasicPhysics
Environment: Mono / OSX
Mono Version: Other
Viewer: FireStorm
Attached Files:

Relationships

Notes
(0030208)
colosi (reporter)
2016-04-23 18:27

Made some new discoveries. The SimulatorFeaturesModule seems like it is torn between a shared and non-shared module. It is non-shared, but rather than keep a list of scenes, it only stores the last added scene, and instead of keeping an event delegate list for each scene, it has one list. I had assumed otherwise and the GMM was subscribing a delegate to the OnSimulatorFeaturesRequest for every region it was enabled on. I discovered that for one call to OnSimulatorFeaturesRequest, we were receiving multiple calls into our delegate. I have corrected this in the GMM and mesh objects now render properly, but I still believe there is a bug in this code as it is not thread safe.
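
To illustrate the duplicate-call effect described in this note, here is a small sketch in Python (not OpenSim's C#). The event class and its method names are hypothetical; it only models the situation where a single handler list is shared by all regions and a module subscribes the same delegate once per region.

```python
class FeaturesRequestEvent:
    """Hypothetical event with ONE handler list shared by every region."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        # Naive: subscribing once per region registers the handler repeatedly.
        self._handlers.append(handler)

    def subscribe_once(self, handler):
        # Guarded: ignore a handler that is already registered.
        if handler not in self._handlers:
            self._handlers.append(handler)

    def fire(self, agent_id, features):
        # Iterate over a copy so handlers may safely change subscriptions.
        for handler in list(self._handlers):
            handler(agent_id, features)


def calls_per_request(subscribe_method_name, region_count=3):
    """Subscribe the same handler once per region, fire once, count the calls."""
    event = FeaturesRequestEvent()
    calls = []

    def handler(agent_id, features):
        calls.append(agent_id)

    subscribe = getattr(event, subscribe_method_name)
    for _ in range(region_count):
        subscribe(handler)
    event.fire("agent-1", {})
    return len(calls)
```

With three regions, calls_per_request("subscribe") reports three handler calls for one request, while calls_per_request("subscribe_once") reports one.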
(0030209)
Mata Hari (reporter)
2016-04-24 04:26

Is this potentially the same cause as the issue where, when someone wearing TPs into a region, some of the avatars already present in the region may fail to display it? (http://opensimulator.org/mantis/view.php?id=7403)
(0030229)
colosi (reporter)
2016-04-26 12:46

I have modified the GMM to no longer use ShouldSend and I plan to submit a patch for a modification to the SimulatorFeaturesModule to pass the regionID to the OnSimulatorFeaturesRequest event. However, I still believe there are thread safety issues with the ShouldSend function for anything continuing to use it.
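
As a rough sketch of why passing the region ID with the event removes the need for a shared cache, consider the following Python example (not OpenSim's C#). The handler signature, the function names, and the feature key are all hypothetical; the actual patch is not shown in this report.

```python
def make_features_handler(enabled_region_ids):
    """Build a handler for a module that is enabled on only some regions."""
    enabled = frozenset(enabled_region_ids)  # immutable, safe to read from any thread

    def on_features_request(agent_id, region_id, features):
        # region_id arrives with the event, so no per-agent lookup is needed.
        if region_id in enabled:
            features["ExampleModuleEnabled"] = True  # hypothetical feature key

    return on_features_request
```

Because the handler only reads an immutable set and writes to the features map it was handed, it keeps no state that concurrent requests could corrupt.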
(0030232)
Mandarinka Tasty (reporter)
2016-04-26 13:44

Hi Colosi.

Your analysis and proposed patch look interesting :) I do have a question, though:

How many regions do you run in one simulator process (one instance)?

Do you run, say, 2 or more regions in one instance?

If you put every region in its own sim process, do you experience similar issues?
(0030233)
colosi (reporter)
2016-04-26 14:39

Mandarinka,
I believe that a single process can run up to 16 regions/scenes. I tested with 3 regions on a single process. One of our beta testers had more than 3, but I don't know the specific number.

If you had each process only run a single region, then it is possible that you would not have any problems. Regarding the patch I submitted, it would be unnecessary because for a shared region module, you would never have to wonder which region it referred to, as there is only one for the process. In regards to thread safety with the ShouldSend call, it would absolutely be less likely, but I'm not sure if it would disappear. You wouldn't get a conflict with the same agent, but you could get multiple events overlapping on threads when multiple agents log in at the same time. I'm not sure if that would cause the same thread issues or not as the Add call is done on the inner map (by agent id). This is all moot though, as you should be able to run multiple regions on the same OpenSim process without running into threading issues.
(0030235)
Mandarinka Tasty (reporter)
2016-04-26 15:21

Colosi, I admit that after reading your patch I had the feeling that you run more than one region on that instance.

I agree that, in theory, we can run many regions on one instance.

But in practice, regardless of your interesting patch and proposed solution for ShouldSend, the assumption that we run more than one region on one instance is purely theoretical, given the common knowledge that such a setup does not perform well.

I mention this because I know what you are working on: the GMM project.

The target users of the GMM are grid owners, mostly with commercial aims.

And I personally really doubt that they run more than one region on one instance, especially if that region is GMM-enabled for business purposes.

I was always warned not to run more than one region on one process, unless I do not care about performance.

But those are purely my own thoughts. In theory, you are right: we should be able to run multiple regions on the same process.
(0030236)
colosi (reporter)
2016-04-26 15:26

Mandarinka,

I appreciate the feedback regarding likely ways people will run their grids if they are enabling commerce. I'll reach out to our beta testers to ask how they generally manage their grids. We definitely have beta testing grids running multiple regions with the GMM enabled on the same sim process, but they may only be doing so for thorough testing purposes. Regardless, it's still a scenario for which we'd like our system to work so that we're as broad a solution as possible.

Thanks,
--Chris
(0031019)
UbitUmarov (administrator)
2016-08-13 17:41

" Modules appending to the features are instructed to use the SimulatorFeaturesHelper.ShouldSend() func to determine when they should act"

Where did you find this instruction?
(0031020)
UbitUmarov (administrator)
2016-08-13 20:36

ShouldSend() was added as part of some rarely used modules.
It should not be used elsewhere, in fact not even in those,
so I just removed it completely.

In master, SimulatorFeaturesModule is a NonShared module because I did notice those issues and others while testing the "big merge" in the several-regions-per-instance case.
By original design, all regions in the same instance should have the same features,
so there was no provision to say which region a request was for...
(Sorry, no relation to 7403.)
(0031021)
UbitUmarov (administrator)
2016-08-13 20:49

minor correction, it wasn't me who made the module NonShared :)
(0031030)
colosi (reporter)
2016-08-15 10:44

I'm pretty sure I was given this instruction from someone in the OpenSim-dev IRC channel.

I'm also pretty sure I removed the use of ShouldSend from the Gloebit Money Module as it was not actually doing what I was told it would do.

It is reasonable to make a rule that all regions on the same sim should have the same features and add-on modules. If this is the intended future, it should be documented somewhere. At the moment, we have designed the GMM so that it can be enabled on some regions and not on others on the same sim, and so that it does not interfere with a region on the same sim that has a different money module enabled. During information gathering, we were told that some people would want to enable a money module on a per-region basis. However, those users may only run one region per sim and may have been conflating per-region with per-sim.
(0031031)
UbitUmarov (administrator)
2016-08-15 13:59

That rule should be assumed i.e.

- All regions on same instance should report same simulator features.

Even if current code does seem to imply otherwise. (and some optional modules do seem to do it otherwise)
That's something under revision (those modules may be changed).
(0032230)
colosi (reporter)
2017-07-31 16:41

ShouldSend was removed. This was causing the problem.

Issue History
Date Modified     Username          Field                Change
2016-04-22 13:18  colosi            New Issue
2016-04-23 18:27  colosi            Note Added: 0030208
2016-04-24 04:26  Mata Hari         Note Added: 0030209
2016-04-26 12:46  colosi            Note Added: 0030229
2016-04-26 13:44  Mandarinka Tasty  Note Added: 0030232
2016-04-26 14:39  colosi            Note Added: 0030233
2016-04-26 15:21  Mandarinka Tasty  Note Added: 0030235
2016-04-26 15:26  colosi            Note Added: 0030236
2016-08-13 17:41  UbitUmarov        Note Added: 0031019
2016-08-13 20:36  UbitUmarov        Note Added: 0031020
2016-08-13 20:49  UbitUmarov        Note Added: 0031021
2016-08-15 10:44  colosi            Note Added: 0031030
2016-08-15 13:59  UbitUmarov        Note Added: 0031031
2017-07-31 16:41  colosi            Note Added: 0032230
2017-07-31 16:41  colosi            Status               new => resolved
2017-07-31 16:41  colosi            Resolution           open => fixed
2017-07-31 16:41  colosi            Assigned To          => colosi

