Mantis Bug Tracker

ID: 0006435
Project: opensim
Category: [REGION] OpenSim Core
View Status: public
Date Submitted: 2012-11-15 06:47
Last Update: 2014-01-25 10:44
Reporter: kenvc
Assigned To: kenvc
Priority: normal
Severity: minor
Reproducibility: sometimes
Status: closed
Resolution: fixed
Platform: Quad core 2.8 GHz, 8 GB RAM
OS: Windows
OS Version: 7 64 bit pro
Product Version: master (dev code)
Target Version:
Fixed in Version: master (dev code)
Summary: 0006435: Watchdog timeouts for various threads - with no load and no activity in any sims since startup on standalone.
Description: I have one instance running several sims standalone, without hypergrid, on a fast computer. I started the simulator and let it run for several hours with no users logging in and no activity of any kind on any sim. There was no other activity on the computer either. The log file shows how long it had been running.

A few hours later, red watchdog timeout errors start showing up for various threads in the console. See attached section from the log file.
Steps To Reproduce: Start up the sims in a standalone environment and wait.
Additional Information: See the attached error log for more details, but here are a few of the error messages:

2012-11-15 08:03:07,739 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "AsyncLSLCmdHandlerThread". ThreadState=Background, WaitSleepJoin. Last tick was 5553ms ago.
2012-11-15 08:03:10,251 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island A2)". ThreadState=Background, WaitSleepJoin. Last tick was 5803ms ago.
2012-11-15 08:03:10,251 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island A4)". ThreadState=Background, WaitSleepJoin. Last tick was 5803ms ago.
2012-11-15 08:03:10,251 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island A6)". ThreadState=Background, WaitSleepJoin. Last tick was 5803ms ago.
2012-11-15 08:03:10,251 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island A7)". ThreadState=Background, WaitSleepJoin. Last tick was 5803ms ago.
Tags: No tags attached.
Git Revision or version number: r21061
Run Mode: Standalone (Multiple Regions)
Physics Engine: ODE
Environment: .NET / Windows64
Mono Version: None
Viewer: N/A
Attached Files: OpensimError.log (11,726 bytes) 2012-11-15 06:47

-  Notes
(0023136)
kenvc (reporter)
2012-11-15 06:58
edited on: 2012-11-16 08:02

I also occasionally see these same errors in instances running in grid mode connected to OSGrid, so the issue is not limited to standalones. In grid mode, a red MySQL timeout message sometimes appears on the console while the instance is trying to store something, right after one of the messages described earlier in this Mantis.

(0023150)
justincc (administrator)
2012-11-16 17:43

I presume you are not running on an EC2 or similar cloud instance? I have occasionally seen random delays like this on such systems - probably an artifact of virtualization.

If not, is there anything else that could delay the system, such as different processes using a large amount of memory/cpu? Are you using a local database?

Otherwise I don't have a good explanation for these delays - long-lived threads regularly check in with the Watchdog and these errors get logged if they fail to do so for more than 5 seconds.

OSGrid may be a different matter, particularly if you are also seeing MySQL timeouts (though I'm not sure why you would be seeing these from the grid services - are you sure that's not a local region database)?
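
For readers unfamiliar with the check-in mechanism described in this note, here is a minimal sketch of the general pattern. It is written in Python rather than OpenSimulator's C#, and the class and method names (`Watchdog`, `update_thread`, `check`, `format_timeout`) are hypothetical illustrations, not the actual Watchdog code. Only the 5 second threshold and the shape of the message come from this report.

```python
import threading
import time


class Watchdog:
    """Toy heartbeat watchdog; names are hypothetical, not OpenSimulator's API."""

    def __init__(self, timeout_ms=5000, clock=time.monotonic):
        self.timeout_ms = timeout_ms
        self._clock = clock        # injectable so the check can be tested without sleeping
        self._last_tick = {}       # thread name -> time of last check-in, in seconds
        self._lock = threading.Lock()

    def update_thread(self, name):
        """Called regularly by a long-lived thread to report that it is alive."""
        with self._lock:
            self._last_tick[name] = self._clock()

    def check(self):
        """Return (name, ms_since_last_tick) for every thread past the timeout."""
        now = self._clock()
        with self._lock:
            ages = [(name, int((now - t) * 1000)) for name, t in self._last_tick.items()]
        return [(name, age) for name, age in ages if age > self.timeout_ms]


def format_timeout(name, age_ms):
    # Mirrors the shape of the console message quoted in this report.
    return f'[WATCHDOG]: Timeout detected for thread "{name}". Last tick was {age_ms}ms ago.'
```

In this sketch, any thread that goes longer than the timeout between check-ins is reported, whatever the reason for the gap. Whether that matches the cause of the timeouts in this report is not established.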
(0023164)
kenvc (reporter)
2012-11-18 23:16
edited on: 2012-11-20 15:17

Justin.

I am not using a cloud instance, and the database for my standalone setup is local. Nothing else runs on this particular PC that would consume any processing time.

As for the instances on a different PC, which are not standalone and are connected to OSGrid: that PC runs nothing other than other OpenSim instances. I don't see the MySQL timeouts there very often, only every few days or so.

(0023439)
kenvc (reporter)
2013-01-23 09:34

Using the very latest dev master code, this issue seems to now be isolated mainly to Watchdog timeouts related to MapItemRequestThread.

After the standalone instance has started, the following red messages start appearing in the console at regular intervals, even with no one logged in and no one using any other app on the computer:

2013-01-23 11:06:24,450 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island 1)". ThreadState=Background. Last tick was 5804ms ago.
2013-01-23 11:06:24,450 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island 2)". ThreadState=Background, WaitSleepJoin. Last tick was 6053ms ago.
2013-01-23 11:06:24,450 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island 3)". ThreadState=Background, WaitSleepJoin. Last tick was 5866ms ago.
2013-01-23 11:06:24,466 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island 4)". ThreadState=Background. Last tick was 5819ms ago.
2013-01-23 11:06:24,466 ERROR - OpenSim.OpenSim [WATCHDOG]: Timeout detected for thread "MapItemRequestThread (Island 5)". ThreadState=Background, WaitSleepJoin. Last tick was 6068ms ago.
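
To see which thread families dominate in a log like the one quoted above, the lines can be tallied with a small script. This is a minimal sketch in Python that assumes every watchdog line has exactly the format shown in this report; the function names `parse_watchdog_line` and `summarize` are illustrative and not part of OpenSimulator.

```python
import re
from collections import Counter

# Matches the watchdog error lines as they appear in this report.
WATCHDOG_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR - \S+ \[WATCHDOG\]: '
    r'Timeout detected for thread "(?P<thread>[^"]+)"\. '
    r'ThreadState=(?P<state>[^.]+)\. Last tick was (?P<ms>\d+)ms ago\.'
)


def parse_watchdog_line(line):
    """Return a dict of fields for a watchdog timeout line, or None if it does not match."""
    m = WATCHDOG_RE.match(line.strip())
    if m is None:
        return None
    return {"ts": m["ts"], "thread": m["thread"], "state": m["state"], "ms": int(m["ms"])}


def summarize(lines):
    """Count timeouts per thread family, dropping the region name in parentheses."""
    counts = Counter()
    for line in lines:
        rec = parse_watchdog_line(line)
        if rec:
            counts[rec["thread"].split(" (")[0]] += 1
    return counts
```

Passing the lines of a log file to `summarize` gives one count per thread family, for example a single total for all `MapItemRequestThread (...)` entries regardless of region.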
(0023457)
Garmin Kawaguichi (reporter)
2013-01-27 08:25

My opinion is that watchdog messages are meant for devs, not for OpenSimulator users.
So we have changed these log messages to debug level as follows:

in OpenSim\Region\Application\OpenSim.cs around Ln 400 in private void WatchdogTimeoutHandler(Watchdog.ThreadWatchdogInfo twi)
m_log.ErrorFormat(... changed to m_log.DebugFormat(...

in OpenSim\Framework\Monitoring\Watchdog.cs around Ln 333 in private static void WatchdogTimerElapsed(object sender, System.Timers.ElapsedEventArgs e)
m_log.WarnFormat(... changed to m_log.DebugFormat(...

This does not correct the problems that cause watchdogs, but improves the readability of the console.

GCI
(0023464)
justincc (administrator)
2013-01-28 19:50

Actually, I would say that these messages are important to everybody since they indicate when a thread is not fulfilling its task for some reason, whether that's due to server overloading, network issues or a bug in OpenSimulator code.

The unhelpful scenario is when these occur due to timing issues with the watchdog code itself, which I don't believe is what is happening here.
(0023474)
Emperor (reporter)
2013-01-30 15:47

Justin, I got that error too. In fact, I have had people complain that textures weren't loading correctly, and when I compare the time of the complaints with the logs, I find the timeout error at the corresponding time. I even tested this by loading an OAR, for the fun of it, and found the same problem. I am not sure how these are related, as it does not seem that they should be. It might, however, be something to explore further. I wonder whether it is specific to Windows or whether it affects others on Linux as well.
(0024163)
kenvc (reporter)
2013-06-29 17:58

This issue appears to have improved somewhat since I first reported it, but it still happens at least a few times in every day-long session of an instance, even with no activity going on in that instance.
(0024298)
kenvc (reporter)
2013-08-28 21:54

As of dev master r/23558, most of these timeout issues appear to have been significantly reduced almost everywhere, but I still see them on the 3 mega regions I currently have running. The Watchdog heartbeat timeout messages appear in the log file one right after the other, one for each sim in the mega.
(0025060)
kenvc (reporter)
2014-01-25 10:40

I assume this has been resolved, at least as well as it can be. I still see it on large megas, especially if there are many prims on them.

- Issue History
Date Modified Username Field Change
2012-11-15 06:47 kenvc New Issue
2012-11-15 06:47 kenvc File Added: OpensimError.log
2012-11-15 06:58 kenvc Note Added: 0023136
2012-11-15 07:05 kenvc Note Edited: 0023136
2012-11-16 08:02 kenvc Note Edited: 0023136
2012-11-16 17:43 justincc Note Added: 0023150
2012-11-18 23:16 kenvc Note Added: 0023164
2012-11-20 15:17 kenvc Note Edited: 0023164
2013-01-23 09:34 kenvc Note Added: 0023439
2013-01-27 08:25 Garmin Kawaguichi Note Added: 0023457
2013-01-28 19:50 justincc Note Added: 0023464
2013-01-30 15:47 Emperor Note Added: 0023474
2013-05-13 16:33 kenvc Status new => confirmed
2013-06-29 17:58 kenvc Note Added: 0024163
2013-08-28 21:54 kenvc Note Added: 0024298
2014-01-25 10:40 kenvc Note Added: 0025060
2014-01-25 10:40 kenvc Status confirmed => resolved
2014-01-25 10:40 kenvc Fixed in Version => master (dev code)
2014-01-25 10:40 kenvc Resolution open => fixed
2014-01-25 10:40 kenvc Assigned To => kenvc
2014-01-25 10:44 kenvc Status resolved => closed

