MantisBT - opensim
View Issue Details
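ID: 0008465
Project: opensim
Category: [REGION] OpenSim Core
View Status: public
Date Submitted: 2019-01-28 02:47
Last Update: 2019-02-05 12:27
Reporter: mewtwo0641
Assigned To: mewtwo0641
Priority: normal
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed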
 
Fixed in Version: master (dev code)
Run Mode: Grid (Multiple Regions per Sim)
Physics Engine: ubODE
Environment: .NET / Windows64
None
0008465: Nothing loads, nothing rezzes, and can't move
As of master commit f10766 nothing loads, nothing rezzes, and the user can't move upon login. All one can see is a blank region and the base avatar (attachments won't load either). Inventory appears to load but nothing else can be done.

Tried relogging but after logging out it's impossible to log back in without restarting OpenSim.
Git Bisect Results:

4c79a85621f9aa37a327239dce9690100708dc84 is the first bad commit
commit 4c79a85621f9aa37a327239dce9690100708dc84
Author: UbitUmarov <ajlduarte@sapo.pt>
Date: Mon Jan 28 03:37:54 2019 +0000

    forgotten locks on ubode

:040000 040000 0e906c56f1c477dc299ff6c0db260fa73f07eca4 fce0d26cb7be8bcc01c76ad0988863355bcefa60 M OpenSim
No tags attached.
Issue History
Date Modified | Username | Change
2019-01-28 02:47 | mewtwo0641 | New Issue
2019-01-28 06:27 | tampa | Note Added: 0033968
2019-01-28 07:10 | Monamusa Kaliopov | Note Added: 0033969
2019-01-28 07:10 | Monamusa Kaliopov | Note Edited: 0033969
2019-01-28 07:29 | mewtwo0641 | Note Added: 0033970
2019-01-28 07:34 | Monamusa Kaliopov | Note Added: 0033971
2019-01-28 07:35 | Monamusa Kaliopov | Note Edited: 0033971
2019-01-28 07:36 | Monamusa Kaliopov | Note Edited: 0033971
2019-01-28 07:38 | Monamusa Kaliopov | Note Edited: 0033971
2019-01-28 07:44 | Monamusa Kaliopov | Note Edited: 0033971
2019-01-28 07:46 | mewtwo0641 | Note Added: 0033972
2019-01-28 07:47 | mewtwo0641 | Note Edited: 0033972
2019-01-28 07:51 | paela argus | Note Added: 0033973
2019-01-28 07:58 | paela argus | Note Added: 0033974
2019-01-28 08:03 | Monamusa Kaliopov | Note Added: 0033975
2019-01-28 08:31 | tampa | Note Added: 0033976
2019-01-28 08:31 | tampa | Note Edited: 0033976
2019-01-28 08:34 | tampa | Note Edited: 0033976
2019-01-28 13:15 | paela argus | Note Added: 0033979
2019-01-28 13:17 | paela argus | Note Added: 0033980
2019-01-28 13:20 | BillBlight | Note Added: 0033981
2019-01-28 13:21 | paela argus | Note Added: 0033982
2019-01-28 13:23 | paela argus | Note Added: 0033983
2019-01-28 13:28 | BillBlight | Note Added: 0033984
2019-01-28 17:35 | mewtwo0641 | Note Added: 0033985
2019-01-28 20:15 | mewtwo0641 | Additional Information Updated
2019-01-28 20:18 | mewtwo0641 | Note Added: 0033986
2019-01-28 21:06 | mewtwo0641 | Note Added: 0033987
2019-01-28 21:15 | Monamusa Kaliopov | Note Added: 0033988
2019-01-28 21:15 | Monamusa Kaliopov | Note Edited: 0033988
2019-01-28 21:23 | Monamusa Kaliopov | Note Added: 0033989
2019-01-28 22:00 | Monamusa Kaliopov | Note Added: 0033992
2019-01-28 22:19 | Monamusa Kaliopov | Note Added: 0033993
2019-01-28 22:20 | Monamusa Kaliopov | Note Edited: 0033993
2019-01-28 22:21 | Monamusa Kaliopov | Note Edited: 0033993
2019-01-28 22:47 | tampa | Note Added: 0033995
2019-01-28 22:49 | Monamusa Kaliopov | Note Added: 0033996
2019-01-28 22:51 | Monamusa Kaliopov | Note Edited: 0033996
2019-01-28 22:52 | Monamusa Kaliopov | Note Edited: 0033996
2019-01-28 22:59 | Monamusa Kaliopov | Note Added: 0033997
2019-01-28 23:36 | Monamusa Kaliopov | Note Edited: 0033997
2019-01-29 00:01 | Monamusa Kaliopov | Note Added: 0033999
2019-01-29 00:02 | Monamusa Kaliopov | Note Edited: 0033999
2019-01-29 01:04 | Monamusa Kaliopov | Note Added: 0034002
2019-01-29 06:13 | UbitUmarov | Note Added: 0034014
2019-01-29 18:10 | mewtwo0641 | Note Added: 0034016
2019-02-03 18:01 | mewtwo0641 | Status: new => resolved
2019-02-03 18:01 | mewtwo0641 | Fixed in Version: => master (dev code)
2019-02-03 18:01 | mewtwo0641 | Resolution: open => fixed
2019-02-03 18:01 | mewtwo0641 | Assigned To: => mewtwo0641
2019-02-05 12:27 | BillBlight | Note Added: 0034305
2019-02-05 12:27 | BillBlight | Status: resolved => closed

Notes
(0033968)
tampa   
2019-01-28 06:27   
Unable to reproduce, you sure this is not an issue elsewhere?
(0033969)
Monamusa Kaliopov   
2019-01-28 07:10   
Have you tested the newest master? Ubit fixed a lot this morning.

(0033970)
mewtwo0641   
2019-01-28 07:29   
Yes, testing on the very latest master. I hadn't made any changes to anything in my setup except to pull the latest commits and compile. I'll try again and verify whether it was just some weird oddity.
(0033971)
Monamusa Kaliopov   
2019-01-28 07:34   
(edited on: 2019-01-28 07:44)
I have also had problems with the region, including the console, freezing regardless of which build I use (OSGrid or master). Even if your problem is slightly different, parts of it remind me of mine.

I did not change the configuration either, yet the problems appeared.

After Ubit's last patch the region has run for over 6 hours without freezing; I hope that patch fixed the problem.

One more thing I noticed: I have a grid lamp on the region in which I had never seen fire. One day after Ubit's last patch there is fire inside it. My eyes get bigger every hour :)

(0033972)
mewtwo0641   
2019-01-28 07:46   
(edited on: 2019-01-28 07:47)
Tried commit f10766 (Very latest commit as of this writing) again and it is still an issue for me. Went back to commit 695d80 (January 26, 2019) and all is fine again.

One thing that I have noticed though is that once all the regions are loaded I get a bunch of watchdog messages saying that the threads have timed out for each region. CPU use is very low, hovering between 0 percent and 3 percent for OpenSim.exe process and 0 percent - 5 percent overall CPU use. RAM use is low-moderate at appx 600 MB for OpenSim.exe and the overall system RAM use is only appx 40 percent.

It almost seems as if the region threads just died and went unresponsive after loading up.

Watchdog Messages:

09:36:07 - [WATCHDOG]: Timeout detected for thread "Heartbeat-(Test_Region_2)".
ThreadState=WaitSleepJoin. Last tick was 20375ms ago.
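
One way to see whether these timeouts recur for the same threads is to count them per thread name in the simulator log. The following is only a rough sketch: the function name is hypothetical, the log file name in the usage comment is an assumption, and the pattern is taken from the message format quoted above.

```shell
#!/bin/sh
# Count "[WATCHDOG]: Timeout detected" lines per thread name in a log file.
# Usage: count_watchdog_timeouts OpenSim.log   (log file name is an assumption)
count_watchdog_timeouts() {
    # Keep only watchdog timeout lines, extract the quoted thread name,
    # then print one "count name" line per thread, most frequent first.
    grep -F '[WATCHDOG]: Timeout detected for thread' "$1" |
        sed 's/.*for thread "\([^"]*\)".*/\1/' |
        sort | uniq -c | sort -rn
}
```

A thread that shows up once after startup is a different situation from one whose count keeps growing while the simulator runs.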

(0033973)
paela argus   
2019-01-28 07:51   
If you are on Linux, please add this at the top of your startup script:

export MONO_THREADS_PER_CPU=1024
and let me know whether it is better afterwards.
(0033974)
paela argus   
2019-01-28 07:58   
Monamusa, are you on the latest Debian release? If so, do not use the distribution's mono package; you need to compile mono from source and apply the export. If you need an example, I can give you one in OSGrid.
(0033975)
Monamusa Kaliopov   
2019-01-28 08:03   
@paela

I have compiled mono from source. I will apply the export the next time I restart the simulator or the simulator freezes again.

thank you paela .....
(0033976)
tampa   
2019-01-28 08:31   
(edited on: 2019-01-28 08:34)
@mewtwo0641 That watchdog message is evidently normal; it does not mean the region is stuck. So long as it does not repeat itself constantly it should be fine. I cannot reproduce the problem using the latest master. Where are you testing this? Have you ruled out any other incompatibility with recent changes to master? Such a severe case from such minor changes to the code seems unlikely, and I suspect the issue is elsewhere, in configuration or perhaps in the connection to the grid.

@paela argus Setting that environment variable this high is asking for trouble. It should not exceed 512; anything higher may cause system instability and will degrade performance to the point of system lockups. It is recommended to start OpenSim on Unix using the --desktop parameter for mono, as it configures mono in a more suitable manner. There should be no need to set any other variables or change system configuration.

@Monamusa Kaliopov OSGrid is a rather bad testing environment, as the grid itself performs fairly poorly and does not completely follow standard setup procedures (as otherwise it would not work at all). So when running latest master code, before reporting issues, please test in a sterile environment in either standalone or grid configuration; otherwise there is no way to rule out issues on the grid end.

These recent tickets sound really weird, especially since I have not been able to reproduce them or even so much as see a decline in performance over previous master code. In fact things are working better now. Then again, running on Ubuntu with mono 5.4.1.7 from a tarball rather than compiled, I worry that there is a system incompatibility rearing its ugly head here, or perhaps even a module no longer compatible with master (even though all the ones I use compile fine and seem to work). Please add further information on environment (OS, mono version, startup parameters), configuration and so on, as otherwise this will be impossible to debug.

I have also been told that OSGrid may be having an outage, so after reverting binaries you may be using the cache instead of the actual asset server, which appears to be down at the moment.
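
For reference, the two startup suggestions in this thread (the MONO_THREADS_PER_CPU export and the --desktop switch) could be combined in a launcher roughly like the sketch below. It assumes mono is on PATH and OpenSim.exe is in the current directory; the function names and the THREADS_PER_CPU variable are hypothetical. No thread count is hard-coded, since the notes above disagree on a sensible value.

```shell
#!/bin/sh
# Launcher sketch. Assumptions: mono is on PATH and OpenSim.exe is in the
# current directory. Function names and THREADS_PER_CPU are hypothetical.

# Print the command line that would be run, so it can be inspected first.
opensim_cmd() {
    printf 'mono --desktop %s\n' "${1:-OpenSim.exe}"
}

# Start the simulator. MONO_THREADS_PER_CPU is exported only when the caller
# has set THREADS_PER_CPU; otherwise mono's default is left untouched.
start_opensim() {
    if [ -n "${THREADS_PER_CPU:-}" ]; then
        export MONO_THREADS_PER_CPU="$THREADS_PER_CPU"
    fi
    exec mono --desktop "${1:-OpenSim.exe}"
}
```

Calling `start_opensim` with no arguments runs with --desktop only; `THREADS_PER_CPU=512 start_opensim` would additionally export the variable.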

(0033979)
paela argus   
2019-01-28 13:15   
Tampa, that depends on the physical server you use. Personally, on a 2x 10-core (20 HT) machine with 256 GiB of RAM it isn't a problem!!!
(0033980)
paela argus   
2019-01-28 13:17   
@tampa you speak about OSGrid but you know nothing, I see. OSGrid does not use a standard config, and that is not new. Sorry to read that, but you are not right in every case about OSGrid.
(0033981)
BillBlight   
2019-01-28 13:20   
@paela if you are talking about the --desktop switch that makes a big difference in opensimulator regardless of the hardware you run on, not using the --desktop switch makes the gc a lot more lazy and will not clean up memory as often, so for the design of opensimulator the --desktop switch makes opensimulator a lot happier ..
(0033982)
paela argus   
2019-01-28 13:21   
@tampa Your use of Linux shows that you have no skills on Linux, because you should know that Ubuntu is the only unstable Linux in server mode ...
Stop criticizing OSGrid when you do not know how OSGrid works. Besides, I'm not sure you have the technical background to understand how load balancing works with OpenSim; and yes, that is what I am correcting you on: if you used classic OpenSim, the grid would be disconnected immediately!!! Knowledge begins when you start to take an interest, which does not seem to be your case when you talk without knowing the OSGrid structure.
(0033983)
paela argus   
2019-01-28 13:23   
@bill that depends on the configuration of your server. Yes, if you use the default Linux config then sure, --desktop helps; in a real, optimised configuration it changes nothing ...
(0033984)
BillBlight   
2019-01-28 13:28   
@paela, and exactly what is a "real" configuration, been doing servers for 25+ years I'd really like to know, since you seem to know best ..
(0033985)
mewtwo0641   
2019-01-28 17:35   
I am testing these changes on a local system. That is, on LAN and not over OSGrid or WAN in general. Everything is local to the system being tested on, OS, ROBUST, MySQL, and even the viewer being used is all on the same system.

The reason I mentioned the timeouts showing up is because they show up consistently upon starting up OpenSim (Right after every region loads up fully) every single time; where going back to commit 695d80 which is only a couple days older in master the timeouts don't happen and everything works fine and I don't experience this mantis's issue.

Looking through the changes between commits f10766 and 695d80 I don't see any changes to the code that would require new or different configuration settings in the .ini files so I am very confused.

I am going to spend a little time and see if I can track down the exact commit it started to happen on and I'll update the mantis when/if I find it.
(0033986)
mewtwo0641   
2019-01-28 20:18   
I've tracked the issue down to commit 4c79a8 where it first appears and updated the mantis "Additional Information" section with the git bisect results. Upon reverting this commit things work fine again and no freeze ups.
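
The note does not list the commands used. As a sketch only, a bisect between the good and bad commits named in this report could look like the following. The helper name is hypothetical, and the automated variant assumes a test command that exits 0 on a good revision and non-zero on a bad one, which for this issue (a manual login test) may not be available.

```shell
#!/bin/sh
# Manual variant, marking each build by hand (hashes taken from this report):
#   git bisect start
#   git bisect bad f10766
#   git bisect good 695d80
#   ... build, start OpenSim, log in, then run "git bisect good" or
#   "git bisect bad" and repeat until git names the first bad commit ...
#   git bisect reset

# Automated variant (hypothetical helper): bisect_first_bad GOOD BAD CMD...
# Prints the hash of the first bad commit. CMD must exit 0 for a good revision
# and 1-124 or 126-127 for a bad one (125 tells git to skip the revision).
bisect_first_bad() {
    good=$1
    bad=$2
    shift 2
    git bisect start "$bad" "$good" >/dev/null 2>&1 || return 1
    if ! git bisect run "$@" >/dev/null 2>&1; then
        git bisect reset >/dev/null 2>&1
        return 1
    fi
    # After a finished run, refs/bisect/bad points at the first bad commit.
    git rev-parse refs/bisect/bad
    git bisect reset >/dev/null 2>&1
}
```

With a scripted check, usage would be along the lines of `bisect_first_bad 695d80 f10766 ./check.sh`, where check.sh is a placeholder for whatever test distinguishes a working build from a frozen one.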
(0033987)
mewtwo0641   
2019-01-28 21:06   
A bit more to the previous note

Prior to reverting commit 4c79a8; I am using ubODE when I see the issue. Switching to the original ODE is a bit better, things load up, but still can't move; and switching to BulletSim everything is fine, things load up and I can move.
(0033988)
Monamusa Kaliopov   
2019-01-28 21:15   
Today I came online and after 1 minute the region and console froze.

That was 10 minutes ago.

(0033989)
Monamusa Kaliopov   
2019-01-28 21:23   
06:04:12 - [DATASNAPSHOT]: Received collection request for all
06:04:12 - [DATASNAPSHOT]: Marking all scenes as stale.
06:04:12 - [DATASNAPSHOT]: Data requested for scene MOMVALIII
06:04:12 - [DATASNAPSHOT]: Attempting to generate snapshot.
06:04:12 - [DATASNAPSHOT]: Generated region node
06:04:12 - [DATASNAPSHOT]: Got grid snapshot data
06:04:12 - [DATASNAPSHOT]: Retrieved fragment response for provider type EstateSnapshot
06:04:12 - [DATASNAPSHOT]: Generated fragment response for provider type LandSnapshot
06:04:12 - [DATASNAPSHOT]: Retrieved fragment response for provider type ObjectSnapshot
06:04:12 - [DATASNAPSHOT]: Generated new snapshot for MOMVALIII
06:04:12 - [DATASNAPSHOT]: Data requested for scene MOMVALII
06:04:12 - [DATASNAPSHOT]: Attempting to generate snapshot.
06:04:12 - [DATASNAPSHOT]: Generated region node
06:04:12 - [DATASNAPSHOT]: Got grid snapshot data
06:04:12 - [DATASNAPSHOT]: Retrieved fragment response for provider type EstateSnapshot
06:04:12 - [DATASNAPSHOT]: Generated fragment response for provider type LandSnapshot
06:04:12 - [DATASNAPSHOT]: Retrieved fragment response for provider type ObjectSnapshot
06:04:12 - [DATASNAPSHOT]: Generated new snapshot for MOMVALII
06:04:42 - [AGENT HANDLER]: QueryAccess returned True (). Version=0, 0.6/0.6
06:04:42 - [SCENE]: Region MOMVALIII told of incoming root agent Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 (circuit code 264479955, IP 213.180.187.25, viewer Firestorm-Betax64 6.0.1.56538, teleportflags (ViaHome, ViaLogin), position <136.006, 110.2776, 19.54066>.
06:04:43 - [SCENE]: Region MOMVALIII authenticated and authorized incoming root agent Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 (circuit code 264479955)
06:04:43 - [CreateCaps]: new caps agent c1ebb868-dc28-43f8-a0cb-004ace551002, circuit 264479955, path 6386aacb-57cd-4eeb-a743-687da378e2a5, time 0
06:04:46 - [CAPS]: Received SEED caps request in MOMVALIII for agent c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:46 - [SCENE]: Incoming client Munala Kaliopov in region MOMVALIII via regular login. Client IP verification not performed.
06:04:47 - [LLUDPSERVER]: Handling UseCircuitCode request for circuit 264479955 to MOMVALIII from IP 213.180.187.25:53219
06:04:47 - [SCENE]: Adding new child scene presence Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 to scene MOMVALIII at pos <136.006, 110.2776, 19.54066>, tpflags: ViaHome, ViaLogin
06:04:49 - [LLUDPSERVER]: Client created, processing pending queue, 4 entries
06:04:49 - [LLClientView]: HandleCompleteAgentMovement
06:04:49 - [SCENE PRESENCE]: Completing movement of Munala Kaliopov into region MOMVALIII in position <136.006, 110.2776, 19.54066>
06:04:51 - [CompleteMovement]: Missing COF for c1ebb868-dc28-43f8-a0cb-004ace551002 is 222dc8b9-23b9-441c-a7c5-4351f00afe7d
06:04:51 - [AgentPrefs]: UpdateAgentPreferences for c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:51 - [XBakes]: read 5 textures for user c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:51 - [ValidateBakedCache]: got bakedModule 5 cached textures
06:04:52 - [LLUDPSERVER]: Packet exceeded buffer size! This could be an indication of packet assembly not obeying the MTU. Type=LayerData, DataLength=1412, BufferLength=1400
06:04:53 - [ATTACHMENTS MODULE]: Could not retrieve item 5bd454c4-37bb-4e98-80a8-4cafbd620830 for attaching to avatar Munala Kaliopov at point 40
06:04:53 - [CompleteMovement]: end: 3980ms
06:04:53 - [OFFLINE MESSAGING]: Retrieving stored messages for c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:53 - [ENTITY TRANSFER MODULE]: Informing Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 about neighbour MOMVALII 138.201.137.82:9052 at (9959,10045)
06:04:53 - [SCENE]: Region MOMVALII told of incoming child agent Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 (circuit code 264479955, IP 213.180.187.25, viewer Firestorm-Betax64 6.0.1.56538, teleportflags (Default), position <136.006, -145.7224, 20.66199>. From region MOMVALII (b4d9d561-1dd2-4633-91c2-3410365cc4d5) @ http://138.201.137.82:9000/
06:04:53 - [SCENE]: Region MOMVALII authenticated and authorized incoming child agent Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 (circuit code 264479955)
06:04:53 - [CreateCaps]: new caps agent c1ebb868-dc28-43f8-a0cb-004ace551002, circuit 264479955, path bf00cae6-ea82-47ce-895e-a76fcc6d1f26, time 0
06:04:54 - [ENTITY TRANSFER MODULE]: MOMVALIII is sending Munala Kaliopov EnableSimulator for neighbour region MOMVALII(loc=<2549504,2571520>,siz=<256,256>) and EstablishAgentCommunication with seed cap http://138.201.137.82:9000/CAPS/bf00cae6-ea82-47ce-895e-a76fcc6d1f260000/
06:04:54 - [ENTITY TRANSFER MODULE]: Completed inform Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 about neighbour 138.201.137.82:9052
06:04:54 - [LLUDPSERVER]: Handling UseCircuitCode request for circuit 264479955 to MOMVALII from IP 213.180.187.25:53219
06:04:54 - [CAPS]: Received SEED caps request in MOMVALII for agent c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:54 - [SCENE]: Adding new child scene presence Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002 to scene MOMVALII at pos <136.006, -145.7224, 20.66199>, tpflags: Default
06:04:54 - [LLUDPSERVER]: Client created, processing pending queue, 2 entries
06:04:56 - [AGENT INVENTORY]: Moving 1 items for user c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:56 - [AGENT INVENTORY]: Moving 1 items for user c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:57 - [AVFACTORY]: Received texture update for Munala Kaliopov c1ebb868-dc28-43f8-a0cb-004ace551002
06:04:57 - [UpdateBakedCache]: cache hits: 5 changed entries: 0 rebakes 0
06:04:58 - [AgentPrefs]: UpdateAgentPreferences for c1ebb868-dc28-43f8-a0cb-004ace551002
(0033992)
Monamusa Kaliopov   
2019-01-28 22:00   
For testing I started OpenSim with only ./opensim.sh. The simulator starts, but when I try to log in with an avatar the simulator freezes immediately.
(0033993)
Monamusa Kaliopov   
2019-01-28 22:19   
(edited on: 2019-01-28 22:21)
Because the freezing happens in all sorts of situations: could it be that the server hardware (memory, hard disk, network, CPU or some other part) is broken?

(0033995)
tampa   
2019-01-28 22:47   
@paela argus I have been around OpenSim for nearly 10 years now and have seen the rise and fall of OSGrid along with their changes in setup, and have even had quite extensive talks with a number of folks over the years who ran OSGrid. Trust me when I say that OSGrid has an unusual setup compared to the vast majority of grids out there, pushing the boundaries of the software and hardware underneath to breaking point. The frequent outages and slow response times are testament enough that under the load OpenSim simply strains. I can even see the beginnings of that with some of my long-term customers accumulating vast amounts of assets, inventory and friends. Performance metrics from testing on OSGrid are thus difficult to relate to other grids, and the variables and "daily" performance differences often make it difficult to determine which end is to blame. Much like Bill I have been working in the industry for many years now, dealing exclusively with OpenSim in a commercial manner for over half a decade, and while I may not code much, I have probably read through the codebase twice at this point in an effort to understand how it works and what may affect its performance. So when I recommend something I only do it after having done my own testing. I find your remarks and assumptions to be rather rude and out of place.

@Monamusa Kaliopov Have you tested just with your avatar or with another avatar as well?
(0033996)
Monamusa Kaliopov   
2019-01-28 22:49   
(edited on: 2019-01-28 22:52)
@tampa All avatars, and the freezes are random. The last freeze happened right in the login process, 20 seconds after the simulator started up.

Tested with 4 avatars: one time a freeze came after 1 hour, the next 1 day later, and now directly after startup.

(0033997)
Monamusa Kaliopov   
2019-01-28 22:59   
(edited on: 2019-01-28 23:36)
Strangely, all the OSGrid builds and master builds worked at first, then suddenly the console froze. After a restart I had about 3 to 4 days of peace; after that the console froze every day.

It is not down to screen, because the same happens with opensim.sh and when running mono --desktop directly.

(0033999)
Monamusa Kaliopov   
2019-01-29 00:01   
(edited on: 2019-01-29 00:02)
I often see that too lately

[LLUDPSERVER]: Packet exceeded buffer size! This could be an indication of packet assembly not obeying the MTU. Type=LayerData, DataLength=1412, BufferLength=1400

and then it froze again

(0034002)
Monamusa Kaliopov   
2019-01-29 01:04   
I cannot shake the idea that it is either UDP/TCP or hardware.
(0034014)
UbitUmarov   
2019-01-29 06:13   
Another thing I can't repro, on Linux or Windows; both sims are running 2 regions now.
Possibly a different set of features/configuration :(

I should point out that ODE, both old and new, never liked multiple regions per instance.

In fact those still have several issues, some of which I did try to fix.
For example, do not try to shut down just one of those regions...

Over the years most devs only worked with a single region per instance, so issues piled up on top of already weak original support.

The ODE issue is that the heartbeats of the different regions will wait while one of them is in ODE's inner work. With low loads they will eventually drift in time, so they reach those sections at different times, which is why everything seemed to work so far.

My last changes to ubODE locking were to replace one lock on a larger section of code with several locks on smaller critical sections, so the regions' work can interleave better.

And that is working ok for me :(
(0034016)
mewtwo0641   
2019-01-29 18:10   
It's odd because I haven't had any (apparent) issues running ODE/ubODE multi-region per instance prior to the commit mentioned in the mantis; at least nothing major like complete freeze up while OpenSim is loading up or the world just not loading at all / frozen avatars.

I did check the latest master just now and it seems to be a lot better. OS loads up without hanging and things load for me in world and I can move around now :) I will keep this open for a bit while I test more extensively to be sure.

Thanks Ubit!
(0034305)
BillBlight   
2019-02-05 12:27   
Old Issues, closed, can be reopened if they still exist