Mantis Bug Tracker

View Issue Details
ID: 0007593
Project: opensim
Category: [REGION] OpenSim Core
View Status: public
Date Submitted: 2015-05-29 12:17
Last Update: 2017-07-01 16:52
Reporter: slow putzo
Assigned To: aiaustin
Priority: high
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Platform: Linux 64 bit AMD 8 core
OS: Fedora
OS Version: 19,20,21
Product Version: master (dev code)
Target Version:
Fixed in Version: master (dev code)
Summary: 0007593: High memory usage when avatar arrives and is never released
Description: I tested the following 10 versions of opensim built for the OSgrid.

The times are the uptimes reported by the monitor program "monit".
The test process was:
Place a fresh copy of opensim as distributed and copy the needed .ini files into the simulator.
Restart the region. (This causes "monit" to detect a new PID and reset its "uptime" to zero.)
7 minute tick: log in with "coolvl" using a standard unscripted avatar, record metrics.
10 minute tick: log off, record metrics.
15 minute tick: record metrics.
The 7 minute tick metrics represent the IDLE usage of a freshly started and unvisited region.
(The reason for waiting 7 minutes was to give ample time to compile the 626 scripts as an initial load.)
The 10 minute tick metrics represent the usage after an avatar has logged in and been on the region for approximately 3 minutes.
The 15 minute tick metrics represent the usage 5 minutes after the avatar has left the region, with no other usage of the region in between.

Region used: Sanctuary   Prims = 7985   Scripts = 626

                                        idle                 1 avatar             avatar left
Build                                   Time   mem kb CPU%   Time   mem kb CPU%   Time   mem kb CPU%
osgrid-opensim-01162015.v0.8.1.97ac80d  7m    574952k  3.0
osgrid-opensim-01172015.v0.8.1.1f04e1b  7m    585076k  2.8   27m   513324k  2.9   30m   513364k  2.6
osgrid-opensim-02212015.v0.8.1.7b9ad11  7m    565480k  2.5   10m   560624k  2.9   15m   536468k  2.5
osgrid-opensim-03142015.v0.8.1.8b13e4e  7m    568560k  2.6   10m   576828k  2.9   15m   567436k  2.5
osgrid-opensim-03172015.v0.8.2.83e58eb  7m    580736k  2.4   10m   583992k  2.8   15m   572664k  2.5
osgrid-opensim-04052015.v0.8.2.8d66284  7m    564208k  2.5   10m   574464k  2.8   15m   574704k  2.5
osgrid-opensim-04122015.v0.8.2.d96d31b  7m    574392k  2.6   10m   585156k  3.0   15m   546868k  2.6
osgrid-opensim-05072015.v0.8.2.c74cef0  7m    583924k  2.6   10m  1139092k  2.7   15m  1141972k  2.5
osgrid-opensim-05092015.v0.8.2.adf0f49  7m    578632k  2.5   10m  1207972k  2.9   15m  1208696k  2.5
osgrid-opensim-05232015.v0.8.2.abb3bb6  7m    585384k  2.6   10m   584852k  5.5   15m  1165304k  2.5
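
To make the jump in the table easier to see, here is a minimal sketch in Python that expresses the memory left after the avatar has gone as a multiple of the idle reading. The figures are copied from a few rows of the table; the name `retained_ratio` and the choice of rows are only for illustration and are not part of any OpenSim or monit tooling.

```python
# Idle and after-avatar-left memory readings (kB), copied from the table.
# Keys are the build dates; the selection of rows is illustrative only.
readings = {
    "03142015": (568560, 567436),
    "04122015": (574392, 546868),
    "05072015": (583924, 1141972),
    "05092015": (578632, 1208696),
    "05232015": (585384, 1165304),
}


def retained_ratio(idle_kb, after_leave_kb):
    """Memory after the avatar left, as a multiple of the idle reading."""
    return after_leave_kb / idle_kb


ratios = {build: round(retained_ratio(*pair), 2) for build, pair in readings.items()}

for build, ratio in ratios.items():
    print(f"{build}: {ratio:.2f}x idle")
```

With these figures the builds up to April stay at or below the idle level, while the three May builds all end up at roughly twice the idle level.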

Additional comments:

This problem first appeared for me in the May 7, 2015 OSgrid release build. It is still present in the latest build, dated May 23, 2015.
The initial IDLE memory appears to be nearly the same for all versions tested.
Prior to the May 7th release, memory usage fluctuated very little when avatars came and went.
After the May 7th release, memory usage goes up dramatically for EACH avatar that arrives, and does not go back down.
There was a change in how memory is used between the May 9 and May 23 builds. The earlier builds used the memory much faster, and CPU usage would rise quickly, last for a short time, and then drop back down.
In the May 23 release, CPU usage goes up much higher and holds at that level much longer, but does drop back to the "idle" level. Memory increases more slowly, but still rises to over double the idle level and never drops back to it.
The value of AppDomainLoading has been set to "false" in all the OSgrid builds distributed this year that I have; that is what all of my tests were run with.
However, from my reading of that parameter, having it set to FALSE would cause this exact problem.
That does not explain why it never caused this problem prior to the May 7th build.

I have "monit" configured to allow both memory and CPU usage to go over a set limit but if it remains above that limit for longer than what I believe to be a reasonable time, in my case 3 minutes, it will restart the region.
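
The restart rule described above (tolerate a spike, restart only when the limit is exceeded for a sustained period) can be sketched in Python as follows. This is not monit's implementation, only an illustration of the logic; the function name `should_restart`, the one-sample-per-minute assumption, and the `LIMIT_KB` value are assumptions made for the example.

```python
def should_restart(samples_kb, limit_kb, cycles):
    """Return True if the most recent `cycles` samples all exceed `limit_kb`.

    A single spike does not trigger a restart; only a sustained excess does.
    """
    if len(samples_kb) < cycles:
        return False
    return all(sample > limit_kb for sample in samples_kb[-cycles:])


# Hypothetical limit of roughly 1.3GB, expressed in kB for this example.
LIMIT_KB = 1_300_000

# Assuming one sample per minute, three cycles approximate a 3 minute window.
history = [600_000, 1_400_000, 1_400_000, 1_400_000]
print(should_restart(history, LIMIT_KB, cycles=3))
```

Checking only the trailing window means a brief excursion above the limit, such as during a login, is forgiven as long as memory falls back before the window fills.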

From 2011 until May 7th 2015 it was rare any of my regions were restarted by "monit". After the May 9th release "monit" would restart my regions as soon as any avatar visited it. I had set the memory limit to be 1.3GB prior to May 7th. Now setting it at 5GB still results in some regions restarting a few times every day.

Until the May 9th release none of my three servers had ever used the swap file. Now the usage of the swap file is in the mid 20%. Lag has become a real problem on my regions, and system CPU usage is much higher than it ever was prior to the May 7th release.

To suggest nothing is wrong with whatever was done to opensim between the March release I tested and the May 7th release is very difficult to accept.

If the only problem was higher CPU usage I would accept it, but since performance and reliability have also been diminished this is not something that can be ignored.

Memory that gets used and not released is not a good thing. This is happening across three releases of Fedora Linux (Fedora 19, 20, and 21), on four different servers.



Steps To Reproduce: Start a region, have an avatar enter the region.

This was discussed on the opensim-users list and several verifications of this problem were reported across platforms. Others experienced no issues.

Most who are experiencing this are using OSgrid, and most of those who do not see a problem are running their own private grids. This information may be misleading as I can only test using OSgrid and the four servers I have. All four started this behavior when I updated opensim to the May 9th build distributed by OSgrid. It has continued to be present in the releases since that time.
Additional Information: Cool VL Viewer 1.26.12 (43) May 16 2015 11:09:28 (Cool VL Viewer)
RestrainedLove viewer v2.09.09.21
Release Notes

You are at 2573868.8, 2556497.4, 22.2 in Sanctuary located at Wireless_Broadband_Router (100.13.12.235:9137)
OpenSim 0.8.2.0 Dev OSgrid 0.8.2.0 (Dev) abb3bb6: 2015-05-23 (Unix/Mono)

CPU: AMD Phenom(tm) II X6 1090T Processor (3214.17 MHz)
Memory: 16382 MB
OS version: Microsoft Windows 8.1 64-bit (Build 9600)
Memory manager: OS native
Graphics card vendor: NVIDIA Corporation
Graphics card: GeForce GTX 460/PCIe/SSE2
Windows graphics driver version: 9.18.0013.4752
OpenGL version: 4.5.0 NVIDIA 347.52

J2C decoder: KDU
Audio driver: FMOD Ex 4.44.53
Networking backend: libcurl/7.38.0 OpenSSL/1.0.1h zlib/1.2.8
Embedded browser: Qt Webkit v4.7.1 (version number hard-coded)
Packets lost: 0/832 (0.0%)

Built with MSVC version 1600

Compile flags used for this build:
/O2 /Oi /MD /MP /DNDEBUG /D_SECURE_SCL=0 /D_HAS_ITERATOR_DEBUGGING=0 /DWIN32 /D_WINDOWS /W3 /GR /EHsc /Oy- /GS /arch:SSE2 /fp:fast /TP /W2 /Zc:forScope /Zc:wchar_t- /c /nologo /DLL_WINDOWS=1 /DUNICODE /D_UNICODE /DWINVER=0x0501 /D_WIN32_WINNT=0x0501 /DLL_PRIVATE_MEMORY_POOLS=1 /DLL_VB_MEM_POOL=1 /DLL_VOLUME_MEM_POOL=1 /DCARES_STATICLIB /DLIB_NDOF=1
Tags: No tags attached.
Git Revision or version number: v0.8.2
Run Mode: Grid (1 Region per Sim)
Physics Engine: BulletSim
Environment: Mono / Linux64
Mono Version: None
Viewer: cool vl
Attached Files:
UtilizationHedonism-20150621.png (91,066 bytes) 2015-07-07 07:06
Utilization-DarkShadows-20150619.png (67,559 bytes) 2015-07-07 07:06
Utilization-Nook001-20150627.png (59,409 bytes) 2015-07-07 07:06

Relationships
related to 0007564 (closed, Diva): Inventory download hangs temporarily or permanently
related to 0007567 (closed, Diva): Objects in scene not loading until after avatar and attachments are fully rezzed - change in behaviour at r/25970

Notes
(0028491)
aiaustin (developer)
2015-05-29 13:17
edited on: 2015-05-29 13:24

I can confirm the big step up in memory use when an avatar arrives on a region, and the memory used does not seem to be freed when the avatar(s) log off. I am not sure, though, what the previous behaviour was.

As a quick test I used one addon test region on OSGrid and observed these memory usage figures on a Windows 8.1 setup using SQLite, as reported by the Windows Task Manager for the OpenSim.exe process.

Idling single region 590MB
1st avatar joins 998MB
2nd avatar joins 1724MB
2nd avatar leaves 1448MB
     after 1 minute idle 1449MB
     after 3 minutes idle 1449MB
     after 5 minutes 1005MB
1st avatar leaves 1048MB
     after 1 minute 1049MB rising slowly
     after 3 minutes 1052MB rising slowly
     after 5 minutes 1053MB
1st avatar logs back in 1302MB
2nd avatar logs back in 1692MB
2nd avatar leaves 1770MB rising slowly
     after 1 minute 1774MB rising slowly
...

(0028492)
aiaustin (developer)
2015-05-29 13:28
edited on: 2015-05-29 14:13

Of course, the OSGrid release of 7th May 2015, when you first noticed this problem, was the one in which a big change was made to how the initial inventory fetch is handled, and there are reports of other changes of behaviour on OSGrid related to that same release which are not seen on other grids. So I have linked that as a possible related issue.

(0028495)
nebadon (administrator)
2015-05-29 16:48

I just tried to replicate some of this on my own regions, and I have noticed some of this occurring on the OSgrid plazas. I was unable to observe this behavior on my own regions; even with 12 avatars I could not get memory to climb significantly and remain high. However, one thing I did notice is that the OSgrid plaza regions where this seemed to be occurring were being launched with mono-boehm rather than the sgen garbage collector. I have since switched them all back to sgen, which is what my personal regions also run, to see if this behavior continues. Can you let me know if you are running sgen or boehm with your simulators? If you run "mono -V", it will display the GC at the bottom of the output.
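
For anyone checking several servers, the "mono -V" check can be scripted. The following Python sketch assumes the version output contains a line beginning with "GC:" that names either sgen or Boehm; the exact wording of that line varies between Mono versions, so the sample strings in the usage lines are assumptions, as are the function names.

```python
import subprocess


def gc_from_version_text(text):
    """Return 'sgen', 'boehm', or None from text shaped like `mono -V` output.

    Assumes the collector is named on a line that starts with "GC:".
    """
    for line in text.splitlines():
        stripped = line.strip().lower()
        if stripped.startswith("gc:"):
            if "sgen" in stripped:
                return "sgen"
            if "boehm" in stripped:
                return "boehm"
    return None


def detect_mono_gc(mono_binary="mono"):
    """Run `mono -V` and report which collector it names, if any."""
    result = subprocess.run(
        [mono_binary, "-V"], capture_output=True, text=True, check=False
    )
    return gc_from_version_text(result.stdout)


# Assumed sample lines, for illustration only.
print(gc_from_version_text("Mono JIT compiler\n\tGC:            sgen\n"))
print(gc_from_version_text("Mono JIT compiler\n\tGC:            Included Boehm\n"))
```

Keeping the parsing separate from the subprocess call makes it possible to test the logic on a machine without Mono installed.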
(0028496)
slow putzo (reporter)
2015-05-29 17:16

I checked what mono is using and it says Boehm on all four servers. I guess that means Fedora uses that by default because I did not do any configurations for mono.

If you can tell me how to change that, I can certainly do it fairly quickly.
(0028500)
aiaustin (developer)
2015-05-30 01:22
edited on: 2015-05-30 01:24

Neb asks about what GC we are using... note I am seeing 500MB added per avatar or so on Windows 8.1 server under Windows/.NET4. I am not sure what GC will be in use on Windows.

But it's worth noting that this could not possibly have happened before, as the amount of memory used per avatar would have given me errors on one of our 32 bit servers with only 4GB of memory. I will look at the smallest servers we have and at more typical avatar loads soon to see what is happening there.

(0028502)
slow putzo (reporter)
2015-05-30 06:24

I have good news based on what Neb recommended I give a try.

I updated one server to mono 4.0.1 and ran a test. The results looked good, but that server was not the server I did my previous testing on.
I have since updated three of my four servers to mono 4.0.1 and they by default use sgen GC.

Here are the test results doing nearly the same test as before.

Time  CPU %  Memory  Notes
5m    0.4    273936  IDLE
6m    5.3    293592  1 avie logging in
7m    8.4    597668  1 avie on region
8m    0.5    605284  1 avie on region
9m    0.5    606032  1 avie on region
10m   0.4    606168  avie logging off
15m   0.4    607900  region empty and idling after being used once

While memory never went up by the huge amounts seen before, it does go up, and appears to remain at that level. I will change my monit parameters back to monitor for memory usage exceeding 1.3GB, as I had before, and see if that limit is exceeded now using mono 4.0.1.

The rise is not troubling, but the appearance that memory continues to rise without dropping back is a little disturbing.

I have been doing a nightly reboot on all my regions for over a year now to help reduce the lag I have always noticed. That reboot also clears any memory "creep" as well.
(0028503)
nebadon (administrator)
2015-05-30 06:42

OK, that is good data regardless. My next question is: do these avatars have scripted attachments? In my testing I had 15 avatars logged into my sandbox; memory hit about 400-450MB, and as soon as I logged them out memory went back down to around 250MB. Some of the avatars had scripted attachments, but they were very light and did not do much. If you are wearing some heavily scripted objects, could you please attach them to this mantis as an IAR file, if you are able to share them? Also, could you please remove all attachments and rerun the tests to see if you get the same effect?
(0028504)
slow putzo (reporter)
2015-05-30 06:51

All my tests were done using an unscripted avatar. The only attachments were shoes and hair. Both are unscripted.
I am going to wait about an hour or so and then update my bandit1 server to mono 4.0.1.
That is where my home region lives, and I am a bit reluctant to move too fast at updating. It is also the oldest OS as it runs Fedora 19, a web site and FTP site.
(0028505)
Mata Hari (reporter)
2015-05-30 06:58
edited on: 2015-07-07 07:03

I am unable to duplicate these results on my own sandbox in spite of multiple log-ins with my all-mesh avatar & clothing and scripted attachments. Memory use climbs by only ~50MB and drops again after log out. I am seeing nowhere even remotely close to 300MB or 400MB increases. In my case this is a sandbox region attached to Refuge grid running opensim-f053313 (r/26029 2015-05-26) on Win7-64bit under .NET

EDIT: I *am* now able to replicate this with multiple logins over a period of time; although the per-agent-login increase in unreleased memory is only in the order of 10-30MB from what I'm seeing. I am also seeing a corresponding gradual increase in the number of threads being used even when the sim has been sitting dormant/vacant for many hours (through at least 2 fcache cycles).

(0028506)
slow putzo (reporter)
2015-05-30 07:14

One possible explanation for the difference in memory usage reports could be how each of us are measuring memory usage. I am no Linux guru or even a geek, the only tool I am using is monit to measure how much memory is being used. I have it configured to do a snapshot every minute. If there is a better way to measure memory with some standard tool or even something within opensim itself I can retest using something everyone can use as a consistent measurement.
I run each simulator in "terminal mode" using "screen". Each server runs 5 or 6 instances. Each instance only runs 1 region with the exception I run about 35 regions of ocean in a single instance on one of the servers. It is the server that runs 5 instances and nothing else. The oceans are not used, and are for looks only. Occasionally someone spills over on them but any prims placed on them are returned within one minute. All my regions have voice chat activated.
(0028507)
nebadon (administrator)
2015-05-30 07:17

A better way would be to use the "show stats" command on the region console; you may not be looking at the correct memory usage. You might also try the "top" command on the Linux console and look at the "RES" column. You may be looking at VIRT memory, which is not a good indicator of actual memory usage.

http://mugurel.sumanariu.ro/linux/the-difference-among-virt-res-and-shr-in-top-output/
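
For a reading that does not depend on which tool is used, the same two numbers can be taken straight from the kernel. On Linux, /proc/<pid>/status reports VmRSS (resident memory, what "top" shows as RES) and VmSize (virtual size, what "top" shows as VIRT), both in kB. The Python sketch below parses those two fields; the function names and the sample text are illustrative assumptions, not output from any of the servers discussed here.

```python
def parse_proc_status(text):
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status text."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # The value is followed by its unit, e.g. "606168 kB".
            values[key] = int(rest.split()[0])
    return values


def read_memory_kb(pid):
    """Read the resident and virtual sizes of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_proc_status(handle.read())


# Made-up sample in the shape of a /proc status file, for illustration only.
sample = "Name:\tmono\nVmSize:\t 2345678 kB\nVmRSS:\t  606168 kB\n"
print(parse_proc_status(sample))
```

Comparing VmRSS between samples is the closer match to "how much RAM is this simulator really holding"; VmSize also counts address space that has been reserved but never touched.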
(0028508)
slow putzo (reporter)
2015-05-30 07:29
edited on: 2015-05-30 07:37

I just used "show stats" on all of the regions on one server and compared that memory reading to what was showing in the "monit" tool. Monit shows each region about 10MB higher than what "show stats" shows.

Monit also displays swap memory which would be virtual memory and it is at zero. In the past it has always remained at zero. It was when swap memory started to be used and monit was restarting regions for no apparent reason saying the memory limit had been reached I started to look into what was going on.
edit: After comparing the "top" memory usage with what is reported in "monit" they match identically. Both are about 10MB higher than what is shown in the console "show stats" memory usage.

Swap memory was being used because the total allocated memory to the instances exceeded real ram in my servers. 8GB of RAM had always been enough before.

(0028509)
aiaustin (developer)
2015-05-30 07:42
edited on: 2015-05-30 07:45

Just a pointer to another side effect of the changes to inventory handling around the 7th May 2015 time scale in Mantis 7567. Again this is only observed on an OSGrid addon region for some reason, not on my own Robust grids on the same versions. There may be a link to the memory usage issue??

http://opensimulator.org/mantis/view.php?id=7567

In this, a change in OSGrid addon region logins means the login progress bar stops at 90% and never goes all the way to 100% as it did before, and the avatar and HUDs (scripted attachments) now load BEFORE any scene objects. In a test with Mata Hari, we think a scripted attachment that called out to an external host via HTTP may even have made its call and timed out before the objects appeared.

It may be worth seeing if you can go back to r/25969 and see if the memory usage issue is NOT present in that, but is present in r/25970.

(0028688)
slow putzo (reporter)
2015-06-13 15:23

I have been tracking memory usage on regions with avatars arriving and leaving, and have come to the conclusion that this is not something that happens every single time. I have witnessed times when three avatars entered the region, stayed for more than 30 minutes and left, with memory usage rising to about double and then fairly quickly receding to about 1.5 times the original value. At other times these same avatars, dressed exactly the same way, have arrived and spent only 10 minutes on the region, and memory usage would jump to 3-4 times the original value and take several hours of no activity on the region before it dropped back down to about 1.5 times the original.

I did do a test with older code and memory usage was very stable while avatars arrived and left. There was a small increase of memory usage, but nothing like it is now in the code.

The good thing is that it does now drop back to a reasonable level after a period of non usage.

A server with five instances running, with one region in each instance, can be expected to easily use up 8GB of RAM and go into paging, where before the same hardware would support these regions and never use up all 8GB of RAM.
(0028872)
Ange Menges (reporter)
2015-07-07 00:07
edited on: 2015-07-08 01:13

I just found this thread and am happy to learn I am not alone. I have been fighting with this unreleased memory and have found no solution. I use monit like slow putzo.
Intel Core i7-2600 Quad Core 16 GB DDR3 RAM Ubuntu 12.10 Mono 2.10.8.1
OSG opensim-05232015.v0.8.2.abb3bb6
I don't update opensim very often, and before this I was working with opensim-06062014.v0.8.0.74cda2a without this problem.
Fresh empty region: 76.696Kb. After 2 visits with my avatar: 773.900kb, with no decrease.
Currently I restart my regions every day, but this is not a real solution.
I don't use Warp3DImageModule, which can increase memory use for me. I also tried to avoid XBakes, with no change.

(0028873)
Mata Hari (reporter)
2015-07-07 06:43

I have been monitoring this more closely on my own sim for the last little while, aided by there having been no new commits for the last couple of weeks so no need to restart the simulator. Running under Win7-64 .NET4.0 when I first start my own private home region I have the following stats:

MEMORY STATISTICS
Heap allocated to OpenSim : 129 MB
Last heap allocation rate : 0.719 MB/s
Average heap allocation rate: 3.732 MB/s
Process memory : 431 MB
Total threads active: 136

After several days where there is very low traffic (just myself and one other person entering and leaving the region) it has increased to:

MEMORY STATISTICS
Heap allocated to OpenSim : 240 MB
Last heap allocation rate : 7.428 MB/s
Average heap allocation rate: 6.252 MB/s
Process memory : 672 MB
Total threads active: 146


Both sets of data, above, were taken when the sim had no agents in it and had been idle for 6+ hours. This represents a roughly 240MB increase in process memory (431MB to 672MB) and 10 additional threads, where the sim is under identical conditions other than having had "visitors" entering and leaving during the course of several days.
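
For comparing two snapshots like these without doing the subtraction by hand, a small Python sketch follows. It assumes the "name : value unit" layout shown in the two MEMORY STATISTICS blocks above; the function names `parse_stats` and `stat_deltas` are made up for this example, and only three of the five lines are copied in to keep it short.

```python
import re

# Three lines from each snapshot, copied from the statistics above.
BEFORE = """\
Heap allocated to OpenSim : 129 MB
Process memory : 431 MB
Total threads active: 136
"""
AFTER = """\
Heap allocated to OpenSim : 240 MB
Process memory : 672 MB
Total threads active: 146
"""


def parse_stats(text):
    """Map each statistic name to its leading numeric value."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*:\s*([\d.]+)", line)
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


def stat_deltas(before, after):
    """Difference (after minus before) for every statistic present in both."""
    b, a = parse_stats(before), parse_stats(after)
    return {name: a[name] - b[name] for name in b if name in a}


deltas = stat_deltas(BEFORE, AFTER)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.0f}")
```

On these figures the heap grew by 111 MB while process memory grew by 241 MB, with 10 more threads, so less than half of the process growth shows up as heap allocated to OpenSim.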
(0028875)
Seth Nygard (reporter)
2015-07-07 07:07

I added a few charts to show the behavior I see in the regions.

UtilizationHedonism-20150621.png is from a sim running on Version: 81ef7b586e 6-Jun-2015 r/26056 and it shows memory increase after avis enter the sim but never returns when they leave.

Utilization-DarkShadows-20150619.png is from a sim running on Version: 81ef7b586e 6-Jun-2015 r/26056 and it shows a major memory increase that happened when an HG visitor from OSGrid was looking through her inventory for something. She experienced a delay in her viewer at that same time. This major increase is particularly troubling.

Utilization-Nook001-20150627.png is from a sim running on 0.8.1 stable and it shows typically a small increase in memory which I think can be expected.


An additional observation is that I also see the number of threads used increase along with memory, which also do not drop back to previous levels. A hard sim restart has been the only way to ensure that memory and thread use is "cleaned up".
(0032108)
slow putzo (reporter)
2017-07-01 13:56

I have not experienced this issue for well over a year. I recommend closing if the devs agree.
(0032110)
aiaustin (developer)
2017-07-01 14:08

It is believed that this issue no longer is a problem in the latest versions of OpenSim. Please reopen the Mantis issue if it reoccurs.
(0032113)
UbitUmarov (administrator)
2017-07-01 16:52

This will most likely still happen.
How and when depends on the GC.

As someone pointed out, scripts can be a cause.
Our option to use application domains is basically a fail: any simple nanosecond script operation can take milliseconds with it active.
So without it, scripts can't be deleted from memory.
Possibly they also can't be moved around, so if allocated at the top of memory, they will stay there (in virtual address space).
How that is reflected in the values shown by memory measuring programs depends on the program, since there are different metrics.
Also, the effective impact on the machine varies. In some cases those dead areas eventually get swapped out to disk and forgotten, so they are more of a problem on the more limited 1.9GB virtual address space of 32-bit and less on the 8TB of 64-bit (but you do pay something for it).

Of course, true code memory leaks are surely present; 0.8.2 had tons, and I'm sure master still has hidden ones.
But so far, and ignoring scripts, the main cause of excessive memory use is the GC;
see mantis 8177.

Issue History
Date Modified Username Field Change
2015-05-29 12:17 slow putzo New Issue
2015-05-29 13:17 aiaustin Note Added: 0028491
2015-05-29 13:18 aiaustin Note Edited: 0028491
2015-05-29 13:22 aiaustin Note Edited: 0028491
2015-05-29 13:24 aiaustin Note Edited: 0028491
2015-05-29 13:28 aiaustin Note Added: 0028492
2015-05-29 13:29 aiaustin Relationship added related to 0007564
2015-05-29 13:32 aiaustin Note Edited: 0028492
2015-05-29 14:11 aiaustin Note Edited: 0028492
2015-05-29 14:11 aiaustin Note Edited: 0028492
2015-05-29 14:12 aiaustin Note Edited: 0028492
2015-05-29 14:13 aiaustin Note Edited: 0028492
2015-05-29 16:48 nebadon Note Added: 0028495
2015-05-29 17:16 slow putzo Note Added: 0028496
2015-05-30 01:22 aiaustin Note Added: 0028500
2015-05-30 01:23 aiaustin Note Edited: 0028500
2015-05-30 01:24 aiaustin Note Edited: 0028500
2015-05-30 06:24 slow putzo Note Added: 0028502
2015-05-30 06:42 nebadon Note Added: 0028503
2015-05-30 06:51 slow putzo Note Added: 0028504
2015-05-30 06:58 Mata Hari Note Added: 0028505
2015-05-30 07:14 slow putzo Note Added: 0028506
2015-05-30 07:17 nebadon Note Added: 0028507
2015-05-30 07:29 slow putzo Note Added: 0028508
2015-05-30 07:37 slow putzo Note Edited: 0028508
2015-05-30 07:42 aiaustin Note Added: 0028509
2015-05-30 07:43 aiaustin Relationship added related to 0007567
2015-05-30 07:44 aiaustin Note Edited: 0028509
2015-05-30 07:45 aiaustin Note Edited: 0028509
2015-06-13 15:23 slow putzo Note Added: 0028688
2015-07-07 00:07 Ange Menges Note Added: 0028872
2015-07-07 06:43 Mata Hari Note Added: 0028873
2015-07-07 07:03 Mata Hari Note Edited: 0028505
2015-07-07 07:06 Seth Nygard File Added: UtilizationHedonism-20150621.png
2015-07-07 07:06 Seth Nygard File Added: Utilization-DarkShadows-20150619.png
2015-07-07 07:06 Seth Nygard File Added: Utilization-Nook001-20150627.png
2015-07-07 07:07 Seth Nygard Note Added: 0028875
2015-07-08 01:13 Ange Menges Note Edited: 0028872
2017-07-01 13:56 slow putzo Note Added: 0032108
2017-07-01 14:08 aiaustin Mono Version 2.10 => None
2017-07-01 14:08 aiaustin Note Added: 0032110
2017-07-01 14:08 aiaustin Status new => resolved
2017-07-01 14:08 aiaustin Fixed in Version => master (dev code)
2017-07-01 14:08 aiaustin Resolution open => fixed
2017-07-01 14:08 aiaustin Assigned To => aiaustin
2017-07-01 14:08 aiaustin Status resolved => closed
2017-07-01 16:52 UbitUmarov Note Added: 0032113

