Thanks Mariusz!<br><br>That was a very insightful and informative response to my post, and I eagerly await your forthcoming detailed post on the subject.<br><br>To be honest, I had not set out to address the garbage collection problem with large pages. Someone mentioned to me in discussion that a user had obtained increases in region server performance on Linux by 'compiling mono with large page support'.<br>

<br>Based on this shred of information, I set out to determine what we could do to address certain stability problems on Linux. I was unaware that similar issues had been seen on *nix flavors other than Linux. It is unfortunate, but we have very little information on the issue.<br>

<br>Perhaps this is a sign of growing stability in other areas of the application space.<br><br>While researching large pages on the web this morning, I saw where we might make some performance gains on Linux by using them; however, parts of the puzzle were still missing (some of what I read shows that certain memory allocation methods take advantage of large pages and others do not). This would seem to indicate that the application, which I assume means the Mono VM, would need to do something special to take advantage of large page support. At that point the research became unproductive, as I was unable to determine precisely how this might be accomplished.<br>

<br>In light of what you've said in your post, I suspect that a very busy region with a lot of asset overhead would benefit from both large page usage and the optimizations you mention; object reuse springs directly to mind.<br>

<br>I am growing confident that we are closing in on this issue, and will bring solutions to light in the near term.<br><br>Cheers!<br>James<br><br><div class="gmail_quote">On Sun, Jun 15, 2008 at 6:52 PM, Mariusz Nowostawski <<a href="mailto:mariusz@nowostawski.org">mariusz@nowostawski.org</a>> wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi all,<br>

<br>

I am happy that issues related to performance, efficiency and stability<br>

of OpenSim on various platforms are getting more attention recently and<br>

that many folks are looking into it.<br>

<br>

Together with 3Di (Japan) we are looking into those issues too, and will<br>

be more than happy to share our results and solutions in due time. We<br>

are working on putting a summary of our findings online and on building<br>

monitoring of certain aspects of OpenSim performance and<br>

memory management into our nightly build system, so it will be easier<br>

for the community to see what is working where, on which platforms<br>

and with what results. I'll keep you updated on that.<br>

<br>

As for large page support: this is a very architecture-specific issue.<br>

Different Intel, AMD and SPARC MMUs support different page sizes, and<br>

various OSes can in turn be compiled with support for a specific page<br>

size, so there is no single solution for every platform/OS<br>

combination. The common denominator seems to be 4 KB and 8 KB pages,<br>

and most architecture/OS configurations use those page sizes by<br>

default. The Solaris kernel can support multiple page sizes, although it is<br>

a bit more tricky than it sounds, and on SPARC, for example, not<br>

everything can be easily negotiated with the kernel. It all depends on<br>

which architecture a given OS runs on.<br>
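
To see what a particular Linux box is doing, something like the following can help. It is only a minimal sketch in Python, under these assumptions: the machine exposes /proc/meminfo with the usual HugePages_* and Hugepagesize fields (Linux-specific, other OSes will differ), and the function names are made up for illustration. The last helper just counts how many pages are needed to cover a working set at a given page size, which is the quantity large pages are meant to shrink.<br>

```python
import os
import re


def base_page_size():
    """Return the default page size in bytes as reported by the OS."""
    return os.sysconf("SC_PAGE_SIZE")


def parse_hugepage_info(meminfo_text):
    """Extract HugePages_* counters and Hugepagesize (kB) from /proc/meminfo text."""
    info = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(HugePages_\w+|Hugepagesize):\s+(\d+)", line)
        if match:
            info[match.group(1)] = int(match.group(2))
    return info


def pages_needed(working_set_bytes, page_size_bytes):
    """Number of pages required to cover a working set (ceiling division)."""
    return -(-working_set_bytes // page_size_bytes)


if __name__ == "__main__":
    print("base page size:", base_page_size(), "bytes")
    try:
        with open("/proc/meminfo") as f:
            print("huge page info:", parse_hugepage_info(f.read()))
    except OSError:
        # Assumption: no /proc/meminfo means this is not a Linux system.
        print("no /proc/meminfo here")
    one_gib = 2 ** 30
    for size in (4 * 1024, 2 * 1024 * 1024, 4 * 1024 * 1024):
        print(size, "byte pages:", pages_needed(one_gib, size), "pages per GiB")
```

An empty result from parse_hugepage_info suggests the running kernel was built without huge page support; a HugePages_Total of 0 suggests support is present but no large pages have been reserved.<br>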

<br>

Poor memory management and a large memory footprint are not going to be<br>

solved by recompiling something with large page support. On the<br>

contrary, the footprint and memory usage would most likely get much<br>

worse. Large page support is usually for apps that do their own<br>

memory management, and it usually increases the overall memory<br>

footprint while improving performance. To really benefit from large<br>

pages the software must take exclusive care of memory management itself.<br>

And in the case of OpenSim that is a bit tricky. Let me explain. Normally,<br>

memory management (say, when programming in C) is difficult: one needs<br>

to make lots of decisions and trade-offs between 3 general issues: a)<br>

memory footprint b) performance c) maintainability.  To really make<br>

things fast and small one needs to re-implement the memory management<br>

herself and take care of everything. This puts a strain on<br>

maintainability but keeps both memory footprint and performance at<br>

their best. What usually happens is that one uses the standard libc to keep<br>

maintainability high, and makes trade-offs: to be fast and big, or<br>

slow and small.<br>

<br>

With software running on VMs, there are many more levels to be<br>

considered when managing memory:<br>

a) application level<br>

b) VM level<br>

c) OS level<br>

In case of OpenSim, there is the following to consider:<br>

<br>

1. Object instance allocations and de-allocations, array and collection<br>

management, hashes, large memory management, database queries etc. On<br>

the application level lots of good things can be done by "normal" C#<br>

programmers.  This can dramatically boost performance and reduce the<br>

memory footprint. For example, re-implementing some of the collection<br>

classes usually yields good results. Reducing the number of new objects<br>

created, and "recycling" objects inside the application instead of<br>

creating new instances and letting the system garbage collect unused<br>

instances, can also dramatically improve both performance and<br>

memory footprint. And so on: good programming practices and taking care of<br>

memory usage and memory management can make things really better,<br>

especially on systems running on VMs. Even little things like boxing<br>

and efficient use of native data types all contribute.<br>
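
The "recycling" idea can be sketched roughly as below. This is an illustration only, written in Python for brevity rather than C#; the Particle and ObjectPool classes are hypothetical and are not taken from OpenSim. The point is simply that released objects are reset and handed out again, so the allocator and garbage collector see far fewer new instances.<br>

```python
class Particle:
    """Hypothetical stand-in for a frequently created, short-lived object."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.x = 0.0
        self.y = 0.0


class ObjectPool:
    """Hands out recycled instances instead of allocating a new one each time."""

    def __init__(self, factory, max_idle=1024):
        self._factory = factory
        self._idle = []
        self._max_idle = max_idle
        self.created = 0  # counts real allocations, for comparison

    def acquire(self):
        if self._idle:
            return self._idle.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        obj.reset()  # wipe state so the next user gets a clean object
        if len(self._idle) >= self._max_idle:
            return  # pool is full: drop the object and let the GC have it
        self._idle.append(obj)


pool = ObjectPool(Particle)
for _ in range(10000):
    p = pool.acquire()
    p.x, p.y = 1.0, 2.0
    pool.release(p)
print("allocations:", pool.created)
```

The max_idle cap is a design choice to keep the pool itself from becoming a source of memory growth; the right value depends on the workload.<br>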

<br>

2. VM level (be it Mono or any other VM) - performance here can be tweaked<br>

by many parameters, but the biggest contributor is the garbage collector<br>

itself. Different garbage collectors have different ways of managing<br>

memory, and these can substantially change the way applications behave.<br>

From our limited experience to date, Mono behaves completely differently<br>

with different GCs: stability, performance and footprint are all<br>

highly sensitive to the GC used. We are getting quite good results when<br>

using the latest Boehm GC, but things can be tweaked to do even better.<br>

<br>

3. OS level - what I mean here is:<br>

- the memory management not handled by the application itself or the VM,<br>

including large page support,<br>

- I/O,<br>

- thread management,<br>

- IPC (especially shared memory),<br>

- and networking.<br>

All of these can be tweaked. Normally they are designed to be generic and<br>

to handle a wide range of cases and apps. For OpenSim specifically, things<br>

can be improved and tuned for that one purpose.<br>

<br>

This is all pretty complex. There is no silver bullet that will<br>

magically make OpenSim run faster with a small memory footprint. But<br>

there are many areas where improvements can be made, and it would be desirable<br>

to have more targeted efforts towards that. Over here we are trying<br>

to draw a roadmap of all those various aspects, and I am grateful for<br>

the good discussion and contributions from the many people who put things in<br>

perspective.<br>

<br>

For any of you doing any testing: please take note of the exact<br>

kernel, Mono version and GC used, and post these together with your<br>

results/observations. This will help in replicating some scenarios and<br>

digging into the causes of various behaviours. For one thing, we were unable<br>

to replicate most of the problems with Mono on our own Linux setups. As for<br>

Solaris on SPARC, results are highly sensitive to the exact version and GC<br>

used: we have seen everything from complete Mono crashes to the system running<br>

well, depending on how the compile parameters are tweaked. As said<br>

earlier, we want to put a report together, to gather all these in a<br>

single place, so others can compare it to what they observe and so on.<br>

I'll keep you posted on that,<br>
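
For collecting those details consistently, a small helper along these lines might do. It is a sketch under assumptions: it is written in Python, the function name and report fields are invented for illustration, and it relies only on the standard `mono --version` command being on the PATH. Whether that output names the GC in use depends on the Mono build, so please check it and add the GC by hand if it is missing.<br>

```python
import platform
import shutil
import subprocess


def collect_environment(mono_cmd="mono"):
    """Gather kernel and Mono details worth attaching to a test report."""
    report = {
        "os": platform.system(),
        "kernel": platform.release(),
        "arch": platform.machine(),
        "mono_version": None,  # stays None if Mono cannot be found or run
    }
    if shutil.which(mono_cmd):
        try:
            result = subprocess.run(
                [mono_cmd, "--version"],
                capture_output=True,
                text=True,
                timeout=10,
            )
            report["mono_version"] = result.stdout.strip() or None
        except (OSError, subprocess.SubprocessError):
            pass
    return report


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(key + ":", value)
```

Pasting the printed block at the top of a results post should be enough for others to tell which setups are comparable.<br>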

<br>

--<br>

cheers<br>

Mariusz<br>

<div><div></div><div class="Wj3C7c"><br>

<br>

<br>

<br>

<br>

James Stallings II wrote:<br>

> Greetings,<br>

><br>

> Included below is a transcript of a recent sunday morning discussion in re:<br>

> the mono/large pages stuff that's recently appeared on the radar.<br>

><br>

> as you will see, it is really more of a kernel-tweaking issue, although the<br>

> application does come into play in the way it requests memory. For our<br>

> purposes, 'application' in that last sentence is mono, not opensim.<br>

><br>

> Hope this provides some insights :)<br>

><br>

> Cheers<br>

> daTwitch<br>

><br>

> Oh, still researching how to take advantage of this end-to-end wrt our<br>

> application. Will update as I uncover more information.<br>

><br>

><br>

><br>

> <daTwitch> this is somewhat relevant:<br>

> <a href="http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6664521" target="_blank">http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6664521</a><br>

> <daTwitch> although I find the placating sycophantic tone of the bug<br>

> submitter makes me want to find him an emotional support group<br>

> <nebadon> lol<br>

> <daTwitch> the universe has surely reversed its polarity; computer science<br>

> (which is where I learned the term "egoless programming") is now saturated<br>

> with sensitivity; and Fine Arts, once considered the most subjective subject<br>

> under quantitative and qualitative analysis, is consumed with issues<br>

> relating to process, review, and open, formal  critique<br>

> <daTwitch> at least, it was where I went to school lols<br>

> <daTwitch> this is also relevant, if somewhat more out of date.<br>

> Unfortunately, this looks almost identical to the things we're seeing, and<br>

> given the age of the issue, and that we're still seeing it now, doesnt give<br>

> me a lot of hope for getting the mono folk to take the problem on.<br>

> <daTwitch><br>

> <a href="http://lists.ximian.com/pipermail/mono-list/2006-April/031312.html" target="_blank">http://lists.ximian.com/pipermail/mono-list/2006-April/031312.html</a><br>

> <daTwitch> although it is encouraging that the OpenSolaris folk claim to<br>

> have fixed the problem with a patch to their O/S<br>

> <daTwitch> maybe someone should investigate how this performs under<br>

> opensolaris<br>

> <daTwitch> The discussion of TLBs (translation buffers, which are crucial to<br>

> page addressing in these memory models), in this article:<br>

> <a href="http://lwn.net/Articles/173882/" target="_blank">http://lwn.net/Articles/173882/</a> suggests that some kernel optimizations on<br>

> the server hardware in question can significantly improve the performance of<br>

> memory accesses in general for a given program - if I read it right, it<br>

> would indicate that we would need to build the correct optimizations into<br>

> the kernel,<br>

> <daTwitch> then compile mono locally and link it as described<br>

> <daTwitch> however, it may be that these effects would only be significant<br>

> on 64bit O/S<br>

> <daTwitch> that's about all I'm turning up of any significane<br>

> <daTwitch> *significance<br>

> <nebadon> hmm<br>

> <nebadon> do you recall anything about compiling mono with --large page<br>

> <nebadon> or large pages<br>

> <nebadon> something like that<br>

> <nebadon> someone was talking about it on -dev a while back<br>

> <nebadon> they said it helped with memory stuff with mono<br>

> <nebadon> i looked yesterday but couldnt find anything<br>

> <nebadon> it wasnt  one of the regulars  on -dev channel though<br>

> <daTwitch> that's what all the foregoing stuff is about<br>

> <nebadon> they claimed it really helped alot<br>

> <ckrinke> I dont see --large, but<br>

</div></div>> <a href="http://www.mono-project.com/Compiling_Mono" target="_blank">http://www.mono-project.com/Compiling_Mono</a> has mention of a special Xen<br>

<div><div></div><div class="Wj3C7c">> switch.<br>

> <nebadon> at the time i was less interested in the topic though<br>

> <nebadon> hmm<br>

> <daTwitch> we were discussing it with JustinCC at the office hours y/d too<br>

> <nebadon> yea<br>

> <nebadon> i brought it up then<br>

> <nebadon> i looked into it after the meeting<br>

> <nebadon> and couldnt find anything<br>

> <daTwitch> basically it comes down to this: the windows kernel allocates<br>

> memory far differently than a unix kernel<br>

> <daTwitch> and c#, as a result of being native to the platform, can take<br>

> advantage of that to compress data as it does garbage collection<br>

> <daTwitch> mono doesn't even try<br>

> <nebadon><br>

> <a href="http://developer.amd.com/documentation/Articles/Pages/322006145.aspx" target="_blank">http://developer.amd.com/documentation/Articles/Pages/322006145.aspx</a><br>

> <daTwitch> compress is the term used, but is not technically correct<br>

> <nebadon> heres talk about its use in Java<br>

> <daTwitch> imagine your large page as a hard disk sector in need of<br>

> defragging<br>

> <daTwitch> in fact, that is an incredibly accurate metaphor<br>

> <daTwitch> windows defragments the data in memory<br>

> <daTwitch> mono doesnt<br>

> <nebadon> yea<br>

> <nebadon> i recall them saying that mono<br>

> <daTwitch> for the same reasons as a hard disk defrag and with similar<br>

> benefits<br>

> <nebadon> wastes the space because it requires more blocks than needed<br>

> or something<br>

> <nebadon> and lots of memory is wasted<br>

> <daTwitch> yes<br>

> <nebadon> unless large  pages is specified<br>

> <daTwitch> precisely<br>

> <daTwitch> ok, so we are long overdue making a mono with large pages then -<br>

> would that be a valid assertion?<br>

> <nebadon> yea<br>

> <nebadon> id like to see it tested<br>

> <nebadon> if we can figure out how<br>

> <daTwitch> I'm sooooo on it<br>

> <nebadon> sweet<br>

> <daTwitch> I can build any thing<br>

> <nebadon> great<br>

> <daTwitch> as long as I have enough ram<br>

> <nebadon> i think it will be a big help to see where it takes us<br>

> <daTwitch> ok, I'll be busy for a bit<br>

> <nebadon> k<br>

> <nebadon> thanks man<br>

> <daTwitch> I'll keep y'all posted<br>

> <nebadon> great<br>

> <ckrinke> maybe its a ./configure option and is something like<br>

> --memory=large<br>

> <daTwitch> quite possibly<br>

> <nebadon> yea its something like that<br>

> <nebadon> i wish i took notes<br>

> <nebadon> but like i said at the time<br>

> <nebadon> i was less interested<br>

> <ckrinke> do the mono guys have an irc channel on FreeNode?<br>

> <daTwitch> no idea<br>

> <daTwitch> pulling source now<br>

> <daTwitch> will see if I can locate their IRC channel<br>

> <nebadon> cool<br>

> <daTwitch> gimpnet servers only at <a href="http://irc.gnome.org" target="_blank">irc.gnome.org</a> and <a href="http://irc.gimp.net" target="_blank">irc.gimp.net</a><br>

> <daTwitch> #mono<br>

> <daTwitch> #monodev<br>

> <daTwitch> #mono-winforms<br>

> <daTwitch> #monodevelop<br>

> <daTwitch> #cocoa<br>

> <daTwitch> #mono-hispano<br>

> <daTwitch> #monouml<br>

> <daTwitch> #gendarme<br>

> <daTwitch> #mono-ally<br>

> <daTwitch> #moonlight<br>

> <daTwitch> moonlight == silverlight for mono<br>

> <nebadon> nice<br>

> <daTwitch> ok source is down, back to work<br>

> <Ter_Afk> moonlight == loonmight?<br>

> <daTwitch> heh<br>

> <daTwitch> I dont even know what silverlight is, but I've heard discussion<br>

> of it, so it was a point of interest<br>

> <Ter_Afk> Microsoft's answer to Adobe Flash<br>

> <daTwitch> ok, no mention whatsoever of a --large-pages option to the<br>

> configuration<br>

> <daTwitch> we have --large-heap<br>

> <daTwitch> large_code<br>

> <Ter_Afk> large_fire?<br>

> <Ter_Afk> k, nuf with the word jokes.<br>

> <daTwitch> does anyone know if it was large-pages, or large_pages?<br>

> <nebadon> i dont recall<br>

> <nebadon> i just remember the term large  pages being used some how<br>

> <daTwitch> lol googling large pages turns up everything from beano to kirk<br>

> douglas<br>

> <nebadon> lol<br>

> <nebadon> yea<br>

> <nebadon> i had no luck on google<br>

> <nebadon> nor the mono website<br>

> <daTwitch> actually, I'm starting to think large_pages refers to a kernel<br>

> setting<br>

> <nebadon> well they said Compile Mono from source<br>

> <nebadon> with the large pages switch<br>

> <nebadon> i do remember that<br>

> <nebadon> its probably related more to the compiler<br>

> <nebadon> than mono<br>

> <nebadon> so maybe we are looking in the wrong  places<br>

> <daTwitch> hmmm<br>

> <daTwitch> that's a clue<br>

> <daTwitch> ok, I got configure to execute to completion very cleanly<br>

> <daTwitch> gotta take 5 tho<br>

> <daTwitch> bbiaf<br>

> <nebadon> ok<br>

> <daTwitch> ah needs mah gcc 4.2 doc<br>

> <daTwitch> The Virtual Memory (VM) Subsystem<br>

> <daTwitch> Most modern computer architectures support more than one memory<br>

> page size. To illustrate, the IA-32 architecture supports either 4KB or 4MB<br>

> pages. The 2.4 Linux kernel used to only utilize large pages for mapping the<br>

> kernel image. In general, large page usage is primarily intended to provide<br>

> performance improvements for high performance computing applications, as<br>

> well as database applications that have large working sets.<br>

> <daTwitch> Any memory access intensive application that utilizes large<br>

> amounts of virtual memory may obtain performance improvements by using large<br>

> pages. Linux 2.6 can utilize 2MB or 4MB large pages, AIX uses 16MB large<br>

> pages, whereas Solaris large pages are 4MB in size. The large page<br>

> performance improvements are attributable to reduced translation lookaside<br>

> buffer (TLB) misses. Large pages further improve the process of memory<br>

> <daTwitch> prefetching, by eliminating the necessity to restart prefetch<br>

> operations on 4KB boundaries.<br>

> <daTwitch> from: <a href="http://aplawrence.com/Linux/linux26_features.html" target="_blank">http://aplawrence.com/Linux/linux26_features.html</a><br>

> <daTwitch> it's a feature that must have support in the kernel, at the very<br>

> least<br>

> <daTwitch> though I can find neither build-time nor runtime configuration<br>

> points that take advantage of it in either gcc nor mono at this point<br>

> <nebadon> hmm<br>

> <daTwitch> still looking though ;)<br>

> <nebadon> sounds like the problem though<br>

> <daTwitch> yes, think we are in the process of pinning it down<br>

> <nebadon> nice<br>

> <daTwitch> even if we arent doing things to precisely duplicate how things<br>

> go under c#, this should yield a performance gain that compensates<br>

> <daTwitch> I keep seeing the figure 10%<br>

> <nebadon> yea thats a good start<br>

> <daTwitch> that is significant when we consider how much we pay in memory<br>

> per-av<br>

> <daTwitch> here is some additional good background info, but still does not<br>

> complete the picture:<br>

> <a href="http://findarticles.com/p/articles/mi_m0ISJ/is_2_44/ai_n14793331/pg_10" target="_blank">http://findarticles.com/p/articles/mi_m0ISJ/is_2_44/ai_n14793331/pg_10</a><br>

> <daTwitch> mysql can also benefit heavily from the use of large pages<br>

> <daTwitch> combining the benefits of mysql on large pages with our various<br>

> servers on large pages (actually, the UGAIM could possibly take a<br>

> performance *hit* from large pages) might yield even greater than 10%<br>

> performance increase<br>

> <daTwitch> probably the large pages switch to start with is a kernel<br>

> boot-time config point<br>

> <nebadon> nice<br>

> <nebadon> i would think though a program like mysql would already be<br>

> compiled to such a thing<br>

> <daTwitch> well, no, not necessarily<br>

> <nebadon> so the goal i assume<br>

> <nebadon> is 4mb page size?<br>

> <nebadon> vs 4k<br>

> <daTwitch> the underlying kernel has to be configured to support it, and if<br>

> the application isn't sufficiently demanding, it actually will take a<br>

> performance hit<br>

> <daTwitch> yes<br>

> <daTwitch> 4m. I think 16mb is also supported in 2.6+ kernels, but I doubt we<br>

> need it yet<br>

> <nebadon> yea it sounds to me like any kernel thats 2.6 its already enabled?<br>

> <nebadon> but the app needs to be told to use it?<br>

> <nebadon> its amazing how useless google is for this topic<br>

> <nebadon> hehe<br>

> <daTwitch> well, it's a bit obscure, unless you know what you're looking for<br>

> <daTwitch> this is really about kernel tweaking, not so much mono<br>

> <daTwitch> the kernel needs to be told to support it at boot time - perhaps<br>

> even needs to be compiled for it<br>

> <daTwitch> but the support is in the source<br>

> <daTwitch> plus, not too many folks need to do this<br>

> <daTwitch> only high perf types with really demanding software<br>

> <daTwitch> (that would be us lols)<br>

> <daTwitch> the app does have to be told to utilise it somehow though<br>

> <nebadon> yea<br>

> <nebadon> opensim is definitely more demanding than say apache<br>

><br>

><br>

</div></div>> ------------------------------------------------------------------------<br>

><br>

> _______________________________________________<br>

> Opensim-dev mailing list<br>

> <a href="mailto:Opensim-dev@lists.berlios.de">Opensim-dev@lists.berlios.de</a><br>

> <a href="https://lists.berlios.de/mailman/listinfo/opensim-dev" target="_blank">https://lists.berlios.de/mailman/listinfo/opensim-dev</a><br>

<br>


</blockquote></div><br><br clear="all"><br>-- <br>===================================<br>The wind<br>scours the earth for prayers<br>The night obscures them