[Opensim-dev] Designing with Instrumentation in mind.
orion hax
orion.hax at gmail.com
Sat Nov 28 02:26:11 UTC 2009
+1 for me. From the perspective of someone getting into hosting regions,
having this kind of information is critical for proper load balancing and
for having a good idea of the health of all the servers.
On Fri, Nov 27, 2009 at 8:21 PM, Kyle <create at reactiongrid.com> wrote:
> Implementation is over my head, but as a former hardware technician, old
> enough to have troubleshot mainframes by tape reels and blinking red
> lights, I lived by front-panel indicators to help me quickly isolate
> issues and to train operators on which trouble indicators to look for to
> catch issues before they caused downtime.
>
> So to me this is a must-have to help diagnose and improve stability and
> uptime. Brilliant! +1
>
> Kyle G
>
> -----Original Message-----
> From: opensim-dev-bounces at lists.berlios.de
> [mailto:opensim-dev-bounces at lists.berlios.de] On Behalf Of Teravus Ovares
> Sent: Friday, November 27, 2009 9:10 PM
> To: opensim-dev at lists.berlios.de
> Subject: [Opensim-dev] Designing with Instrumentation in mind.
>
> Hey there,
>
> A while back, we had somewhat reasonable statistics being generated
> and presented to the client. They were not always accurate, but
> based on what I saw, I could pretty much pin certain parts of the
> simulator as the limiting factor during load tests. I'd say the
> number one reason they were only semi-accurate in the past is that
> nobody ever thought about instrumentation during the functionality
> design. It was always 'tacked on later'. One example of this is the
> current AssetCache implementation. There's currently no way to know
> at a glance how many external requests it has open. Additionally, it
> will be extremely difficult to add such a counter because of the way
> the objects are designed and accessed. To add one, an event needs to
> be added to the IAssetService interface, and each AssetCache
> implementation will need an interlocked int to count how many gets
> and puts it currently has open to the external data source, as well
> as its own event-calling schedule. Then the IAssetService property
> in Scene (AssetService) will need an event handler which updates the
> values in SimStatsReporter in Scene (StatsReporter). This idea of
> instrumenting access to external resources should really have been
> built into the design of the AssetService.
>
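
The counter described in the quoted paragraph above can be sketched
roughly as follows. This is only an illustration in Python, not
OpenSimulator code (which is C#): OpenRequestCounter,
InstrumentedAssetCache, fetch_external and on_pending_change are
hypothetical names, and a lock-guarded integer stands in for the
interlocked int. Only gets are shown; puts would be tracked the same way.

```python
import threading
from contextlib import contextmanager


class OpenRequestCounter:
    """Thread-safe count of requests currently open to an external source."""

    def __init__(self, on_change=None):
        self._lock = threading.Lock()
        self._open = 0
        # Hypothetical listener, e.g. something that feeds a stats reporter.
        self._on_change = on_change

    @contextmanager
    def track(self):
        # Count the request for exactly as long as the with-block runs,
        # including when the block raises.
        self._adjust(+1)
        try:
            yield
        finally:
            self._adjust(-1)

    def _adjust(self, delta):
        with self._lock:
            self._open += delta
            value = self._open
        # Notify outside the lock so a slow listener cannot stall requests.
        if self._on_change is not None:
            self._on_change(value)

    @property
    def open_requests(self):
        with self._lock:
            return self._open


class InstrumentedAssetCache:
    """Toy cache whose external fetches are counted while in flight."""

    def __init__(self, fetch_external, on_pending_change=None):
        self._fetch_external = fetch_external  # assumed: callable(asset_id)
        self._cache = {}
        self.pending = OpenRequestCounter(on_pending_change)

    def get(self, asset_id):
        if asset_id in self._cache:
            return self._cache[asset_id]  # cache hit: no external request
        with self.pending.track():
            asset = self._fetch_external(asset_id)
        self._cache[asset_id] = asset
        return asset
```

With a listener such as a list's append method, one uncached get reports
1 and then 0; a cache hit reports nothing, because no external request is
opened.
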
> In this most recent load test, there were no real statistics that I
> could use to determine what the limiting factor was.
> Time Dilation was pegged at 1.0 even when the simulator was
> obviously struggling. Total Frame Time (ms) was -50ms even when the
> simulation ms was 850ms and the physics ms was 250ms, so the
> inconsistencies made it impossible to know which part of the simulator
> was struggling. Agent Updates were erratic: sometimes high and
> sometimes low, both when the simulator was fine and when it was
> struggling. Pending Uploads and Downloads were always 0, so there was
> no way to know how well the simulator was downloading and uploading
> assets to and from the grid. Packet stats were nonexistent, so there
> was no way to know how well the UDP handlers were faring under the
> load. When it crashed, it crashed with a Mono-based stack trace which
> pointed to out-of-memory errors, so the only way to find out
> scientifically what the issue is would be to run a load test under a
> memory profiler. We know that running a public load test under a
> memory profiler is quite impractical.
>
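
On the frame-time inconsistency mentioned above (a negative total next to
positive phase times): one way to rule that out by construction is to
take every measurement from the same monotonic clock and to time the
phases inside the frame. A minimal sketch in Python, assuming a
hypothetical FrameTimer; this is not how SimStatsReporter works, and the
phase names are placeholders.

```python
import time


class FrameTimer:
    """Times a frame and its phases from a single monotonic clock."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable, so the timer itself can be tested
        self.phases_ms = {}
        self.total_ms = 0.0
        self._frame_start = None

    def begin_frame(self):
        self.phases_ms = {}
        self._frame_start = self._clock()

    def measure(self, name, work):
        # Run one phase and add its elapsed time to that phase's total.
        start = self._clock()
        try:
            return work()
        finally:
            elapsed_ms = (self._clock() - start) * 1000.0
            self.phases_ms[name] = self.phases_ms.get(name, 0.0) + elapsed_ms

    def end_frame(self):
        # The total spans begin_frame() to now, so it covers every phase
        # measured in between and cannot be negative on a monotonic clock.
        self.total_ms = (self._clock() - self._frame_start) * 1000.0
        return self.total_ms
```

A frame would call begin_frame(), wrap each phase in measure(), and then
call end_frame(); the total is then at least the sum of the phases, with
the remainder being unmeasured overhead.
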
> To make something better, I need to know two things: where it is and
> where I want it to be. How can we make OpenSimulator better if we
> don't have statistics that point to where we are currently?
>
> On that note, I propose that, when designing objects for functionality
> in OpenSimulator, we also consider whether the objects should be
> instrumented and what would be the best way to go about instrumenting
> them. We should incorporate instrumentation into the design of
> the objects. Some of that instrumentation is appropriate for a
> client to see; some of it might not be. Consider that many of the
> statistics should be client-facing and be included in the SimStats
> that get sent to the client, so that we can have a reasonable idea of
> what's going on with a simulator at a glance. Also, in the design of
> the instrumentation, we should make sure that it is accurate and
> lightweight.
>
> The load test went reasonably well, but we didn't get half of the
> information on the simulator that we needed to be able to improve it.
>
>
> Please comment :) I look forward to hearing your responses.
>
> Regards
>
> Teravus
> _______________________________________________
> Opensim-dev mailing list
> Opensim-dev at lists.berlios.de
> https://lists.berlios.de/mailman/listinfo/opensim-dev
>