[Opensim-dev] region status tracking in GridServer

Brian Wolfe brianw at terrabox.com
Sun Feb 10 17:53:54 UTC 2008


This is why I'm looking at leveraging the receipt of existing packets
from regions to the grid server as a substitute for the heartbeat packet.

The grid server will update the region's last_seen datetime in the
regiondata object whenever it receives any packet from that region. This
way a busy or modestly used region won't ever need to send a heartbeat
to the grid server, but a region that's rarely used will send a periodic
heartbeat (with an admin-configurable interval).
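
A rough sketch of the idea, in Python for brevity rather than the C# the
actual servers use. Every name here (GridServer, Region, handle_packet,
the "map_block_request" packet kind) is made up for illustration and is
not existing OpenSim code; only last_seen, online and region_logout come
from this thread.

```python
import time


class RegionRecord:
    """Grid-side record for one region (stand-in for the regiondata object)."""

    def __init__(self):
        self.online = False
        self.last_seen = None  # Unix timestamp of the last packet received


class GridServer:
    """Hypothetical grid server that treats any packet as a sign of life."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.regions = {}

    def handle_packet(self, region_id, kind):
        # Heartbeat or not, every packet refreshes last_seen.
        record = self.regions.setdefault(region_id, RegionRecord())
        record.last_seen = self.clock()
        # A logout marks the region offline; anything else marks it online.
        record.online = kind != "region_logout"


class Region:
    """Hypothetical region that only sends heartbeats when it has been quiet."""

    def __init__(self, region_id, grid, heartbeat_interval, clock=time.time):
        self.region_id = region_id
        self.grid = grid
        self.heartbeat_interval = heartbeat_interval  # seconds, admin configurable
        self.clock = clock
        self.last_sent = None

    def send(self, kind):
        self.grid.handle_packet(self.region_id, kind)
        self.last_sent = self.clock()

    def tick(self):
        """Send a heartbeat only if nothing else went out within the interval."""
        now = self.clock()
        if self.last_sent is None or now - self.last_sent >= self.heartbeat_interval:
            self.send("heartbeat")
            return True
        return False
```

The region calls tick() on a timer; as long as ordinary traffic keeps
going out more often than the interval, tick() never sends anything.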


On Sun, 2008-02-10 at 09:10 +0100, Stefan Andersson wrote:
> I would not say this is 'fatally flawed' just because it can be the
> result of one client's connectivity problems; it's still an indication
> that somebody somewhere had trouble reaching the target. That's why it
> should only be seen as a ping trigger, not as an authoritative status.
>  
> Also, it's not the client that reports on the trouble, it's the source
> region - which means, that if the client can re-connect to the source
> region but not to the target, chances are that there's something wrong
> with the target.
> 
> If a grid server is the authority for say 6000 servers, frequent grid
> wide pings become quite the task, not to mention the fact that the
> grid server itself can (and in a heterogeneous grid setting, will)
> experience connectivity problems, just as the client can.
>  
> Best,
> /Stefan
> 
>  
> 
> ______________________________________________________________________
> 
> > Subject: RE: [Opensim-dev] region status tracking in GridServer
> > From: brianw at terrabox.com
> > To: stefan at tribalmedia.se
> > CC: opensim-dev at lists.berlios.de
> > Date: Sat, 9 Feb 2008 18:22:45 -0600
> > 
> > Unfortunately this is fatally flawed since teleports can fail due to
> > internet routing issues that affect only one person, amongst other
> > movement failures that would cause the client to think a region is
> > offline vs the server knowing it's offline or online.
> > 
> > I'm posting a patch to mantis that lays the foundation for regions
> > being online or offline and when they were last heard from by the
> > grid server.
> > 
> > 
> > 
> > On Sat, 2008-02-09 at 21:25 +0100, Stefan Andersson wrote:
> > > Ok, I was going to do some research before I replied on this
> > > thread, but off the top of my head;
> > > 
> > > first of all, we should define what we want the region status data
> > > for; the data should guide choice of implementation.
> > > 
> > > That said, there is quite a distinct possibility that we can use
> > > the _clients_ as our agents for detecting offline region servers.
> > > 
> > > Whenever someone tries to teleport off a region, the source region
> > > is informed of this, to be able to downgrade avatar presence to
> > > child.
> > > 
> > > If the teleport fails (the target region is unresponsive) the
> > > connection comes back to the region so it should upgrade the
> > > avatar presence to root again.
> > > 
> > > (Incidentally, I've filed a mantis on this, since we actually
> > > don't handle that, and failed teleports result in you coming back
> > > to a region, but as a 'child', i.e. being stuck there and possibly
> > > causing all kinds of havoc with the ClientView)
> > > 
> > > Now, if we handled the failed teleport correctly, this could also
> > > notify the grid service that a failed teleport has occurred.
> > > 
> > > The grid service could then ping the target region, to check on
> > > its state.
> > > 
> > > This is an alternative or complement to grid-wide ping sweeps; you
> > > probably want both, but could do the sweeps much less frequently.
> > > 
> > > Combine this with proper region signon/signoff and I say you got
> > > options.
> > > 
> > > Best,
> > > /Stefan
> > > 
> > > 
> > > 
> > >
> ______________________________________________________________________
> > > 
> > > > From: brianw at terrabox.com
> > > > To: opensim-dev at lists.berlios.de
> > > > Date: Sat, 9 Feb 2008 12:56:10 -0600
> > > > Subject: [Opensim-dev] region status tracking in GridServer
> > > > 
> > > > After much IRC discussion I would like to make a couple of changes
> > > > to the regions table.
> > > > 
> > > > online bool NOT NULL default false
> > > > last_seen int(11) NULL
> > > > 
> > > > The online column would be updated as a region logs in and out of
> > > > GridServers. This way an external management/status application
> > > > doesn't have to pester the grid server for the full list of regions
> > > > and their status. This would also provide data to regions requesting
> > > > map blocks as to the status of a region (a.k.a. LL's Red Regions in
> > > > the map view).
> > > > 
> > > > The last_seen column would be of assistance to these same
> > > > management apps in helping the administrator to determine which
> > > > regions were long term MIA or just plain not wanted anymore by
> > > > walk away grid members.
> > > > 
> > > > My main concern is the load placed on the grid server, and
> > > > external applications currently having to ping every region to
> > > > tell if it's still around or not.
> > > > 
> > > > To accomplish this I'm working on a patch to implement a
> > > > region_logout XMLRPC that would be called via
> > > > Region.Communications.DeregisterRegion (which is currently empty).
> > > > 
> > > > I would also add an update of RegionProfileData as each RPC call
> > > > is made by a region to GridManager. Database updates would only
> > > > happen periodically as regions log in and log out.
> > > > 
> > > > Are there any objections or reasons not to implement this?
> > > > 
> > > > _______________________________________________
> > > > Opensim-dev mailing list
> > > > Opensim-dev at lists.berlios.de
> > > > https://lists.berlios.de/mailman/listinfo/opensim-dev
> > > 
> > 
> 



