MantisBT - opensim
View Issue Details
0008315 | opensim | [GRID] Robust Server | public | 2018-04-14 06:07 | 2018-08-15 09:08
Grid (1 Region per Sim), Grid (Multiple Regions per Sim)
0008315: Regions table clearing
At the moment, simulators that go offline through a crash or process termination, rather than through the shutdown command, leave behind an entry in the regions table. As a result, regions that appear to be connected to the grid refuse entry, the count of connected regions is skewed, and grid coordinates stay occupied.

It would be useful to have Robust periodically check all simulators in the regions table by sending each one a "ping" that must be answered for the entry to be retained. If Robust receives no reply within a certain amount of time, the entry would be removed from the table.

Obviously, grids that allow the reservation of grid coordinates, and that rely on de-registration not occurring in order to retain those coordinates, would need a Robust configuration setting to turn this on or off. It would likely also be a good idea to add a field to the regions table that exempts an entry from the "keep-alive" check by Robust.
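
The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not existing Robust code: `RegionEntry`, its `exempt` field, the `ping` callback, the `enabled` switch and `timeout_s` are all hypothetical names, and an in-memory dictionary stands in for the regions table.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RegionEntry:
    """Hypothetical stand-in for one row of the regions table."""
    name: str
    last_seen: float       # time of the last answered ping (epoch seconds)
    exempt: bool = False   # hypothetical per-entry "skip keep-alive" field


def prune_stale_regions(
    regions: Dict[str, RegionEntry],
    ping: Callable[[RegionEntry], bool],
    timeout_s: float,
    enabled: bool = True,
    now: Optional[float] = None,
) -> List[str]:
    """Ping every non-exempt entry and drop those silent for too long.

    Returns the keys of the entries that were removed.
    """
    if not enabled:            # grid-wide switch: feature turned off
        return []
    now = time.time() if now is None else now
    removed = []
    for key in list(regions):  # copy the keys so entries can be deleted
        entry = regions[key]
        if entry.exempt:       # reserved coordinates are never touched
            continue
        if ping(entry):
            entry.last_seen = now
        elif now - entry.last_seen > timeout_s:
            removed.append(key)
            del regions[key]
    return removed
```

An entry is removed only after it has failed to answer for longer than `timeout_s`, so a single missed ping does not drop a region that is merely busy or restarting.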
No tags attached.
Issue History
2018-04-14 06:07 | tampa | New Issue
2018-04-14 10:16 | UbitUmarov | Note Added: 0032640
2018-04-14 12:05 | tampa | Note Added: 0032642
2018-08-14 13:56 | Sheera Khan | Note Added: 0032861
2018-08-15 09:08 | Fly-Man- | Note Added: 0032862

UbitUmarov   
2018-04-14 10:16   
Yes, Robust's awareness of regions and also avatars needs improvement.
But doing that without a major increase in bandwidth on large grids may not be that easy. The optional DataSnapshot is an example of that: it is only sustainable on small grids.
tampa   
2018-04-14 12:05   
Apart from logins, Robust is fairly lightweight in terms of bandwidth, and most larger grids tend to run load-balanced as it is. It's not going to be fun, yes, but the current state is not good either. You win some, you lose some; in my opinion it is a hit worth taking if it means a less ugly grid map.
Sheera Khan   
2018-08-14 13:56   
Metropolis "pings" all regions in the grid every 12(?) hours and collects some stats that are displayed in that grid's region management tool. No degradation of performance and no serious increase in bandwidth usage has been observed.
Fly-Man-   
2018-08-15 09:08   
The problem with this lies in whether it's a closed or an open grid. Within closed grids, people know whether a region is up or not.

For example, I use UptimeRobot to check whether a region port is actually alive; if it is not, I get a nice email saying that the region is failing so I can restart it.
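
A port check of this kind can be approximated in a few lines. The sketch below is not how UptimeRobot works internally; it only tries to open a TCP connection to a given host and port. The function name is hypothetical, and it assumes the port being monitored accepts TCP connections.

```python
import socket


def region_port_is_open(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

A successful connection only shows that something is listening on the port, not that the region itself is healthy, so this is a coarse liveness signal at best.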

With self-hosted and open grids, that problem lies largely with the region owner. Having Robust or even the DataSnapshot daemon do those checks might be nice, but perhaps a built-in PING function would be a better solution.