Mantis Bug Tracker

ID: 0008771
Project: opensim
Category: [GRID] Hypergrid
View Status: public
Date Submitted: 2020-09-12 12:06
Last Update: 2020-09-25 10:29
Reporter: JeffKelley
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Platform:
OS:
OS Version:
Product Version:
Target Version:
Fixed in Version:
Summary: 0008771: Map cannot find a grid on the same host
Description: The issue was first discovered on a machine hosting two grids on ports 6002 and 7002, both running master. It was dismissed as a configuration error that we were never able to find.

It has been recently confirmed on a machine hosting a grid and four standalones (8002, 9000, 9100, 9200, 9300) when upgrading from release 0.9.0.0 to 0.9.1.1. Since that was working in 0.9.0, this points to a real issue rather than a configuration error.

When searching for a grid without a region name (e.g. we are on domain:6002 and search for domain:7002) the avatar is teleported to the default region of the grid it is currently on (6002). That actually "maps" all the grid instances to the originating grid.

When searching for a grid with a region name (e.g. we are on domain:6002 and search for domain:7002:Welcome), if a region of the same name exists on the originating grid, the avatar is teleported there. If there is no region of this name, the map issues the message "No region found with this name".

Schematically :

logged on domain:6002
searching for domain:7002
teleported to domain:6002 default region
ROBUST 6002 says [GRID SERVICE]: GetDefaultRegions returning 1 regions
ROBUST 7002 says nothing

logged on domain:6002
searching for domain:7002:Showcase
  No region found with this name

logged on domain:6002
searching for domain:7002:Ecole (Ecole does not exist on 7002 but does on 6002)
teleported to region Ecole of domain:6002
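
The three cases above can be summarised in a small Python sketch. It is only a model of the observed behaviour, not OpenSim code: the function names `split_hg_address` and `observed_outcome` are invented for illustration, and the sketch assumes the search string always carries a port.

```python
from typing import Optional, Set, Tuple


def split_hg_address(address: str) -> Tuple[str, Optional[str]]:
    """Split 'host:port' or 'host:port:Region' into ('host:port', region)."""
    host, port, *rest = address.split(":", 2)
    region = rest[0] if rest and rest[0] else None
    return f"{host}:{port}", region


def observed_outcome(search: str, local_grid: str, local_regions: Set[str]) -> str:
    """Model the reported behaviour: the target grid is parsed but never used."""
    _target_grid, region = split_hg_address(search)
    if region is None:
        # No region name given: the avatar lands on the local default region.
        return f"teleported to {local_grid} default region"
    if region in local_regions:
        # The name is looked up on the originating grid, not on the target.
        return f"teleported to region {region} of {local_grid}"
    return "No region found with this name"


if __name__ == "__main__":
    for query in ("domain:7002", "domain:7002:Showcase", "domain:7002:Ecole"):
        print(query, "->", observed_outcome(query, "domain:6002", {"Ecole"}))
```

In every branch the target grid (domain:7002) plays no part in the result, which is the symptom reported here.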

That also affects osTeleportAgent.

Teleport Home is not affected.
Teleport via landmarks is not affected.
Teleport outside host is not affected.

Both machines run Linux with a loopback-enabled router. From a terminal, we can curl host:6002 and host:7002 by external name and get an answer. This is unlikely to be a network issue. No link request is sent to the remote grid.
Tags: No tags attached.
Git Revision or version number: 0.9.1.1 and master
Run Mode: Grid (Multiple Regions per Sim)
Physics Engine: BulletSim
Script Engine: XEngine
Environment: Mono / Linux64
Mono Version: 6.x
Viewer: Firestorm
Attached Files:

- Relationships

-  Notes
(0036835)
tampa (reporter)
2020-09-12 12:47

You should set up subdomains rather than ports
(0036836)
UbitUmarov (administrator)
2020-09-12 12:58

a grid is identified by a host name not hostname:port pair
so yes, it is the same grid
(0036837)
JeffKelley (reporter)
2020-09-12 13:23

This setup has been working for more than 5 years and provided a useful test environment to boot-up standalones on demand. I never had issues teleporting inside the same host.

HG addresses are host:port, serverURI in database are also host:port, as are ServiceURLs.

I never heard that running multiple grids (or SA) on the same machine was not possible. If it is now, it means you have to use proxying or virtualizing to do the same thing. What a simplicity!

Discarding the port and routing a linking request to our own instance is beyond my understanding.
(0036838)
tampa (reporter)
2020-09-12 13:31

You can easily set up subdomains for each and have them route to different ports via something like nginx, just look up reverse proxy and how to set that up. That's all you need to run multiple things on one machine. If that's beyond your skill level, may I suggest perhaps looking for someone to help you out. This isn't really an OpenSim issue, more a configuration and setup one, and mantis is not a user support forum :)
(0036839)
UbitUmarov (administrator)
2020-09-12 13:38

"grids" will have more than one port
like 80 and 443 for example
and in that case you should not need any port even, just http or https
(0036844)
JeffKelley (reporter)
2020-09-12 14:02

> "grids" will have more than one port

Sure. Mine has 7002, 7100 to 7109 for simulators, 7020 to 7029 for regions of grid 1, and +/- same with 6-* prefix for grid 2. They never mixed, since ports are distinct entities. Only when requesting a HG link from grid1 to grid2, they forget the port.

Certainly, I can set up a reverse proxy if required, and NOT beg for help here.

I opened this issue because I thought this ability had been lost en route. Correct me if I am wrong. I'm ready to restore the databases to .9.0 if you ask me for a test.
(0036845)
UbitUmarov (administrator)
2020-09-12 14:05

i meant the gatekeeper/login service
(0036846)
tampa (reporter)
2020-09-12 14:14

The fact that some "links" remain on the map is likely to do with some form of cache or things like teleport history etc. The setup is quite complex and the changes to teleports over the months made this more complex. It certainly was never intended to work and, with how the viewer tends to handle things, if it really did try to resolve the teleports properly it would break on such a setup.

It is good practice to separate systems by things like subdomains rather than ports as the latter can be harder to remember and be more confusing to deal with. grida.domain.my gridb.domain.my I find easier to deal with than remembering the port I set the login service at. That's probably why many grids now opt to use port 80 for this as well.

I merely suggested that if you needed help with that setup that there are people, myself included, happy to help with those things. :)
(0036847)
JeffKelley (reporter)
2020-09-12 14:16

> i meant the gatekeeper/login service

Yes. They are [Hypergrid]GatekeeperURI and [GridInfoService]login, all configurable via inis, and you can only mix them up by mistake. This is unlikely to happen thanks to the var substitution from the [Const] section.
(0036848)
JeffKelley (reporter)
2020-09-12 14:21

> The fact some "links" remain on map are likely to do with some form of cache or things like teleport history

True. This is when it is time to quit Firestorm to clear the mess. Clearing the region from the 'Regions' table (SA) or the simulator memory (grid) may help. This happens also for grids on different hosts. Changing domains, uuids, region names is the better recipe for a disaster. Lots get persisted.

I appreciate your offer for help, Tampa. But I think i can manage it.
(0036884)
JeffKelley (reporter)
2020-09-20 12:38

To add some flesh to this issue, i have booted two .9.0 standalones:

grid.pescadoo.net:9400
grid.pescadoo.net:9500

Both on the same host.

Anybody free to jump.

On the first is a red teleporter to the second.
On the second, a green teleporter to the first.

Teleports are ok using either osTeleportAgent or typing on the map.

You can also jump to grid.pescadoo.net:8002 (same host too) running .9.1, but no way back.

This demonstrates that this ability existed prior to .9.1.

It may not have been an explicit (documented) feature, but it was an implicit one and a reasonable expectation since no other operation is affected. Grids run on their own set of ports with no adverse interaction. That may be a bad choice for public grids. On the other hand, it is a precious capability for experimental setups running multiple grids or HG standalones.

There may exist a legitimate incentive to drop this capability. "Two grids cannot share a same host name" is not to be postulated without reason, when evidence shows it used not to be true. There should be an imperative reason (If we do this, terrible things will happen, it will cripple planned evolution, sky will fall, cat die, etc) Then, it can be regarded as a design decision. Otherwise, it is an artificial limitation for no reason.

The nuisance is minor (use landmarks inside the host). It is not worth setting up reverse proxy or virtualization. To create a landmark, jump on any grid outside the host then back. Et voila!
(0036888)
Ferd Frederix (reporter)
2020-09-21 23:58

Try 127.0.0.1:7002 and 127.0.0.1:8002. If it now prevents two grids from running, it is a regression.

Using subdomains does not sound like a solution as those resolve to the same IP addresses.
(0036890)
tampa (reporter)
2020-09-22 00:12

What it resolves to doesn't matter, OpenSim sees the domain and handles that accordingly similar to a reverse proxy setup for example.
(0036891)
JeffKelley (reporter)
2020-09-22 01:37
edited on: 2020-09-22 01:38

grid1.pescadoo.net:9600 (green cube)
grid2.pescadoo.net:9700 (red cube)

Working. No nginx, no reverse proxy.
0.9.2.0 Yeti Dev commit 64fea8f of 2020-09-12.

(0036893)
Ferd Frederix (reporter)
2020-09-22 12:21

@Tampa, 127.0.0.1 does not 'resolve' to anything except 127.0.0.1. There is no need to resolve an IP, but if you try, you get it right back. Usually.

In the situation I mention, since there is no name to resolve, OpenSim will use the IP as the name. It should work with it in all circumstances, just as it worked for Jeff, no matter how many grids run on a given IP.

I have a test server at Contabo that uses the same IP for 3 different grids. The only difference is ports.

Localhost is resolved to 127.0.0.1. Localhost is not always the same as 127.0.0.1. It can be resolved to anything you want.
(0036896)
tampa (reporter)
2020-09-23 01:34

The "capability" to separate via ports is based around the initial communication from one grid to another, but it does not constitute a valid circuit. It only works because the two systems aren't aware of one another.

Separation via local addresses, which you can assign much like normal domains, means you can separate them without even needing different ports on the "visible" outside. Instead you simply reverse proxy them to the system and, voila, you can go from grid1.local to grid2.local without a port.

This is explained on the wiki.

The fact it does work this way is probably just because there is not anything in there to force a valid circuit and instead, robust being robust, it just does what it is told to, but it should not be seen as a feature, rather a "if you must do it that way" or something.
(0036897)
JeffKelley (reporter)
2020-09-23 01:34
edited on: 2020-09-23 01:50

Since 0.9.1, map service stopped comparing the tuple (BaseHostname, PublicPort) to see if a request should be processed locally or forwarded to an external grid. Now, it handles all requests for BaseHostname as local and sends the avatar to a local region, if one exists with matching name, or default.

This may have been introduced with commit ae130d9f, as a fix for mantis 8580
https://github.com/opensim/opensim/commit/ae130d9f25de9d46038270337ddc967e0e8ab1d9
    mantis 8580: make some changes on regions find code. (only gatekeeper
    host is used on local grid detection, not its port)

It is therefore no longer possible to host different grids on the same domain name, but it is still possible to share the same machine using domain names resolving to the same IP (A record or CNAME).

Instead of my.domain.name:8002 and my.domain.name:7002, use my1.domain.name:8002 and my2.domain.name:7002. my.domain1.name:8002 and my.domain2.name:7002 also works, at the cost of a domain name (subdomains cost nothing).

This mitigates the issue.
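
To make the difference concrete, here is a minimal Python sketch of the two local-grid checks described in this note. It is not the OpenSim implementation: the helper names are hypothetical, and it only assumes that the old check compared host and port while the new one compares the host alone.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def host_and_port(uri: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (hostname, port) for 'host:port' or 'http://host:port'."""
    if "://" not in uri:
        uri = "http://" + uri  # urlsplit needs a scheme to find the host
    parts = urlsplit(uri)
    return parts.hostname, parts.port


def is_local_host_only(target: str, gatekeeper: str) -> bool:
    """Assumed post-0.9.1 check: only the host name is compared."""
    return host_and_port(target)[0] == host_and_port(gatekeeper)[0]


def is_local_host_and_port(target: str, gatekeeper: str) -> bool:
    """Assumed earlier check: the (host, port) tuple is compared."""
    return host_and_port(target) == host_and_port(gatekeeper)


if __name__ == "__main__":
    here = "my.domain.name:8002"
    # Same host, different port: host-only detection treats it as local.
    print(is_local_host_only("my.domain.name:7002", here))
    print(is_local_host_and_port("my.domain.name:7002", here))
    # Mitigation: distinct subdomains are no longer mistaken for local.
    print(is_local_host_only("my2.domain.name:7002", "my1.domain.name:8002"))
```

With host-only detection, my.domain.name:7002 is taken for the local grid when logged on my.domain.name:8002, while my2.domain.name:7002 seen from my1.domain.name:8002 is not, which is why distinct subdomains work around the problem.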

As to whether it is a regression or an evolution: if it fixes a local/HG confusion in certain network configurations (the two-interface approach of MANTIS 8580), it cannot be labeled a regression. It is a change from the way we used to understand the hypergrid ("All BaseURL belongs to me").

Leaving the issue open for further discussion.

Disposing the test grids since we know all about it now.

(0036898)
Ferd Frederix (reporter)
2020-09-23 15:59

This means Opensim must use a DNS server or two grids on one server will not work any more. That's clearly a regression.

If you try to run two grids on localhost, or 127.0.0.1, or 192.168.*, or 10.* it will not work. Likewise, if you run two grids on any single LAN IP they will not work. And running two grids on the same WAN IP will no longer work.

So we have to add an /etc/hosts hack into the local DNS to do this. And tell people to always use a domain name to run two grids when on the road.

What Wiki page do I edit to do this?
(0036899)
UbitUmarov (administrator)
2020-09-23 16:14
edited on: 2020-09-23 16:21

made some changes. Grid id may be hostname:port again.
it is already on some paths. Possible map works now.
But a lot more to change.

grid id is GatekeeperURI
as added not long ago, there is a comma-separated list, GatekeeperURIAlias
this is for urls that represent the same grid, like after a dns change for some reason.
Note this is only for local grid detection, so regions don't try to do HG teleports to regions on same grid.


HomeURI may have similar, but not as relevant.
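
As a rough illustration of local grid detection with a grid id plus aliases, here is a Python sketch. It is an assumption-laden model, not the actual OpenSim code: the function names and the alias host `old.example.org` are hypothetical, and details such as default ports for entries without a port are not modelled.

```python
from typing import Set


def normalize(uri: str) -> str:
    """Lower-case a gatekeeper URI and drop its scheme and trailing slash."""
    uri = uri.strip().lower()
    if "://" in uri:
        uri = uri.split("://", 1)[1]
    return uri.rstrip("/")


def local_grid_ids(gatekeeper_uri: str, alias_list: str = "") -> Set[str]:
    """Collect GatekeeperURI and each comma-separated alias as grid ids."""
    ids = {normalize(gatekeeper_uri)}
    ids.update(normalize(alias) for alias in alias_list.split(",") if alias.strip())
    return ids


def is_same_grid(target: str, gatekeeper_uri: str, alias_list: str = "") -> bool:
    """True when target names this grid, so no HG teleport should be tried."""
    return normalize(target) in local_grid_ids(gatekeeper_uri, alias_list)


if __name__ == "__main__":
    gatekeeper = "http://grid.pescadoo.net:9800"
    aliases = "old.example.org:9800"  # hypothetical former name of the grid
    print(is_same_grid("grid.pescadoo.net:9800", gatekeeper, aliases))
    print(is_same_grid("old.example.org:9800", gatekeeper, aliases))
    print(is_same_grid("grid.pescadoo.net:9900", gatekeeper, aliases))
```

Because the port is part of the id in this model, grid.pescadoo.net:9900 is not confused with grid.pescadoo.net:9800, while an alias still counts as the same grid.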

(0036900)
JeffKelley (reporter)
2020-09-23 22:58

Creating two standalones with commit 7979028 of 2020-09-24
grid.pescadoo.net:9800
grid.pescadoo.net:9900

Using Firestorm 6.4.5 beta

Teleporting back and forth with the map => ok.
Teleporting back and forth with osTeleportAgent => ok.

Old behaviour seems restored.
Thanks a lot.

> But a lot more to change.

Staying ready for tests.
(0036901)
JeffKelley (reporter)
2020-09-25 10:29
edited on: 2020-09-27 10:27

Still commit 7979028

May be a side effect : Map search returns local and hg regions containing the query string in any part of the URL, i.e. I search for local region 'com(munity)', i get all HG regions containing 'com' in domain name (a lot).

EDIT: This is a change in viewer behaviour and has been observed on 9.1.1 grids.


- Issue History
Date Modified Username Field Change
2020-09-12 12:06 JeffKelley New Issue
2020-09-12 12:47 tampa Note Added: 0036835
2020-09-12 12:58 UbitUmarov Note Added: 0036836
2020-09-12 13:23 JeffKelley Note Added: 0036837
2020-09-12 13:31 tampa Note Added: 0036838
2020-09-12 13:38 UbitUmarov Note Added: 0036839
2020-09-12 14:02 JeffKelley Note Added: 0036844
2020-09-12 14:05 UbitUmarov Note Added: 0036845
2020-09-12 14:14 tampa Note Added: 0036846
2020-09-12 14:16 JeffKelley Note Added: 0036847
2020-09-12 14:21 JeffKelley Note Added: 0036848
2020-09-20 12:38 JeffKelley Note Added: 0036884
2020-09-21 23:58 Ferd Frederix Note Added: 0036888
2020-09-22 00:12 tampa Note Added: 0036890
2020-09-22 01:37 JeffKelley Note Added: 0036891
2020-09-22 01:38 JeffKelley Note Edited: 0036891
2020-09-22 12:21 Ferd Frederix Note Added: 0036893
2020-09-23 01:34 tampa Note Added: 0036896
2020-09-23 01:34 JeffKelley Note Added: 0036897
2020-09-23 01:50 JeffKelley Note Edited: 0036897
2020-09-23 15:59 Ferd Frederix Note Added: 0036898
2020-09-23 16:14 UbitUmarov Note Added: 0036899
2020-09-23 16:21 UbitUmarov Note Edited: 0036899
2020-09-23 22:58 JeffKelley Note Added: 0036900
2020-09-25 10:29 JeffKelley Note Added: 0036901
2020-09-25 11:31 JeffKelley Note Edited: 0036901
2020-09-27 10:27 JeffKelley Note Edited: 0036901

