ID | Project | Category | View Status | Date Submitted | Last Update | |||||
0007392 | opensim | [GRID] Grid Service | public | 2014-12-09 09:06 | 2017-01-05 12:27 | |||||
Reporter | aiaustin | |||||||||
Assigned To | melanie | |||||||||
Priority | normal | Severity | feature | Reproducibility | N/A | |||||
Status | assigned | Resolution | reopened | |||||||
Platform | PC | Operating System | Windows | Operating System Version | 8.1 | |||||
Product Version | master (dev code) | |||||||||
Target Version | Fixed in Version | |||||||||
Summary | 0007392: Suggestion to serve robots.txt from grid/standalone base URL | |||||||||
Description | Search engines and other probes frequently attempt to look up URLs on OpenSim grids using URLs they find on web pages, in online bug reports, etc. These are often URLs containing avatar and asset UUIDs that are not meant for serving external content but are hooks used by the OpenSim system. They fail with red errors in the consoles for Robust.exe and OpenSim.exe, and have been known to hang the system in some cases I have not fully resolved. If robots.txt could be served from the base URL, e.g. http://grid.com:8002/robots.txt, then search engines would use it as a guide that the site is not meant to be indexed. By default, the robots.txt file could serve these three lines: "# go away", "User-agent: *", "Disallow: /" | |||||||||
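As a hedged illustration of how a well-behaved crawler would read the proposed default content, the sketch below feeds it to Python's standard `urllib.robotparser`. The bot name `ExampleBot`, the host `grid.example` and the helper `is_allowed` are made-up placeholders, and Python is used only for illustration (OpenSim itself is C#).

```python
from urllib.robotparser import RobotFileParser

# Default content proposed in this issue.
ROBOTS_TXT = "# go away\nUser-agent: *\nDisallow: /\n"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL of the kind a crawler might pick up from a forum post.
probe = "http://grid.example:8002/agent/00000000-0000-0000-0000-000000000000/"
blocked = not is_allowed(ROBOTS_TXT, "ExampleBot", probe)
```

Note that this only guides crawlers that honour robots.txt; probing software that ignores it will still hit the server.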
Tags | No tags attached. | |||||||||
Git Revision or version number | 0.9.1.0 dev master | |||||||||
Run Mode | Grid (Multiple Regions per Sim) | |||||||||
Physics Engine | ODE | |||||||||
Script Engine | ||||||||||
Environment | .NET / Windows64 | |||||||||
Mono Version | None | |||||||||
Viewer | N/A | |||||||||
(0027087) aiaustin (developer) 2014-12-11 06:43 |
Typical messages from firewall probing software that show as red errors are:

Robust.exe Console
14:33:18 - [BASE HTTP SERVER]: Handler not found for http request POST /av-centerd
14:33:20 - [BASE HTTP SERVER]: Handler not found for http request POST /
14:34:44 - [BASE HTTP SERVER]: Handler not found for http request SEARCH /
14:34:59 - [BASE HTTP SERVER]: Handler not found for http request GET /flex2gateway/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /messagebroker/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /blazeds/messagebroker/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /lcds/messagebroker/http
14:35:55 - [BASE HTTP SERVER]: Handler not found for http request POST /
-------------------------------------
OpenSim.exe Console
14:29:54 - [AGENT HANDLER]: Invalid parameters for agent message /Agent/
14:33:18 - [BASE HTTP SERVER]: Handler not found for http request POST /av-centerd
14:33:18 - [BASE HTTP SERVER]: Handler not found for http request POST /av-centerd
14:33:20 - [BASE HTTP SERVER]: Handler not found for http request POST /
14:33:20 - [BASE HTTP SERVER]: Handler not found for http request POST /
14:34:34 - [BASE HTTP SERVER]: HttpServer.HttpListener had an exception: An existing connection was forcibly closed by the remote host
System.Net.Sockets.SocketException (0x80004005): An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.EndAccept(Byte[]& buffer, Int32& bytesTransferred, IAsyncResult asyncResult)
   at System.Net.Sockets.Socket.EndAccept(IAsyncResult asyncResult)
   at System.Net.Sockets.TcpListener.EndAcceptSocket(IAsyncResult asyncResult)
   at HttpServer.HttpListenerBase.OnAccept(IAsyncResult ar)
14:34:44 - [BASE HTTP SERVER]: Handler not found for http request SEARCH /
14:34:45 - [BASE HTTP SERVER]: Handler not found for http request SEARCH /
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /flex2gateway/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /messagebroker/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /blazeds/messagebroker/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /lcds/messagebroker/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /flex2gateway/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /messagebroker/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /blazeds/messagebroker/http
14:35:00 - [BASE HTTP SERVER]: Handler not found for http request GET /lcds/messagebroker/http |
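To get a feel for how much of a console log is made up of such probes, the messages can be tallied by method and path. This is a rough sketch only: the regular expression is inferred from the sample lines quoted in this note, and the names `UNHANDLED` and `count_unhandled` are hypothetical, not part of OpenSim.

```python
import re
from collections import Counter
from typing import Iterable

# Matches lines such as:
# 14:33:18 - [BASE HTTP SERVER]: Handler not found for http request POST /av-centerd
UNHANDLED = re.compile(r"Handler not found for http request (\S+) (\S+)")


def count_unhandled(lines: Iterable[str]) -> Counter:
    """Count 'Handler not found' messages, keyed by (method, path)."""
    counts: Counter = Counter()
    for line in lines:
        match = UNHANDLED.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Calling `count_unhandled(open("OpenSim.log"))` would tally a whole log file, assuming the log uses the same message format as the console.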
(0027092) aiaustin (developer) 2014-12-12 08:52 edited on: 2014-12-12 08:53 |
Here is a typical red error message when a probe comes in from a search engine, I think. In the past, on checking the IP, it has been a Google search bot. This one reports as a non-existent domain though.
10:44:22 - [AGENT HANDLER]: method GET not supported in agent message /agent/e24a9015-f5ca-452b-8c95-d32e34cb9d64/, (caller is 220.181.51.102)
The /agent/UUID URL is one found, for example, in OpenSim discussion forum message lists. |
(0031501) aiaustin (developer) 2016-12-29 12:01 edited on: 2016-12-29 12:05 |
I just want to note again that I think it would be useful to be able to serve http://<domainname>:<port>/robots.txt from the core setup for OpenSim for standalones and grids, and to allow robots.txt to be provided, perhaps in the way the default http_404.html and http_500.html files can be. A sample robots.txt that could be included in the bin directory by default is attached. |
(0031502) Mandarinka Tasty (reporter) 2016-12-29 13:26 edited on: 2016-12-29 13:27 |
Hello :) I have prepared a solution for you. It works for me, though I rather doubt that core needs it. Please treat it as my personal solution for you. I wrote it as a kind of exercise. It can probably be done in a more elegant and proper way. |
(0031503) melanie (administrator) 2016-12-29 13:57 |
Can we please not have a file for this but use string robots = "# go away\nUser-agent: *\nDisallow: /\n"; in code? |
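A minimal sketch of the fixed-string approach suggested above, written in Python purely for illustration. The actual change belongs in OpenSim's C# HTTP server; `robots_response` and `RobotsHandler` are hypothetical names and are not taken from any of the attached patches.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional, Tuple

# Fixed content from the note above; no file on disk is involved.
ROBOTS_TXT = "# go away\nUser-agent: *\nDisallow: /\n"


def robots_response(method: str, path: str) -> Optional[Tuple[int, str, bytes]]:
    """Return (status, content type, body) for GET /robots.txt, else None."""
    if method == "GET" and path == "/robots.txt":
        return 200, "text/plain", ROBOTS_TXT.encode("ascii")
    return None


class RobotsHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers robots.txt and returns 404 otherwise."""

    def do_GET(self) -> None:
        reply = robots_response("GET", self.path)
        if reply is None:
            self.send_error(404)
            return
        status, content_type, body = reply
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, `HTTPServer(("127.0.0.1", 8002), RobotsHandler).serve_forever()` from `http.server` should be enough; port 8002 simply mirrors the example in the description.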
(0031504) Mandarinka Tasty (reporter) 2016-12-29 13:59 |
Sure, we can :) One moment, I'll correct the patch. |
(0031506) Mandarinka Tasty (reporter) 2016-12-29 14:24 |
I've replaced the patch with an appropriate version. |
(0031508) aiaustin (developer) 2016-12-30 02:29 edited on: 2017-01-04 14:09 |
Thanks for incorporating the capability, which will be useful. I note that the changes made allow the REGION server (on the base port used by the OpenSim.exe instance) to serve robots.txt, and that does catch some of the incoming probes and web crawlers. But the Grid/Robust service for the main advertised GridURL does not yet serve robots.txt. So, for example, with this change in place on my test AiLand (ROBUST) grid:
http://ai.vue.ed.ac.uk:8002/robots.txt does not work yet
http://ai.vue.ed.ac.uk:9000/robots.txt does work (where 9000 is the OpenSim.ini http_listener_port)
I assume a similar change to that made already for OpenSim.exe needs to be incorporated in the relevant file(s) for Robust.exe? |
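The kind of check described above (one base URL serves robots.txt, the other does not) can be scripted. The following is a sketch under assumptions: `fetch_text` and `serves_robots` are hypothetical helper names, and a response only counts as a robots.txt if it contains a `Disallow:` line, since a server might answer an unknown path with an error page.

```python
from typing import Callable, Optional
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 5.0) -> Optional[str]:
    """Return the response body as text, or None if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # urllib's URLError and HTTPError are subclasses of OSError.
        return None


def serves_robots(base_url: str,
                  fetch: Callable[[str], Optional[str]] = fetch_text) -> bool:
    """Return True if base_url + /robots.txt yields something robots-like."""
    body = fetch(base_url.rstrip("/") + "/robots.txt")
    return body is not None and "disallow:" in body.lower()
```

For example, `serves_robots("http://ai.vue.ed.ac.uk:9000")` would query the region server named in this note. The `fetch` parameter exists so the check can be exercised without network access.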
(0031509) djphil (reporter) 2016-12-30 03:18 |
Although it does not have any protective power, we might take this opportunity to add the humans.txt file too ... Doc available here: http://humanstxt.org/ |
(0031512) aiaustin (developer) 2016-12-30 04:36 edited on: 2016-12-30 04:39 |
@djphil, not with the mechanism used, as that has the contents of the robots.txt file built into the core code as a fixed string, and it's just to let web crawlers etc. know not to try to index below the root. The humans.txt mechanism you identified definitely needs to have an actual file and be changeable by the grid provider. The main aim of providing robots.txt is to indicate that this is not normal web content and to head off the red errors that show on the console for probes and web crawlers using links they find through things like blogs and Mantis entries as a root for their searches. |
(0031513) melanie (administrator) 2016-12-30 05:13 |
Hard -1 on humans.txt. Most grids don't want to publish names and there are also anonymous and pseudonymous developers. OpenSim grid URLs are not websites. |
(0031527) aiaustin (developer) 2017-01-04 14:07 edited on: 2017-01-04 14:15 |
@Mandarinka... I wonder if a similar change to the one you prepared and @Melanie incorporated for the OpenSim.exe base http server can also be incorporated in a suitable place for Robust.exe? See my earlier comment on 2016-12-30 10:29. |
(0031528) Mandarinka Tasty (reporter) 2017-01-04 15:36 |
Hello Ai :) I have written a first idea for the patch. Please let me know whether it works for you. The patch has been attached: 0001-GridRobotsHandler.patch |
(0031529) Mandarinka Tasty (reporter) 2017-01-04 15:38 |
It uses GridInfoServer, which runs on port 8002 by default. I wrote it intuitively, so please describe any problems, and if the need arises I can work on it more. |
(0031530) aiaustin (developer) 2017-01-05 01:37 edited on: 2017-01-05 01:42 |
That looks good, Mandarinka. I have installed the patch on the AiLand grid into the latest 0.9.1 dev master, and both the Robust/grid level and the OpenSim.exe/Region level now serve robots.txt correctly:
http://ai.vue.ed.ac.uk:8002/robots.txt
http://ai.vue.ed.ac.uk:9000/robots.txt
This grid also uses @Diva Wifi, which I wanted to check as that serves the root Grid URL in recent versions as well as its original WiFi home page:
http://ai.vue.ed.ac.uk:8002/
http://ai.vue.ed.ac.uk:8002/wifi
It would be useful if Melanie could check the method used to see that it looks okay and then have it added into dev master so we can completely close this issue. Thanks again. |
(0031531) melanie (administrator) 2017-01-05 05:38 |
The patch isn't suitable for core. It doesn't take into account that grids usually run more than one ROBUST and also may choose to provide GridInfo without using the GridInfoService. So that service may not be running at all or may be running on only one of a number of ROBUSTs. The serving of robots.txt needs to be done in the ServicesServerBase or a similarly low-level place. |
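The design point above can be sketched as follows: if robots.txt is answered in a shared low-level base, every server instance serves it regardless of which services that instance hosts. This is a simplified Python illustration only. `ServerBase`, its methods and the `/example_grid_info` path are hypothetical and do not reflect how ServicesServerBase is actually structured.

```python
from typing import Callable, Dict, Tuple

ROBOTS_TXT = "# go away\nUser-agent: *\nDisallow: /\n"


class ServerBase:
    """Hypothetical low-level base shared by every server instance."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[str, str], Callable[[], str]] = {}

    def add_handler(self, method: str, path: str,
                    handler: Callable[[], str]) -> None:
        """Register a service-specific handler for (method, path)."""
        self._handlers[(method, path)] = handler

    def handle(self, method: str, path: str) -> Tuple[int, str]:
        # Answered before any service-specific lookup, so it works on every
        # instance, whether or not a grid info service is loaded there.
        if method == "GET" and path == "/robots.txt":
            return 200, ROBOTS_TXT
        handler = self._handlers.get((method, path))
        if handler is None:
            return 404, "Not Found"
        return 200, handler()


# Two instances hosting different (made-up) services.
grid_info_instance = ServerBase()
grid_info_instance.add_handler("GET", "/example_grid_info", lambda: "grid info")
assets_instance = ServerBase()  # hosts no grid info service at all
```

Both instances answer `GET /robots.txt`, while only the first knows about the grid info path; tying robots.txt to one particular service would leave the second instance uncovered.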
(0031532) Mandarinka Tasty (reporter) 2017-01-05 09:57 |
Hello :) @Ai I'm happy that it works for you. Feel free to use it. @Melanie I agree that my method is not universal, especially if grid info is served in some other way than in a typical OpenSim grid configuration. I will try to prepare it in another way. Thank you for the suggestion about using a low-level service. |
(0031535) aiaustin (developer) 2017-01-05 12:27 edited on: 2017-01-06 00:46 |
Great.. thanks to Melanie for the advice. @Mandarinka's earlier patch worked for me as I run a single Robust.exe for all services in my setups for Openvue and AiLand grids. |
Date Modified | Username | Field | Change |
2014-12-09 09:06 | aiaustin | New Issue | |
2014-12-09 09:06 | aiaustin | Summary | Suggestion to serve robots.txt from grid/standalonw base URL => Suggestion to serve robots.txt from grid/standalone base URL |
2014-12-11 06:43 | aiaustin | Note Added: 0027087 | |
2014-12-12 08:52 | aiaustin | Note Added: 0027092 | |
2014-12-12 08:53 | aiaustin | Note Edited: 0027092 | View Revisions |
2016-12-29 12:01 | aiaustin | Note Added: 0031501 | |
2016-12-29 12:01 | aiaustin | File Added: robots.txt | |
2016-12-29 12:04 | aiaustin | Git Revision or version number | r/25604 => 0.9.1.0 dev master |
2016-12-29 12:05 | aiaustin | Note Edited: 0031501 | View Revisions |
2016-12-29 13:24 | Mandarinka Tasty | File Added: 0001-Serving-robots.txt-from-bin.patch | |
2016-12-29 13:26 | Mandarinka Tasty | Note Added: 0031502 | |
2016-12-29 13:26 | Mandarinka Tasty | Status | new => patch included |
2016-12-29 13:27 | Mandarinka Tasty | Note Edited: 0031502 | View Revisions |
2016-12-29 13:57 | melanie | Note Added: 0031503 | |
2016-12-29 13:59 | Mandarinka Tasty | Note Added: 0031504 | |
2016-12-29 14:23 | Mandarinka Tasty | File Deleted: 0001-Serving-robots.txt-from-bin.patch | |
2016-12-29 14:23 | Mandarinka Tasty | File Added: 0001-Serving-robots.txt-from-bin.patch | |
2016-12-29 14:24 | Mandarinka Tasty | Note Added: 0031506 | |
2016-12-29 15:18 | melanie | Status | patch included => resolved |
2016-12-29 15:18 | melanie | Resolution | open => fixed |
2016-12-29 15:18 | melanie | Assigned To | => melanie |
2016-12-30 02:29 | aiaustin | Note Added: 0031508 | |
2016-12-30 02:29 | aiaustin | Status | resolved => feedback |
2016-12-30 02:29 | aiaustin | Resolution | fixed => reopened |
2016-12-30 02:31 | aiaustin | Note Edited: 0031508 | View Revisions |
2016-12-30 03:18 | djphil | Note Added: 0031509 | |
2016-12-30 04:36 | aiaustin | Note Added: 0031512 | |
2016-12-30 04:36 | aiaustin | Status | feedback => assigned |
2016-12-30 04:37 | aiaustin | Note Edited: 0031512 | View Revisions |
2016-12-30 04:37 | aiaustin | Note Edited: 0031512 | View Revisions |
2016-12-30 04:39 | aiaustin | Note Edited: 0031512 | View Revisions |
2016-12-30 05:13 | melanie | Note Added: 0031513 | |
2016-12-30 08:12 | aiaustin | Note Edited: 0031508 | View Revisions |
2017-01-04 14:07 | aiaustin | Note Added: 0031527 | |
2017-01-04 14:08 | aiaustin | Note Edited: 0031527 | View Revisions |
2017-01-04 14:09 | aiaustin | Note Edited: 0031508 | View Revisions |
2017-01-04 14:15 | aiaustin | Note Edited: 0031527 | View Revisions |
2017-01-04 14:15 | aiaustin | Note Edited: 0031527 | View Revisions |
2017-01-04 15:35 | Mandarinka Tasty | File Added: 0001-GridRobotsHandler.patch | |
2017-01-04 15:36 | Mandarinka Tasty | Note Added: 0031528 | |
2017-01-04 15:38 | Mandarinka Tasty | Note Added: 0031529 | |
2017-01-05 01:37 | aiaustin | Note Added: 0031530 | |
2017-01-05 01:39 | aiaustin | Note Edited: 0031530 | View Revisions |
2017-01-05 01:39 | aiaustin | Note Edited: 0031530 | View Revisions |
2017-01-05 01:41 | aiaustin | Note Edited: 0031530 | View Revisions |
2017-01-05 01:42 | aiaustin | Note Edited: 0031530 | View Revisions |
2017-01-05 01:42 | aiaustin | Note Edited: 0031530 | View Revisions |
2017-01-05 05:38 | melanie | Note Added: 0031531 | |
2017-01-05 09:57 | Mandarinka Tasty | Note Added: 0031532 | |
2017-01-05 12:27 | aiaustin | Note Added: 0031535 | |
2017-01-05 12:28 | aiaustin | Note Edited: 0031535 | View Revisions |
2017-01-06 00:46 | aiaustin | Note Edited: 0031535 | View Revisions |