[Opensim-dev] Modifying the networking stack (UNCLASSIFIED)
Justin Clark-Casey
jjustincc at googlemail.com
Wed Nov 19 00:03:30 UTC 2014
Currently, pCampbot turns off the libomv asset cache and has its own texture/mesh cache. Its behaviour is currently very
similar to what you describe: there's only one cache for all bots, although we merely flag our receipt of the
textures/meshes and don't keep the actual data. Of course, this behaviour can change in the future.
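
As a rough illustration of that shared, receipt-only cache, here is a minimal Python sketch. It is not pCampbot's actual code (which is C#); the class and function names are hypothetical and chosen only for this example.

```python
import threading


class ReceiptOnlyAssetCache:
    """One cache shared by every bot; it remembers asset IDs, never asset data.

    Hypothetical illustration: these names are assumptions, not pCampbot or
    libomv APIs.
    """

    def __init__(self):
        self._received = set()
        self._lock = threading.Lock()  # bots may run on different threads

    def needs_fetch(self, asset_id):
        """Return True if no bot has flagged this asset as received yet."""
        with self._lock:
            return asset_id not in self._received

    def mark_received(self, asset_id, data):
        """Flag the asset as received and discard the payload.

        Returns True if this was the first receipt of the asset.
        """
        del data  # the actual texture/mesh bytes are deliberately not kept
        with self._lock:
            if asset_id in self._received:
                return False
            self._received.add(asset_id)
            return True


def bot_sees_asset(cache, asset_id, fetch):
    """Fetch an asset only if no bot sharing the cache has received it."""
    if not cache.needs_fetch(asset_id):
        return False
    return cache.mark_received(asset_id, fetch(asset_id))
```

With one such cache shared by, say, 50 bots, each texture or mesh is requested once rather than 50 times, which is why asset traffic from bots ends up much lower than from the same number of real viewers.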
The default agent update rate in libomv is currently 2 per second (Settings.DEFAULT_AGENT_UPDATE_INTERVAL). From
memory, I thought I observed Singularity sending about 1 per second but I could be wrong (Diva may know what it really is
now). That was only a couple of observations, so it may change under different conditions or with different viewers.
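
For a rough sense of scale, the arithmetic can be sketched in Python. The per-client rates are the ones mentioned in this thread (2/second for libomv, 10/second and up to 60/second for viewers); the function names and the 50-avatar figure are only illustrative assumptions.

```python
def interval_ms_for_rate(rate_per_second):
    """Send interval in milliseconds for a given per-client update rate."""
    return 1000.0 / rate_per_second


def agent_updates_per_second(avatars, interval_ms):
    """Total inbound AgentUpdate packets/second the region has to handle."""
    return avatars * 1000.0 / interval_ms


# Per-client rates quoted in the thread (packets per second).
RATES = {"libomv bot": 2, "viewer (typical)": 10, "viewer (peak)": 60}

# Aggregate load for a hypothetical region with 50 connected clients.
LOAD_AT_50 = {
    name: agent_updates_per_second(50, interval_ms_for_rate(rate))
    for name, rate in RATES.items()
}
```

Under these assumptions, 50 bots at 2/second produce about 100 AgentUpdate packets per second, whereas 50 viewers at 10/second produce about 500, so a bot run at the libomv rate understates this part of the UDP load by roughly a factor of five.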
On 18/11/14 22:35, Dahlia Trimble wrote:
> One issue with libomv bots (and I'm not sure if this applies to pCampbot or not) is that running multiple bots from the
> same installation of libomv results in them all sharing the same asset cache so asset fetches by such clients will be
> much lower than normal viewers, perhaps even by an order of magnitude or more.
>
> Another issue is that libomv AgentUpdate runs at a fixed rate of 10/second but many viewers send at a rate which is
> roughly the same as frame draw rate. Bots in general don't really need a high AgentUpdate rate and 10/second is probably
> a good choice. While I agree with Diva that high rates in general are undesirable due to the extra work they impose on
> the region UDP stack, I question how much they can be reduced, or even made "smart" without degrading user experience
> while interacting in a region over a slow or lossy network connection. Some user applications such as games or
> interactive simulations may require fast response to controls and could suffer if such needs are not considered while
> engineering a networking system.
>
> On Tue, Nov 18, 2014 at 1:51 PM, Justin Clark-Casey <jjustincc at googlemail.com> wrote:
>
> These are all fixable issues, either with pCampbot improvements or distributing pCampbot instances amongst more
> machines. I expect pCampbot will be built upon to address these points as required. And this year I successfully
> used 4 Amazon c2 large instances for bot running so a more realistic network load means spinning up more cloud
> instances.
>
> I agree that unless you can reproduce an issue you are shooting in the dark with any changes. And organizing enough
> real people to reproduce issues on a regular basis and without a huge amount of confusing other behaviour is
> impossible in practice.
>
>
> On 14/11/14 16:46, Maxwell, Douglas CIV USARMY ARL (US) wrote:
>
> Classification: UNCLASSIFIED
> Caveats: NONE
>
> Dr. Lopez, thank you for sharing your paper. Can you tell me where it was
> peer reviewed and published? I would like to reference it in my
> dissertation.
>
> On the topic of bots, the MOSES team has not been able to compose an NPC
> agent or bot that accurately replicates the footprint of a human agent on the
> simulator. We believe this is for many reasons:
>
> 1) Bots are usually composed on a server on the same network, not dispersed
> across the internet. The bots should be software throttled and noise
> introduced into their connections to approximate random access.
>
> 2) Bots aren't using full clients, so they are not filling caches and
> making the same scene requests as humans in graphical clients.
>
> 3) Bots are usually homogenous. They need to be randomly dressed, have
> random attachments, and have random inventories.
>
> 4) Bots need to move randomly and collide with objects in the scene and
> with each other.
>
> 5) Bots need to randomly chat with each other and broadcast locally.
>
> We think we can create an NPC solution that addresses these issues. It will
> take some thought and development. Has anyone come close to this?
>
> Goal: Compose bots/NPCs that can approximate the loads of humans within 90%
> certainty. Meaning if we load 100 of these artificial agents into the
> MOSES, we are certain that it will accurately behave as if at least 90
> humans are logged in.
>
> IMHO, if you can't assign a reliability to a test, then you are just wasting
> your time. These are basic V&V tenets.
>
> v/r -douglas
>
> Douglas Maxwell, MSME
> Science and Technology Manager
> Virtual World Strategic Applications
> U.S. Army Research Lab
> Simulation & Training Technology Center (STTC)
> (c) (407) 242-0209
>
>
>
> -----Original Message-----
> From: opensim-dev-bounces at opensimulator.org
> [mailto:opensim-dev-bounces at opensimulator.org] On Behalf Of
> Diva Canto
> Sent: Friday, November 14, 2014 11:05 AM
> To: opensim-dev at opensimulator.org
> Subject: Re: [Opensim-dev] Modifying the networking stack
>
> On 11/14/2014 6:23 AM, Michael Heilmann wrote:
>
> Thanks for the responses. I'll go into a little more detail:
>
> We have been running several profilers against OpenSimulator on the
> MOSES grid, and on my development machine. The tests were to examine
> the loading on the server under several different loads, specifically
> mesh and physics loads. What we found appears to be that no matter
> what kind of load we placed on the region, even to the point of
> becoming unresponsive due to physics and mesh, that scripting and
> physics load were nowhere near the amount of time spent in
> OpenSim.Region.ClientStack.LindenUDP once we had more than one or two
> avatars logged in. We know from previous investigations at our
> firewall that network traffic for OpenSim is not that heavy,
> especially with low numbers of users.
>
>
> If this is a problem, and you are running a recent-ish version of core
> OpenSim, it sounds like some misconfiguration somewhere. Back in the summer
> of 2013 we had a problem with the server running OSCC'13; the kernel was
> configured to run in some sort of special mode that was making everything
> run badly and unpredictably. We fixed the kernel configuration, and suddenly
> things started running much more smoothly-- I don't remember the details,
> but Nebadon may clarify things.
>
> OpenSim these days can handle 50 people on a single simulator without much
> trouble. If you look at figure 7 of my paper
> (http://www.ics.uci.edu/~lopes/documents/summersim14/gabrielova_lopes_preprint.pdf)
> you will see the quantification of "without much trouble." I suggest that
> you reproduce my experimental conditions with pCampbot and check whether your
> numbers are very different from ours. If they are very different, then
> there's definitely something odd in your setup, as we were able to reproduce
> these numbers in several machines. Feel free to contact me directly for
> details about pCampbot configuration.
>
> Bots aren't real viewers, but they are much better for measuring things
> systematically and detecting problems and bottlenecks than relying on real
> users driven by real people. The performance you get with pCampbot will be
> correlated with the performance you get with real users.
>
>
> I ran several Wireshark captures against a Firestorm viewer logging
> into the MOSES public grid ABWIS region, where we hold our office
> hours. I saw that with our current configuration, all traffic between
> the server and my client, with the exception of HTTP CAPS and fsapi
> calls, was UDP traffic. This is not immediately concerning, as we
> have simian serve our mesh and textures directly. The messages are
> mostly binary information, so I could not examine closely, but I did
> see a lot of messages containing identical ASCII strings, such as the
> name of my avatar.
>
>
> Hard to say what you saw, but I bet those are the AgentUpdate messages that
> I mentioned before. The viewer sends at least 10/sec. At points, the viewer
> sends much more than 10/sec, up to 60/sec. Again, take a look at my paper
> for understanding what those are, and how OpenSim deals with them since
> OSCC'13.
>
> As I said before, it would be nice to understand why the viewer is so eager
> to blabber its status to the server when nothing is going on.
>
>
> My primary concern is the amount of time spent handling networking,
> not necessarily the networking itself. But there is at least a
> portion of messages on the UDP pipeline that are either reliable, or
> perhaps should be; and re-implementing a reliable transport over UDP
> introduces load at the application layer, instead of letting a
> low-level reliable transport such as TCP handle it. I went to
> university with a guy who implemented a Java networking library
> completely over UDP, believing that it was faster than a normal TCP
> socket; but he was neglecting that the networking hardware handles the
> ACK and retransmission transparently, and without needing for the
> messages to be handled manually by the application.
>
> This may just be my opinion, but since I was going to be examining the
> network stack anyway, and since a client-server scenario typically
> benefits from a persistent reliable connection over which the server
> can push important events to the client, I thought it would be a good
> idea. The points about network throttling and QoS are taken, but
> wouldn't they also typically affect the UDP stream? Working on MOSES I
> have plenty of problems dealing with external users who operate on
> restricted networks, and they cannot see traffic aside from 80 and 443
> without dealing with their own IT personnel. The fact that it is HTTP
> over TCP instead of raw TCP makes no difference once it is on a
> non-standard HTTP port.
>
> I agree that it would be more prudent to look at improving the
> websocket code and the http server, rather than replace it with a raw
> TCP socket, especially given that there are multiple plugins, such as
> jsonsimstats, that use the http functionality directly.
>
> I hope that explains my position a little better. I would love to
> hear if there are other plans/ideas in the community to address
> time-sinks like this one; networking simply appears to us as a good
> starting point to increase performance and scalability of the system.
>
>
> _______________________________________________
> Opensim-dev mailing list
> Opensim-dev at opensimulator.org
> http://opensimulator.org/cgi-bin/mailman/listinfo/opensim-dev
>
> --
> Justin Clark-Casey (justincc)
> OSVW Consulting
> http://justincc.org
> http://twitter.com/justincc
>
--
Justin Clark-Casey (justincc)
OSVW Consulting
http://justincc.org
http://twitter.com/justincc