User:JeffKelley/Automating grid management

From OpenSimulator
 

Latest revision as of 11:28, 18 August 2015

THIS IS A DRAFT


Here is one of many possible ways to manage a Linux grid from a single bin folder.


OpenSim configuration

First, we parameterize the configuration files with shell environment variables. A single variable, SIMULATOR, is enough. Once it is listed in the [Environment] section, the expression ${Environment|SIMULATOR} can be used in various parts of OpenSim.ini and GridCommon.ini.

[Environment]
    SIMULATOR=""
 
[Startup]
    ConsolePrompt = "Simulator ${Environment|SIMULATOR} (\R) "
    regionload_regionsdir = "path_to_regions/simul${Environment|SIMULATOR}"
    PIDFile = "/tmp/grid/simu${Environment|SIMULATOR}.pid"
 
[Network]
    http_listener_port = 900${Environment|SIMULATOR}

With this configuration, simulator N listens on port 900N. (Note: this assumes a one-digit simulator number. If you need more than 10 simulators, you may use 90${Environment|SIMULATOR} and pass a two-digit SIMULATOR value.) The regions file for simulator N is located in path_to_regions/simulN, so you should have this structure on disk:

path_to_regions
   simul1
       Regions.ini
   simul2
       Regions.ini
   ......
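
As a rough illustration of the layout above, the following shell sketch creates the regions folder for each simulator number and prints the values that the configuration would derive from SIMULATOR. The function name scaffold_simulators is hypothetical and not part of OpenSim; the port, folder and PID file patterns are copied from the configuration shown earlier.

```shell
# Hypothetical helper (not part of OpenSim): create path_to_regions/simulN
# for every simulator number given, and print the settings derived from N.
# Usage: scaffold_simulators <regions_root> <N> [<N> ...]
scaffold_simulators() {
    root=$1
    shift
    for n in "$@"; do
        mkdir -p "$root/simul$n" || return 1
        printf 'simul%s: port=900%s regions=%s/simul%s pid=/tmp/grid/simu%s.pid\n' \
            "$n" "$n" "$root" "$n" "$n"
    done
}
```

For example, scaffold_simulators path_to_regions 1 2 creates the simul1 and simul2 folders. Each one still needs its own Regions.ini.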

If you use a separate database for each simulator (this is a design choice), you can parameterize the ConnectionString in GridCommon.ini this way:

[DatabaseService]
    ConnectionString = "Data Source=localhost;Database=sim${Environment|SIMULATOR};User ID=xxx;Password=xxx;"


Using GNU screen

At the heart of the management system is the launch script. It labels each screen session with the -S option, so that sessions can later be found by name.

#!/usr/bin/perl

# Start a grid process inside a named screen session
# Usage: launch [-d] task_name   (robust or simulN)

use strict;
use warnings;
use Getopt::Std;
use POSIX qw(strftime);

my %OPTS;
getopts('d', \%OPTS);
my $dflag = $OPTS{d} ? '-d' : '';   # -d: start the session detached

my $proc = $ARGV[0];
die "Usage: launch [-d] task_name\n" unless defined $proc;

my $bin = '/home/me/grid';   # Where the bin folder is
my ($exe, $arg);

if ($proc =~ m/simul(.*)/) {
	($exe, $arg) = ('OpenSim.exe', 'OpenSim.ini');
	$ENV{SIMULATOR} = $1;   # Simulator number, read by the ini files
}
elsif ($proc =~ m/robust/) {
	($exe, $arg) = ('Robust.exe', 'Robust.HG.ini');
}

die "Invalid process\n" unless defined $exe;

my $now = strftime "%F %H:%M:%S", localtime;
print "$now Launching $proc\n";

chdir $bin or die "Cannot chdir $bin: $!\n";

$ENV{PATH} .= ':/usr/local/bin';   # mono is here
$ENV{LANG} = 'C';                  # For the decimal point

my $mono = "mono $exe -inifile=$arg";
my $cmd = "screen -S $proc -t $proc $dflag -m $mono";
print "$cmd\n";
exec $cmd;

Launching the grid is a matter of issuing this sequence of commands:

launch robust
launch simul1
launch simul2
.............

Then:

$ screen -ls
There are screens on:
	26039.simul6	(Multi, attached)
	24972.simul4	(Multi, attached)
	24676.simul1	(Multi, attached)
	24652.robust	(Multi, attached)
	24750.simul2	(Multi, attached)
	29220.simul3	(Multi, attached)
	25481.simul5	(Multi, attached)
8 Sockets in /var/run/screen/S-jyb.

This output is not very friendly, so we write the procs script:

#!/usr/bin/perl

# List running grid processes (screen session names)
# With -d, list only detached sessions

use strict;
use warnings;
use Getopt::Std;

my %OPTS;
getopts('d', \%OPTS);
my $detached = $OPTS{d};

my @screens = split "\n", `screen -ls`;
my @wanted;

for (@screens) {
	# Session lines look like "<pid>.<name> ... (<attributes>)"
	next unless /(\S*)\.(\S*).*\((.*)\)/;
	my ($pid, $name, $attr) = ($1, $2, $3);
	my $is_detached = ($attr =~ /detached/i);
	push @wanted, $name unless $detached && !$is_detached;
}

print "$_\n" for sort @wanted;

Then:

$ procs
robust
simul1
simul2
simul3
simul4
simul5
simul6
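
To make the filtering done by procs easier to follow, here is a minimal shell sketch of the same idea. It reads screen -ls output on standard input, so it can be tried on saved output without a running grid. The function name list_sessions is hypothetical, and the matching is slightly looser than in the Perl script: any line whose first field looks like pid.name is treated as a session.

```shell
# Hypothetical helper: read `screen -ls` output on stdin and print the
# session names, sorted. Pass -d to keep only detached sessions.
list_sessions() {
    awk -v only="${1:-}" '
        $1 ~ /^[0-9]+\./ {
            name = $1
            sub(/^[0-9]+\./, "", name)      # drop the leading "<pid>."
            if (only == "-d" && tolower($0) !~ /detached/) next
            print name
        }' | sort
}
```

Typical use would be screen -ls | list_sessions, or screen -ls | list_sessions -d to see only the sessions that can be reattached.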

Automating

Create a file named proclist:

# processes to launch
 
robust
simul1
simul2
simul3
simul4
simul5
simul6


It is now easy to write a pair of scripts: one launches the next wanted process, the other reattaches the next detached process.


Script s:

#!/usr/bin/perl

# Launch first missing process from proclist

use strict;
use warnings;

chomp(my @running = `procs`);   # Names of running screens

# Start detached when not called from a terminal (e.g. from cron)

my $interactive = -t STDIN && -t STDOUT;
my $dflag = $interactive ? '' : '-d';
my $proclist = "$ENV{HOME}/proclist";

# Get wanted processes list

open my $in, '<', $proclist or die "Cannot open $proclist: $!\n";
chomp(my @procs = <$in>);
close $in;

# Find first process that is not running

for my $proc (@procs) {
	next if $proc eq '';
	next if $proc =~ /^#/;

	# Exact match, so that simul1 is not mistaken for simul10
	my $running = grep { $_ eq $proc } @running;
	exec "bin/launch $dflag $proc" unless $running;
}

print "Nothing to start\n" if $interactive;
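
The selection logic of s can be exercised without launching anything. The sketch below is a shell rendition of that logic only, under the assumption that the list of running sessions has been saved to a file, one name per line. The function name first_missing is hypothetical.

```shell
# Hypothetical helper: print the first entry of a proclist file that does
# not appear in a file of running session names (one name per line).
# Blank lines and lines starting with # are skipped, as in the s script.
# Usage: first_missing <proclist_file> <running_file>
first_missing() {
    proclist=$1
    running=$2
    grep -v -e '^[[:space:]]*$' -e '^#' "$proclist" |
    while IFS= read -r proc; do
        # -x matches whole lines only, -F disables regex interpretation
        if ! grep -qxF -- "$proc" "$running"; then
            printf '%s\n' "$proc"
            break
        fi
    done
}
```

Whole-line matching matters once simulator numbers reach two digits: a substring match would consider simul1 to be running whenever simul10 is.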


Script r:

#!/usr/bin/perl

# Reattach first detached screen process

use strict;
use warnings;

chomp(my @detached = `procs -d`);   # Names of detached screens
die "Nothing to reattach\n" unless @detached;

my $process = shift @detached;

print "Reattaching $process\n";
sleep 1;
exec "screen -r $process";

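Starting your grid is now as simple as opening as many terminals as you have grid processes and typing s in each. This can be further automated with a cron job that calls s every minute. Each run launches the first process from proclist that is missing, so the whole grid comes up within a few minutes. Because cron has no terminal, s passes -d and the process starts detached; use r to reattach it. As an added benefit, crashed simulators are relaunched.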
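
A cron entry for this could look like the sketch below. It rests on assumptions that are not stated above: that launch, procs, s and r all live in ~/bin, and that cron starts jobs in the user's home directory with HOME set, as common cron implementations do. The function name grid_cron_line is hypothetical; it only prints the crontab line.

```shell
# Hypothetical helper: print a crontab line that runs the s script every
# minute. Assumption: launch, procs, s and r are all installed in ~/bin.
# $HOME and $PATH are left unexpanded on purpose; cron's shell expands them.
grid_cron_line() {
    printf '%s\n' '* * * * * PATH=$HOME/bin:$PATH $HOME/bin/s'
}
```

One way to install it is ( crontab -l 2>/dev/null; grid_cron_line ) | crontab - which appends the line to the current crontab. The PATH prefix is there because s calls procs by name, and cron usually provides only a minimal PATH.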

Adding a simulator

Add a line to proclist:

# processes to launch
 
robust
simul1
simul2
simul3
simul4
simul5
simul6
simul7

Create a folder path_to_regions/simul7 and put the Regions.ini inside.

That's all.
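
The two steps above can be scripted. This is a minimal sketch, assuming the proclist file ends with a newline; the function name add_simulator is hypothetical, and the Regions.ini file still has to be written by hand.

```shell
# Hypothetical helper: register simulator N in proclist and create its
# regions folder. Safe to run twice: the proclist entry is added only once.
# Usage: add_simulator <N> <proclist_file> <regions_root>
add_simulator() {
    n=$1
    proclist=$2
    regions_root=$3
    # Append the entry only if it is not already listed
    grep -qxF -- "simul$n" "$proclist" 2>/dev/null ||
        printf 'simul%s\n' "$n" >> "$proclist"
    mkdir -p "$regions_root/simul$n"
}
```

For example, add_simulator 7 "$HOME/proclist" path_to_regions prepares simul7. Once its Regions.ini is in place, the next run of s starts it.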
