[Opensim-dev] Proposal: Implement a de-duplicating core ROBUST asset service

Justin Clark-Casey jjustincc at googlemail.com
Fri Mar 9 03:41:06 UTC 2012


On 08/03/12 22:00, Rory Slegtenhorst wrote:
> Dear list,
>
> I hardly ever post (I'm not an opensim developer so to speak), but in this case I'd like to give my €0,02 worth ;)
>
> @Justin
> Can't we do the data de-duplication on a database level? Eg find the duplicates and just get rid of them on a regular
> interval (cron)?

This would be enormously intricate.  Not only would you have to keep rescanning the entire asset db, but it would also 
add another moving part to an already complex system.

> Or just migrate to a db schema to include hashing support making it easier to identify?

This is effectively what I am doing.
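
As a rough illustration of a hash-aware schema (a sketch only, not the actual ROBUST schema), asset metadata rows can 
point at blobs keyed by their hash, so identical data is stored once.  The table names, column names, the use of 
SQLite and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import sqlite3
import uuid


def open_store(path=":memory:"):
    """Create a hypothetical two-table schema: metadata rows point at hash-keyed blobs."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS asset_data (
            hash TEXT PRIMARY KEY,   -- SHA-256 of the blob, hex encoded (assumed hash choice)
            data BLOB NOT NULL
        );
        CREATE TABLE IF NOT EXISTS asset_meta (
            id   TEXT PRIMARY KEY,   -- asset UUID
            name TEXT NOT NULL,
            hash TEXT NOT NULL REFERENCES asset_data(hash)
        );
        """
    )
    return db


def store_asset(db, name, data):
    """Store an asset under a fresh UUID; an identical blob is written only once."""
    digest = hashlib.sha256(data).hexdigest()
    asset_id = str(uuid.uuid4())
    with db:  # commit both inserts together, or roll back on error
        db.execute(
            "INSERT OR IGNORE INTO asset_data (hash, data) VALUES (?, ?)",
            (digest, data),
        )
        db.execute(
            "INSERT INTO asset_meta (id, name, hash) VALUES (?, ?, ?)",
            (asset_id, name, digest),
        )
    return asset_id


def get_asset(db, asset_id):
    """Return the blob for an asset ID, or None if the ID is unknown."""
    row = db.execute(
        "SELECT d.data FROM asset_meta m JOIN asset_data d ON d.hash = m.hash "
        "WHERE m.id = ?",
        (asset_id,),
    ).fetchone()
    return None if row is None else row[0]
```

Because de-duplication happens at insert time, no periodic rescan of the asset tables is needed.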

> Couldn't you manage it as a plugin for the current asset service to perform data de-duplication on the fly?

I might chain the deduping service to the existing asset service so that new assets go into the deduping tables 
whilst existing assets can still be used.  There would still be an external migrator, but this could be made to 
effectively move rather than copy assets over time (copying would cause space problems with a large number of assets). 
Migration wouldn't need to be done all at once.
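
A minimal sketch of that chaining idea, using in-memory dictionaries in place of the real tables: writes go to the 
deduping store, reads fall back to the legacy store, and a migrator moves (not copies) a small batch at a time.  All 
class and function names here are hypothetical and do not correspond to existing OpenSimulator code.

```python
import hashlib


class LegacyStore:
    """Stand-in for the existing asset table: one full blob per asset ID."""

    def __init__(self):
        self.blobs = {}

    def get(self, asset_id):
        return self.blobs.get(asset_id)

    def delete(self, asset_id):
        self.blobs.pop(asset_id, None)

    def ids(self):
        return list(self.blobs)


class DedupStore:
    """Stand-in for the deduping tables: asset ID -> hash -> single shared blob."""

    def __init__(self):
        self.hash_by_id = {}
        self.data_by_hash = {}

    def store(self, asset_id, data):
        digest = hashlib.sha256(data).hexdigest()
        self.data_by_hash.setdefault(digest, data)  # keep only the first copy
        self.hash_by_id[asset_id] = digest

    def get(self, asset_id):
        digest = self.hash_by_id.get(asset_id)
        return None if digest is None else self.data_by_hash[digest]


class ChainedService:
    """New assets go to the dedup store; lookups fall back to the legacy store."""

    def __init__(self, dedup, legacy):
        self.dedup = dedup
        self.legacy = legacy

    def store(self, asset_id, data):
        self.dedup.store(asset_id, data)

    def get(self, asset_id):
        data = self.dedup.get(asset_id)
        return data if data is not None else self.legacy.get(asset_id)


def migrate_batch(legacy, dedup, batch_size):
    """Move up to batch_size assets; deleting from legacy keeps total space bounded."""
    moved = 0
    for asset_id in legacy.ids()[:batch_size]:
        dedup.store(asset_id, legacy.get(asset_id))
        legacy.delete(asset_id)
        moved += 1
    return moved
```

Calling migrate_batch repeatedly with a small batch size drains the legacy store gradually, and every asset stays 
readable through ChainedService at each step.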

On another note, just for information, I was toying with the idea of making the new service store data on the filesystem 
instead of as blobs.  But this introduces another element of complexity (e.g. having to back up both the db and a 
directory), which I don't regard as ideal for a default beginner service.  I would recommend that anybody wanting 
filesystem asset storage for a big grid use something like coyled's SRAS.  There might be a filesystem storage option 
in the core asset service in the future.

An external migrator tool might be able to target any service that offers the IAssetService interface (though the source 
service has to be more directly available since understandably there's no IAssetService way to just dump out assets 
without knowing the IDs).
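
To illustrate the asymmetry, here is a sketch in which the migrator writes to any target exposing an ID-keyed store 
call, but needs a source that can enumerate its assets directly.  The AssetService protocol below is a hypothetical 
stand-in and does not reproduce the real IAssetService method signatures; the in-memory classes exist only to make 
the example runnable.

```python
from typing import Dict, Iterator, Optional, Protocol, Tuple


class AssetService(Protocol):
    """Hypothetical ID-keyed interface: there is no way to list assets through it."""

    def get(self, asset_id: str) -> Optional[bytes]: ...

    def store(self, asset_id: str, data: bytes) -> None: ...


class DirectSource(Protocol):
    """Hypothetical direct access to the source storage, able to enumerate assets."""

    def iter_assets(self) -> Iterator[Tuple[str, bytes]]: ...


class InMemoryService:
    """Toy target that satisfies the AssetService protocol."""

    def __init__(self) -> None:
        self._assets: Dict[str, bytes] = {}

    def get(self, asset_id: str) -> Optional[bytes]:
        return self._assets.get(asset_id)

    def store(self, asset_id: str, data: bytes) -> None:
        self._assets[asset_id] = data


class InMemoryTable:
    """Toy source that can be scanned row by row, like a database table."""

    def __init__(self, rows: Dict[str, bytes]) -> None:
        self._rows = dict(rows)

    def iter_assets(self) -> Iterator[Tuple[str, bytes]]:
        return iter(self._rows.items())


def migrate(source: DirectSource, target: AssetService) -> int:
    """Copy every asset from the source into the target; return how many were written."""
    count = 0
    for asset_id, data in source.iter_assets():
        target.store(asset_id, data)
        count += 1
    return count
```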

> Just a few suggestions...
>
> And when it comes to space, there's plenty options available. I'd love to see some support for Hadoop HDFS/HBase. There
> must be some way to leverage the scalability of the data using some BigTable implementation right? Is No-SQL an option?
> Cassandra seems to scale nicely :)

All these options are for external services to implement, as far as I'm concerned.  From my perspective, the core 
bundled ROBUST services are for light and perhaps medium use (say more than a 4 simulator grid).

> This could take OpenSim to a whole new level.
>
> Still, 30% is a lot...
>
> Sincerely,
> Rory Slegtenhorst
> rory dot slegtenhorst at gmail dot com


-- 
Justin Clark-Casey (justincc)
http://justincc.org/blog
http://twitter.com/justincc
