[Opensim-dev] JSON or XML for serialization in the OpenSim database?

Justin Clark-Casey jjustincc at googlemail.com
Fri Jul 9 00:30:45 UTC 2010


On 06/07/10 18:55, Hurliman, John wrote:
> I think the original question has been misunderstood. We are using the LLSD type system through the OpenMetaverse.StructuredData.dll library. If we want to decide whether that is a good approach or not we can open the discussion, but the question originally posed in this thread was a lot smaller scope. Under the assumption that we are using LLSD (primarily the OSDMap class), which serialization format should be used? There are three possibilities: XML, JSON, and binary. Since we're dealing with an LLSD representation on the OpenSim side, there are no DTDs here. There is no XPath/XQuery, no Javascript evaluation, no "conversion skipping" by saving raw data from clients straight to the database and sending it back without going to an intermediary representation in memory (it's arguable whether that is a good idea at all, but a moot point since it's not how our current type system works).
>

Yes, in principle this is really the narrow question of a serialization format to and from the region database, not for 
communication with viewers or anything like that.

One can ask whether reusing the OSD serialization in a different context is a good idea at all, particularly if the 
format changes in the future in a way that forces us to reimplement older code by hand just to read older 
representations.  A version number would be close to essential for handling that sanely.
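
As a rough illustration of the version number idea (a Python sketch rather than OpenSim's C#; the envelope layout, the 
field names and both reader functions are hypothetical, not anything in libopenmetaverse), the stored value could carry 
its own version and be dispatched to a matching reader:

```python
import json

CURRENT_VERSION = 2  # hypothetical version number for the stored format


def serialize_media(entries):
    """Wrap the payload in an envelope that records the format version."""
    return json.dumps({"version": CURRENT_VERSION, "entries": entries})


def _read_v1(doc):
    # Hypothetical older layout: each entry was stored as a bare URL string.
    return [{"url": url} for url in doc["entries"]]


def _read_v2(doc):
    # Hypothetical current layout: each entry is already a map of parameters.
    return doc["entries"]


READERS = {1: _read_v1, 2: _read_v2}


def deserialize_media(text):
    """Pick the reader that matches the version recorded in the envelope."""
    doc = json.loads(text)
    version = doc.get("version", 1)  # assumption: unversioned rows are version 1
    reader = READERS.get(version)
    if reader is None:
        raise ValueError("unsupported media serialization version: %r" % version)
    return reader(doc)
```

With something like this, a later format change only needs a new reader; older rows stay readable without touching the 
code that wrote them.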

> From my perspective, the only things that have been mentioned so far that are worthy of consideration are human-readability (debugging and digging through MySQL rows are par for the course here) and performance (both in storage/bandwidth cost and parsing/serialization tax). If the goal is to switch to pure XML and use DTDs instead of LLSD with LLIDL or an equivalent form of documentation then let's put that on the table instead of comparing apples to oranges.
>
> My vote for the original question goes to JSON, as a good balance between performance and readability. I think the performance impact is being underestimated. I have a test implementation of linkset serialization using LLSD/JSON that shrinks the on the disk representation of ~50KB objects down to a few KB. The parsing time also goes down in a similar fashion. The real-world (virtual world?) impact here means shaving hundreds of milliseconds off border crossings, making cross-grid functionality like HyperGrid work better, and making serializations like OAR/IAR more usable. The performance gap is not something to hand-wave away.

We're actually talking about 4 different things here: serialization to a field in the region database, serialization as 
an asset in inventory, serialization for border crossing and serialization to external formats (such as IAR/OAR).  At 
the moment, the default XML serialization is used for all but the first.  In some cases (such as IAR/OAR) exposing the 
default serialization externally is not a good thing at all, but it does make adding new fields much easier.

In fact, by default the LLSD region serialization for a media texture would be different from whatever the .NET 
automatic XML serializer comes up with for the OSD MediaObject.  So arguably, that is a different problem.  Here, trying 
to make it JSON instead of the default (non-LLSD) XML serialization could mean some nasty, nasty hacking, in a context 
where this is better tackled holistically (by making the whole serialization JSON) rather than piecemeal.

One could argue that, in the interim, a MediaObject in this context should be serialized as a byte[] rather than as 
XML.  This is what happens for TextureEntry: PrimitiveBaseShape has a byte[] TextureEntry field, plus a Textures field 
that returns proper Primitive.TextureEntry objects but is marked [XmlIgnore].  If done in the same manner as 
TextureEntry, manipulating a MediaObject would always require de/serialization from the embedded byte[] representation 
first.
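
To make that cost concrete, here is a minimal Python sketch of the same pattern (not the actual PrimitiveBaseShape code; 
the Shape class, its media_bytes field and the use of JSON for the embedded bytes are all assumptions for illustration). 
Only the raw bytes are persisted, and the typed view is rebuilt from them on every access:

```python
import json


class Shape:
    """Hypothetical stand-in for a shape class that persists only raw bytes."""

    def __init__(self, media_bytes=b"[]"):
        # The only field that would be written to storage.
        self.media_bytes = media_bytes

    @property
    def media(self):
        # Every read deserializes a fresh copy from the embedded bytes.
        return json.loads(self.media_bytes.decode("utf-8"))

    @media.setter
    def media(self, entries):
        # Every write reserializes the whole list back into the bytes.
        self.media_bytes = json.dumps(entries).encode("utf-8")


shape = Shape()
entries = shape.media                     # deserialize
entries.append({"url": "http://example.com/"})
shape.media = entries                     # reserialize, or the change is lost
```

Note that calling shape.media.append(...) directly would modify a throwaway copy, which is exactly the kind of 
awkwardness this approach brings with it.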

Then it really could be the binary representation that is put in the database rather than an XML/JSON representation. 
This is really not a solution that I like much at all but it would be consistent with what we do for TextureEntry and 
binary is the most efficient format of all, right?
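
For a feel of the size differences, the following Python sketch encodes one made-up media entry three ways.  None of 
these are the real LLSD XML, JSON or binary encodings; the field names and the hand-rolled binary layout are 
assumptions, so the numbers it prints only indicate the general trend:

```python
import json
import struct
import xml.etree.ElementTree as ET

# Made-up media entry; the field names are illustrative only.
entry = {"url": "http://example.com/", "width": 1024, "height": 768, "auto_play": True}


def to_json(e):
    # Compact JSON without extra whitespace.
    return json.dumps(e, separators=(",", ":")).encode("utf-8")


def to_xml(e):
    # One child element per field under a single root element.
    root = ET.Element("media")
    for key, value in e.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root)


def to_binary(e):
    # Hand-rolled layout: 2-byte URL length, URL bytes, two uint16 values, one bool.
    url = e["url"].encode("utf-8")
    return struct.pack("!H", len(url)) + url + struct.pack(
        "!HH?", e["width"], e["height"], e["auto_play"]
    )


sizes = {
    "xml": len(to_xml(entry)),
    "json": len(to_json(entry)),
    "binary": len(to_binary(entry)),
}
for name, size in sizes.items():
    print(name, size)
```

The binary form is the smallest here because it drops the field names entirely, which is also why it is the hardest to 
read back out of a database row by eye.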

>
> John
>
> -----Original Message-----
> From: opensim-dev-bounces at lists.berlios.de [mailto:opensim-dev-bounces at lists.berlios.de] On Behalf Of Frisby, Adam
> Sent: Monday, July 05, 2010 5:14 PM
> To: opensim-dev at lists.berlios.de
> Subject: Re: [Opensim-dev] JSON or XML for serialization in the OpenSim database?
>
> Having just worked on a JSON project myself internally - I personally developed a bit of a loathing for the format. I'm personally partial to XML, ideally with a corresponding DTD.
>
> Adam
>
>> -----Original Message-----
>> From: opensim-dev-bounces at lists.berlios.de [mailto:opensim-dev-bounces at lists.berlios.de] On Behalf Of Teravus Ovares
>> Sent: Monday, 5 July 2010 4:23 PM
>> To: opensim-dev at lists.berlios.de
>> Subject: Re: [Opensim-dev] JSON or XML for serialization in the OpenSim database?
>>
>> Not a whole lot of feedback here yet, maybe people are on a long
>> weekend type camping vacation..
>>
>> I'm partial to OSD/json, myself.    I'd also like to, at some point,
>> get a version number in there along with a definition of the format
>> for people who want to write integration tools..    however, that last
>> bit may be more of a 1.0 thing.
>>
>> I think a lot of tools are going to go the way of JavaScript in the
>> future for various reasons...   one being that..   it's generally
>> implemented in all web enabled devices.   Computers, 'tablets', 'smart
>> phones'...   Another reason is it's more compact, while still being
>> fairly human readable.     One last reason that I can think of at this
>> moment is there are no external dependencies that 'get lost and turn
>> into a 404', like with XML Schemas.   I've done several XML based
>> integrations using REST and noted that in 55% of the cases, the
>> defining schema is a 404 which makes validation and automatic creation
>> of XML Serialization classes impossible.  Worse, in 15% of the cases,
>> extensions are defined in the schema and then used in the XML..  Only,
>> you won't ever know what tags and parameters the extensions provide
>> because the schema is 'missing'.
>>
>> Regards
>>
>> Teravus
>>
>> On Sun, Jul 4, 2010 at 8:28 PM, Justin Clark-Casey
>> <jjustincc at gmail.com>  wrote:
>>> Hi folks,
>>>
>>> As part of the media-on-a-prim implementation, I'm serializing the
>>> parameters for a media texture to the database.  This seems better than
>>> creating new database fields or even a whole new table for these
>>> parameters, both because there are lots of them (url, scaling, controls,
>>> whitelist, etc.) and because different future virtual environments may
>>> want to store different things.
>>>
>>> I'm going to serialize them as an OSDArray of MediaEntrys using the
>>> libopenmetaverse library.  However, the question then becomes whether to
>>> use the JSON representation or the XML representation.
>>>
>>> I tend to prefer XML for storage representations.  I believe that it's
>>> somewhat more human readable and that there is better tool support for
>>> manipulating it.  However, I know other people would prefer storage in
>>> JSON and I accept that serialization/deserialization there may be
>>> slightly faster.
>>>
>>> The only other example of serialization that I know of in OpenSim
>>> currently is that of SceneObjectGroups into inventory, which encompasses
>>> object properties, object inventory properties and script state.  This is
>>> done in XML and media entries would become part of that serialization.
>>>
>>> If there's a majority preference for JSON I don't mind using that
>>> instead, though I would want a justification for going this route rather
>>> than XML.  If there's no real argument then I will go with XML.
>>>
>>> Also, I believe that we should try and be consistent, so picking one or
>>> the other now should make it more likely that the same approach would be
>>> used for the next serialization case.
>>>
>>> Regards,
>>>
>>> --
>>> Justin Clark-Casey (justincc)
>>> http://justincc.org
>>> http://twitter.com/justincc
>>> _______________________________________________
>>> Opensim-dev mailing list
>>> Opensim-dev at lists.berlios.de
>>> https://lists.berlios.de/mailman/listinfo/opensim-dev
>>>
>


-- 
Justin Clark-Casey (justincc)
http://justincc.org
http://twitter.com/justincc


