JSON Archives for Boost.serialization

Kasun Indrasiri

21 Mar 2008

4:47 p.m.

Hi all,

I'm interested in the JSON archive for Boost.serialization project that is proposed for GSoC 2008. I have had good exposure to C/C++ during my academic life and in past projects. My main motivation for choosing the JSON archive project is that I developed a JSON parser for the Apache Axis2/C web service engine and completed that project successfully. That parser was written in C, but the task was similar to the requirements of Boost.serialization. I would therefore like to contribute to the Boost C++ community through this project. I would also like to gain experience with Boost.serialization and Spirit in preparation for the project; it would be a great experience to work with Boost C++.

I am also a fan of open source software and am currently an active contributor (non-committer) to Apache Axis2/C and Apache Rampart/C. I would like to share my ideas through the Boost mailing lists.

As an initial step I went through Boost Spirit and Serialization and gained a basic knowledge of both. I currently have a fairly simple question: in the JSON archives, do we have to worry about the XML mapping conventions (BadgerFish or Mapped), given that JSON may be used as an alternative to XML? There are also a few areas where JSON falls short of the requirements of real-world applications (mainly when it is used as an alternative to XML).

Thanks.
Kasun Indrasiri.


Kasun Indrasiri

22 Mar

6:57 a.m.

New subject: [gsoc 2008]JSON Archives for Boost.serialization

Jeremy Maitin-Shepard

4:55 p.m.

"Kasun Indrasiri" <kasun147@gmail.com> writes:

...

Hi all, I'm interested in the Boost JSON archive for Boost.serialization project which is presented in the GSoC2008. I have had good exposure to C/C++ during my academic life and projects in the past. The main motivation to select the JSON archive project is that I have developed a JSON parser for Apache AXIS2/C web service engine and I successfully completed the project. In that case the parser was written in C but the task was similar to the requirements of the Boost.serialization. Therefore I would like to contribute to the Boost C++ community through my project.

It is important to realize that Boost Serialization, even when it uses e.g. the XML archive format, is not designed to produce archives that conform to any particular format/schema. In particular, the only real guarantee is that the stream produced by serialization can be read back by de-serialization if the same (or a compatible) version of Boost Serialization is used and a compatible sequence of serialization instructions is used. In the case of the XML archive format, the file produced by serialization happens to conform to XML syntax, but that fact is nearly irrelevant, because in all likelihood the archive will still not be easily readable by humans or by any external tools (except for trivial tools that merely display the XML structure).

Thus, it is not clear what advantage a JSON-format archive would offer. The most obvious use for JSON is for communicating with a program written in JavaScript, but then it would be necessary to follow a particular format so that the JavaScript program could do something useful with the data, and therefore Boost Serialization is not the right tool for the job.

-- Jeremy Maitin-Shepard
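To make the "compatible sequence of serialization instructions" point concrete, here is a minimal sketch in Python. It is a toy, not how Boost Serialization actually lays out its archives: `ToyOArchive`, `ToyIArchive`, `save_point` and `load_point` are hypothetical names invented for this illustration. The output is syntactically valid JSON, yet it only means something to a reader that replays the same sequence of loads.

```python
import json

class ToyOArchive:
    """Hypothetical positional archive: records values in the order saved."""
    def __init__(self):
        self.items = []

    def save(self, value):
        self.items.append(value)

    def dumps(self):
        # The result is valid JSON, but it carries no field names or schema.
        return json.dumps(self.items)

class ToyIArchive:
    """Hypothetical reader: hands values back in the order they were saved."""
    def __init__(self, text):
        self.items = iter(json.loads(text))

    def load(self):
        return next(self.items)

def save_point(ar, x, y, label):
    # The writer's sequence of instructions: x, then y, then label.
    ar.save(x)
    ar.save(y)
    ar.save(label)

def load_point(ar):
    # The reader must mirror that sequence exactly to get the data back.
    x = ar.load()
    y = ar.load()
    label = ar.load()
    return x, y, label

out = ToyOArchive()
save_point(out, 3, 4, "origin offset")
text = out.dumps()
restored = load_point(ToyIArchive(text))
```

Any JSON parser can read `text`, but without knowing the order used in `save_point` an external program cannot tell which value is which. That is the sense in which the syntax alone does not make the archive interoperable.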

Jeff Garland

5:10 p.m.

Jeremy Maitin-Shepard wrote:

...

"Kasun Indrasiri" <kasun147@gmail.com> writes:

...

Thus, it is not clear what advantage a JSON-format archive would offer. The most obvious use for JSON is for communicating with a program written in JavaScript, but then it would be necessary to follow a particular format so that the JavaScript program could do something useful with the data, and therefore Boost Serialization is not the right tool for the job.

Programs written in C++ often need to intercommunicate with programs written in other languages. Or they need to store data structures in a form that can be processed by programs written in another language. JSON is now a common format for doing this and has parsers in many languages. Last but not least, even if the program is all C++, some folks would prefer a recognizable and widely used format -- the serialization 'proprietary formats' don't qualify on that score. Jeff

Sohail Somani

6:56 p.m.

On Sat, 22 Mar 2008 10:10:35 -0700, Jeff Garland wrote:

...

Programs written in C++ often need to intercommunicate with programs written in other languages. Or they need to store data structures in a form that can be processed by programs written in another language. JSON is now a common format for doing this and has parsers in many languages.

The main problem as I see it anyway, is that even though there are many parsers, there is a boost-serialization-specific way to interpret the data. I think it is possible to write two different types of JSON archives: one that is meant to interface with the outside world and another that is just another proprietary serialization format.

If you look at the XML archive as an example, it is clear that any non boost-serialization processor needs to do specific things to understand the output. Specifically, the presence of object graphs is what I would see as the biggest hurdle. I think if you want the JSON archive to interface with the outside world, you should forgo object graph support. Or at least support both modes.

Here is an example of the xml archive output: http://boost.org/libs/serialization/example/demo_save.xml

I'd be interested in what other people have to say about this.

-- Sohail Somani http://uint32t.blogspot.com
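As a rough illustration of why object graphs are a hurdle for plain JSON, the sketch below uses Python's standard `json` module (chosen only because it is a widely available JSON implementation, not because the thread proposes it). The variable names are made up for the example. A shared object is silently duplicated on a round trip, and a cyclic structure cannot be written at all.

```python
import json

# Two keys refer to the very same object, as two pointers to one object would in C++.
shared = {"name": "stop-1"}
route = {"first": shared, "last": shared}

# Plain JSON has no notion of identity, so the round trip yields two separate copies.
copy = json.loads(json.dumps(route))
same_identity = copy["first"] is copy["last"]

# A cycle is worse: the standard encoder refuses to serialize it.
cycle = {"name": "node"}
cycle["self"] = cycle
try:
    json.dumps(cycle)
    cycle_error = None
except ValueError as exc:
    cycle_error = str(exc)
```

After running this, `same_identity` is `False` and `cycle_error` holds the encoder's error message. An archive that forgoes object graph support would accept exactly this loss, while one that keeps it needs some extra convention layered on top of JSON.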

Jeremy Maitin-Shepard

23 Mar

6:46 a.m.

Sohail Somani <sohail@taggedtype.net> writes:

...

On Sat, 22 Mar 2008 10:10:35 -0700, Jeff Garland wrote:

...
Programs written in C++ often need to intercommunicate with programs written in other languages. Or they need to store data structures in a form that can be processed by programs written in another language. JSON is now a common format for doing this and has parsers in many languages.

...

The main problem as I see it anyway, is that even though there are many parsers, there is a boost-serialization-specific way to interpret the data. I think it is possible to write two different types of JSON archives: one that is meant to interface with the outside world and another that is just another proprietary serialization format.

...

If you look at the XML archive as an example, it is clear that any non boost-serialization processor needs to do specific things to understand the output. Specifically, the presence of object graphs is what I would see as the biggest hurdle.

...

I think if you want the JSON archive to interface with the outside world, you should forgo object graph support. Or at least support both modes.

I think really Boost serialization just isn't the tool for the job if you want to produce an archive that can be read by something other than boost serialization. -- Jeremy Maitin-Shepard

Sohail Somani

7:51 p.m.

On Sun, 23 Mar 2008 02:46:35 -0400, Jeremy Maitin-Shepard wrote:

...

...
If you look at the XML archive as an example, it is clear that any non boost-serialization processor needs to do specific things to understand the output. Specifically, the presence of object graphs is what I would see as the biggest hurdle.

...
I think if you want the JSON archive to interface with the outside world, you should forgo object graph support. Or at least support both modes.

I think really Boost serialization just isn't the tool for the job if you want to produce an archive that can be read by something other than boost serialization.

To reiterate, it's really easy to write a JSON archive that operates just like current archives, but the thing to determine is whether Boost Serialization can inter-operate nicely with the outside world. I think the answer is yes, with limitations. IMHO, any GSoC application should address this, but I am not reviewing them so don't listen to me ;-) -- Sohail Somani http://uint32t.blogspot.com

Jeff Garland

9:08 p.m.

Sohail Somani wrote:

...

On Sun, 23 Mar 2008 02:46:35 -0400, Jeremy Maitin-Shepard wrote:

...
...
If you look at the XML archive as an example, it is clear that any non boost-serialization processor needs to do specific things to understand the output. Specifically, the presence of object graphs is what I would see as the biggest hurdle. I think if you want the JSON archive to interface with the outside world, you should forgo object graph support. Or at least support both modes. I think really Boost serialization just isn't the tool for the job if you want to produce an archive that can be read by something other than boost serialization.

To reiterate, it's really easy to write a JSON archive that operates just like current archives, but the thing to determine is whether Boost Serialization can inter-operate nicely with the outside world. I think the answer is yes, with limitations. IMHO, any GSoC application should address this, but I am not reviewing them so don't listen to me ;-)

I'm reviewing them (as are others listening here) and really suggest they listen to you :-) I was actually unaware of the object graph limitations in JSON. Of course it turns out there's at least one proposal to fix these problems: http://www.jspon.org/?mode=html&noscript=true Or the project could specify limitations to the types that can be serialized in the archive. That's up to the students to propose... Jeff
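One way such a fix could work is to give each object an identifier and replace repeated occurrences with a reference. The sketch below is only an illustration of that general idea in Python: the `"$id"` and `"$ref"` key names and the `encode`/`decode` helpers are assumptions made up for this example, and they are not claimed to match the syntax of the proposal linked above.

```python
import json

def encode(obj, seen=None):
    """Replace repeated dicts with {"$ref": n}; "$id"/"$ref" are made-up keys."""
    if seen is None:
        seen = {}
    if isinstance(obj, dict):
        if id(obj) in seen:
            # Already written once: emit a reference instead of a second copy.
            return {"$ref": seen[id(obj)]}
        seen[id(obj)] = len(seen) + 1
        out = {"$id": seen[id(obj)]}
        for key, value in obj.items():
            out[key] = encode(value, seen)
        return out
    if isinstance(obj, list):
        return [encode(item, seen) for item in obj]
    return obj

def decode(data, table=None):
    """Rebuild the graph, resolving {"$ref": n} to the object tagged "$id": n."""
    if table is None:
        table = {}
    if isinstance(data, dict):
        if "$ref" in data:
            return table[data["$ref"]]
        obj = {}
        # Register before descending so that cycles can point back at obj.
        table[data["$id"]] = obj
        for key, value in data.items():
            if key != "$id":
                obj[key] = decode(value, table)
        return obj
    if isinstance(data, list):
        return [decode(item, table) for item in data]
    return data

shared = {"name": "stop-1"}
route = {"first": shared, "last": shared}
text = json.dumps(encode(route))
restored = decode(json.loads(text))
```

Here `restored["first"] is restored["last"]` holds again, so identity survives the round trip. The cost is that a consumer in another language has to know the convention, and user data that happens to use the reserved keys would collide with it, which is the kind of limitation a proposal would need to spell out.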

Sohail Somani

10:14 p.m.

On Sun, 23 Mar 2008 14:08:17 -0700, Jeff Garland wrote:

...

...
To reiterate, it's really easy to write a JSON archive that operates just like current archives, but the thing to determine is whether Boost Serialization can inter-operate nicely with the outside world. I think the answer is yes, with limitations. IMHO, any GSoC application should address this, but I am not reviewing them so don't listen to me ;-)

I'm reviewing them (as are others listening here) and really suggest they listen to you

LOL! Thanks for the compliment!

...

I was actually unaware of the object graph limitations in JSON. Of course it turns out there's at least one proposal to fix these problems:

http://www.jspon.org/?mode=html&noscript=true

Or the project could specify limitations to the types that can be serialized in the archive. That's up to the students to propose...

Oh, I'm liking the odds of this succeeding. I am pretty sure that a summer of 6 hrs/day is more than enough to implement a good chunk of a json_[io]archive and/or jspon_[io]archive, depending on the availability of parsers. If the student were to concentrate solely on jspon, one of the tasks could be writing a working parser for another language (Java? Python?) to get the ball rolling... Just some pie-in-the-sky for Sunday. IMHO, this has the most promise of providing all features of Boost Serialization. While I have your attention, please don't make the library header-only. :-D -- Sohail Somani http://uint32t.blogspot.com

Esteve Fernandez

24 Mar

11:40 a.m.

El Domingo 23 Marzo 2008 22:08:17 Jeff Garland escribió:

...

I'm reviewing them (as are others listening here) and really suggest they listen to you :-) I was actually unaware of the object graph limitations in JSON. Of course it turns out there's at least one proposal to fix these problems:

http://www.jspon.org/?mode=html&noscript=true

Or the project could specify limitations to the types that can be serialized in the archive. That's up to the students to propose...

Yep, on the other hand YAML doesn't have these limitations. Actually it supports class versioning (well, sort of) and object referencing (with "&") out of the box. Although it hasn't been discussed earlier, can I propose a YAML archive for Boost.serialization? The only problem I see is that there's no Boost.Spirit parser for YAML and its syntax is more complex than the JSON one. However, there's an official implementation in C [1], used by all the existing YAML parsers. But I guess a Spirit-based parser is nicer. Cheers. 1 - http://pyyaml.org/wiki/LibYAML

Sohail Somani

2:40 p.m.

On Mon, 24 Mar 2008 12:40:27 +0100, Esteve Fernandez wrote:

...

Although it hasn't been discussed earlier, can I propose a YAML archive for Boost.serialization? The only problem I see is that there's no Boost.Spirit parser for YAML and its syntax is more complex than the JSON one. However, there's an official implementation in C [1], used by all the existing YAML parsers. But I guess a Spirit-based parser is nicer.

One thing you can always do is use the C-based parser to start and then change it to a Boost.Spirit parser later. -- Sohail Somani http://uint32t.blogspot.com

Jeremy Maitin-Shepard

23 Mar

6:45 a.m.

Jeff Garland <jeff@crystalclearsoftware.com> writes:

...

Jeremy Maitin-Shepard wrote:

...
"Kasun Indrasiri" <kasun147@gmail.com> writes:

...

...
Thus, it is not clear what advantage a JSON-format archive would offer. The most obvious use for JSON is for communicating with a program written in JavaScript, but then it would be necessary to follow a particular format so that the JavaScript program could do something useful with the data, and therefore Boost Serialization is not the right tool for the job.

...

Programs written in C++ often need to intercommunicate with programs written in other languages. Or they need to store data structures in a form that can be processed by programs written in another language. JSON is now a common format for doing this and has parsers in many languages. Last but not least, even if the program is all C++, some folks would prefer a recognizable and widely used format -- the serialization 'proprietary formats' don't qualify on that score.

There already is XML archive support, but that is still a "Boost serialization proprietary format". Likewise, using JSON syntax in place of XML would still result in a "Boost serialization proprietary format". I'd certainly agree that JSON I/O facilities in C++ are useful, but I don't think that then sticking Boost Serialization on top of those facilities would be very useful. -- Jeremy Maitin-Shepard

Esteve Fernandez

22 Mar

5:22 p.m.

El Sábado 22 Marzo 2008 17:55:13 Jeremy Maitin-Shepard escribió:

...

Thus, it is not clear what advantage a JSON-format archive would offer. The most obvious use for JSON is for communicating with a program written in JavaScript, but then it would be necessary to follow a particular format so that the JavaScript program could do something useful with the data, and therefore Boost Serialization is not the right tool for the job.

Although the JSON name doesn't hide its JavaScript roots, it's no longer constrained to that niche. JSON has found wide acceptance in other languages: Python has at least 4 different parsers and Robin linked to a Perl parser in a previous message, not to mention the Ruby, Java and PHP ones as well.

Boost.serialization not only deals with serialization, but also with the inverse operation, that is, transforming a previously serialized object in JSON back to its original form.

So, what advantages can a JSON archive provide? Plenty, but the most common one is that it's more lightweight than the XML archive. If you take a look at another proposed GSoC project [1], using JSON as a transport protocol/format can result in less traffic while maintaining some degree of flexibility, since you could write the code that handles the messages in a different language if you feel like it (not necessarily C++).

BTW, it's also one of the projects proposed by Boost [2].

Cheers.

PS: Kasun, it looks like we're going to apply for the same project, so if you have in mind something bigger about JSON that we both could work on, it would be great.

1 - http://lists.boost.org/Archives/boost/2008/03/134589.php
2 - http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Google_Summer...

Rene Rivera

8:03 p.m.

Esteve Fernandez wrote:

...

PS: Kasun, it looks like we're going to apply for the same project, so if you have in mind something bigger about JSON that we both could work on, it would be great

You do know that Google prohibits multi-student cooperative projects? -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org (msn) - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim,yahoo,skype,efnet,gmail

Esteve Fernandez

8:55 p.m.

El Sábado 22 Marzo 2008 21:03:18 Rene Rivera escribió:

...

Esteve Fernandez wrote:

...
PS: Kasun, it looks like we're going to apply for the same project, so if you have in mind something bigger about JSON that we both could work on, it would be great

You do know that Google prohibits multi-student cooperative projects?

Didn't know about that rule, should have read the program FAQs in more detail beforehand. Thanks for pointing this out. Cheers.

Kasun Indrasiri

23 Mar

4:48 p.m.

Hi Esteve and others, On 3/23/08, Rene Rivera <grafikrobot@gmail.com> wrote:

...

Esteve Fernandez wrote:

...
PS: Kasun, it looks like we're going to apply for the same project, so if you have in mind something bigger about JSON that we both could work on, it would be great

You do know that Google prohibits multi-student cooperative projects?

Certainly... But anyway, we can share our experiences and ideas through the list, regardless of who is doing the project for GSoC. I also have a few doubts regarding which mailing list to use for the JSON Archives project. Can we use the main Boost list or the Spirit mailing list for this purpose? Kasun.

Jeremy Maitin-Shepard

6:50 a.m.

Esteve Fernandez <esteve@sindominio.net> writes:

...

El Sábado 22 Marzo 2008 17:55:13 Jeremy Maitin-Shepard escribió:

...
Thus, it is not clear what advantage a JSON-format archive would offer. The most obvious use for JSON is for communicating with a program written in JavaScript, but then it would be necessary to follow a particular format so that the JavaScript program could do something useful with the data, and therefore Boost Serialization is not the right tool for the job.

...

Although the JSON name doesn't hide its JavaScript roots, it's no longer constrained to that niche. JSON has found wide acceptance in other languages: Python has at least 4 different parsers and Robin linked to a Perl parser in a previous message, not to mention the Ruby, Java and PHP ones as well.

...

Boost.serialization not only deals with serialization, but with the inverse operation, that is transforming a previously serialized object in JSON back to its original form.

Sure, but fundamentally, it will always use a "Boost serialization proprietary format", regardless of what syntax is used to encode that format.

...

So, what advantages can a JSON archive provide? Plenty, but the most common one is that it's more lightweight than the XML archive.

If you want lightweight, there is the text archive format or the binary archive format, but both of those formats are likewise "Boost serialization proprietary formats", which would make them not very useful for interoperation with other programs. [snip] -- Jeremy Maitin-Shepard

Esteve Fernandez

10:55 a.m.

El Domingo 23 Marzo 2008 07:50:44 Jeremy Maitin-Shepard escribió:

...

Sure, but fundamentally, it will always use a "Boost serialization proprietary format", regardless of what syntax is used to encode that format.

Yes, you're right. There's a couple of things that Boost.serialization implements that are not found in JSON serializers for other languages, such as class versioning or object tracking (this can be solved in YAML using &, but JSON doesn't support it). Anyway, dealing with these issues and making them non-intrusive so JSON serializers found in other languages can parse the serialized objects, would be a part of the GSoC project. Cheers.
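A small sketch of what non-intrusive class versioning might look like, written in Python for brevity. Everything here is an assumption for illustration: the `"_version"` key name and the `save_person`/`load_person` helpers are invented, and this is not an existing Boost or JSON-library convention. The idea is that the version travels as one ordinary key, which a generic JSON consumer can simply ignore, while a version-aware loader branches on it.

```python
import json

def save_person(name, email):
    # Current (hypothetical) layout is version 1, which added the "email" field.
    return json.dumps({"_version": 1, "name": name, "email": email})

def load_person(text):
    data = json.loads(text)
    # Documents written before versioning existed are treated as version 0.
    version = data.get("_version", 0)
    name = data["name"]
    # Only read the field that version 1 introduced when the data has it.
    email = data["email"] if version >= 1 else None
    return name, email

old_document = '{"name": "Ada"}'
new_document = save_person("Ada", "ada@example.org")
old_person = load_person(old_document)
new_person = load_person(new_document)
```

Both documents remain plain JSON objects that any parser can read, and the loader still accepts data written by the older layout. Whether this style can be reconciled with the way Boost.serialization passes a version number to each class is exactly the kind of question the project would have to answer.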


17 comments

7 participants

participants (7)

Esteve Fernandez
Jeff Garland
Jeff Garland
Jeremy Maitin-Shepard
Kasun Indrasiri
Rene Rivera
Sohail Somani