[rfc] I/O Chain Library, Preview Release

Hi,

Back in June I posted some thoughts about what I'd expect from a properly done I/O library.[1] Based on the ensuing discussion I then designed such a library. Over the last months, I've worked hard at implementing the thing.

It is with quite a bit of pride, then, that I announce the IOChain preview release 1.

You can get the source here:
http://stud3.tuwien.ac.at/~e0226430/iochain-preview-1.tar.bz2

There are currently some assumptions in the build files that require a tiny bit of setup.

1) Unpack the source into some directory.
2) Grab the latest Boost trunk.
3) In the unpacked source, make a symlink to the Boost trunk called "realboost". E.g.

    tar -xjf iochain-preview-1.tar.bz2
    cd iochain
    ln -s /path/to/boost/trunk realboost

4) Add zlib to your user config. If you have zlib where it is easily found, this is as easy as adding this line to ${HOME}/user-config.jam:

    lib zlib : : <name>z ;

I'm sorry for the inconvenience.

Documentation is in quickbook format. A generated version is available here:
http://stud3.tuwien.ac.at/~e0226430/iochain/

This is a preview release. Its only purpose is to get comments. It has not seen widespread compiler testing yet (only GCC 4.1 on a 64-bit Linux).

Features implemented so far:
- The infrastructure.
- File I/O (POSIX only - a Win32 implementation exists, but hasn't seen a compiler yet, so don't expect it to compile, much less work)
- In-memory I/O (fixed buffers and std::vector).
- Read-ahead and write-collect buffers (like buffering in the current streams).
- Various tools to support backtracking.
- Combining bytes into larger types. (Only naive methods currently.)
- On-the-fly CRC calculation.
- On-the-fly zlib compression.

The ultimate goal of the library is to completely replace iostreams.

All discussion, suggestions, and criticisms are welcome.

Sebastian Redl

[1] http://lists.boost.org/Archives/boost/2007/06/123478.php
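[Editor's sketch] The feature list mentions on-the-fly CRC calculation but does not say which CRC or what the interface looks like. As a rough illustration of the idea only, here is a running checksum that can be fed data chunk by chunk as it passes through. It assumes the common CRC-32 (reflected polynomial 0xEDB88320); the class name crc32_accumulator is hypothetical and is not an IOChain component.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of IOChain: accumulates a CRC-32 over data
// that arrives in arbitrary chunks, so no chunk has to be kept around.
class crc32_accumulator {
    std::uint32_t state_ = 0xFFFFFFFFu;  // standard CRC-32 initial value

public:
    // Fold `size` more bytes into the running checksum, one bit at a time.
    void update(const unsigned char* data, std::size_t size) {
        for (std::size_t i = 0; i < size; ++i) {
            state_ ^= data[i];
            for (int bit = 0; bit < 8; ++bit) {
                const std::uint32_t mask = (state_ & 1u) ? 0xEDB88320u : 0u;
                state_ = (state_ >> 1) ^ mask;
            }
        }
    }

    // Convenience overload for string chunks.
    void update(const std::string& chunk) {
        update(reinterpret_cast<const unsigned char*>(chunk.data()), chunk.size());
    }

    // Checksum of everything seen so far (final XOR applied on the fly).
    std::uint32_t value() const { return state_ ^ 0xFFFFFFFFu; }
};
```

Feeding "1234" and then "56789" gives the same value as feeding "123456789" in one call, which is what lets a filter compute the checksum while the data streams past.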

Sebastian,
Back in June I posted some thoughts about what I'd expect from a properly done I/O library.[1] Based on the ensuing discussion I then designed such a library. Over the last months, I've worked hard at implementing the thing.
It is with quite a bit of pride, then, that I announce the IOChain preview release 1.
This looks really nice and modular, and it makes flexible I/O components readily available.

One (probably unrelated) thought I had when looking at the documentation was that it might be useful to allow integration with the new Spirit2 library (consisting of parser and generator subsystems), making it possible to do I/O of structured data. I'm not sure whether this has to be done at the device or filter level, but the essence would be to be able to read and write arbitrarily formatted data, where the formatting is independent of the data.

We did such an integration with the existing iostreams (I'm mentioning this here just to explain what I mean), which allows writing:

    os << karma::format(
        stream % ", ",    // format description
        c                 // data
    );

where stream invokes an existing operator<< for every element of the data stored in the container 'c' (essentially any container, such as vector, string, iterator_range etc.) and separates these elements with commas. The equivalent input looks very similar:

    is >> qi::parse(
        stream % ", ",    // format description
        c                 // data
    );

this time 'stream' uses operator>> to 'parse' the data into the container 'c'.

Regards
Hartmut
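[Editor's sketch] For readers without Spirit2 at hand, the output half of the idea (a format description kept separate from the data) can be approximated with the standard library alone. The helper names write_separated and to_separated_string are hypothetical; this is not the karma API, only a plain-iostreams stand-in for what `stream % ", "` is described as doing.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of Spirit2 or IOChain: writes every element
// of a container with its existing operator<<, placing `sep` between elements.
template <typename Container>
void write_separated(std::ostream& os, const Container& c, const std::string& sep)
{
    bool first = true;
    for (const auto& element : c) {
        if (!first) os << sep;   // separator goes between elements only
        os << element;
        first = false;
    }
}

// Convenience wrapper that returns the formatted text as a string.
template <typename Container>
std::string to_separated_string(const Container& c, const std::string& sep)
{
    std::ostringstream os;
    write_separated(os, c, sep);
    return os.str();
}
```

The separator plays the role of the format description and the container is the data; either can change without touching the other.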

Hartmut Kaiser wrote:
This looks really nice and modular allowing to have flexible IO components readily available.
Thank you.
One (probably unrelated) thought I had when looking at the documentation was, that it might be useful to allow integration with the new Spirit2 library (consisting out of parser and generator subsystems), allowing to do IO of structured data.
This looks interesting, but it is way higher-level than what I'm currently working on.
Sebastian

Sebastian Redl wrote:
Back in June I posted some thoughts about what I'd expect from a properly done I/O library.[1] Based on the ensuing discussion I then designed such a library. Over the last months, I've worked hard at implementing the thing.
[...] The ultimate goal of the library is to completely replace iostreams.
It looks like it still lacks the serialization capabilities of iostreams, no?

Mathias Gaunard wrote:
Sebastian Redl wrote:
[...] The ultimate goal of the library is to completely replace iostreams.
It looks like it still lacks the serialization capabilities of iostreams, no?
It lacks a lot of the things it needs to replace iostreams. That doesn't change that this is my stated ultimate goal for the library. Again, this is just a preview release. I'm posting it to get comments on what is currently there. Sebastian

Hi Sebastian,

Lots of questions spring to mind, but I suspect that for most of them your response would be that this is a preview release. So just a bit of general feedback:

* there seems to be a lot of work there,
* a list of benefits/advantages of IOChain over existing io streams might be good?
* some common usage examples might be good? i.e. configuration and/or application-specific "document"
* is it possible to have runtime-selectable links in the IO chain? e.g. user selection of the encoding to be used.

Cheers.

----- Original Message -----
From: "Sebastian Redl" <sebastian.redl@getdesigned.at>
To: <boost@lists.boost.org>
Sent: Saturday, December 01, 2007 12:43 PM
Subject: [boost] [rfc] I/O Chain Library, Preview Release
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Scott Woods wrote:
* a list of benefits/advantages of IOChain over existing io streams might be good?
True. I need a rationale section in the docs.
* some common usage examples might be good? i.e. configuration and/or application-specific "document"
I'll see what I can do for the next preview release. Right now I was happy to get a number of essential components working.
* is it possible to have runtime-selectable links in the IO chain? e.g. user selection of the encoding to be used.
Yes. Some components directly support runtime selection of parameters (like primitive serialization rules in the assembler_filter, or external coding in the character coder), and you can always compose chains using the erasure, at the cost of a bit of overhead (of course - runtime choices always impose overhead).

For example, the SWF format contains a flag in the header for whether the whole file is zlib-compressed or not. You could read that like this:

    file_source rawswf(filename);
    assembly<native_rules, file_source> header_reader(rawswf);
    octet sig[3], version;
    header_reader.read(to(sig));
    if((sig[0] != 'F' && sig[0] != 'C') || sig[1] != 'W' || sig[2] != 'S') {
        throw not_swf();
    }
    bool compressed = sig[0] == 'C';
    header_reader.read(to(version));
    if(unsupported(version)) {
        throw unsupported_swf();
    }
    if(version < 6 && compressed) {
        throw bad_swf();
    }
    uint32_t length;
    header_reader.read(to(length));
    source<iobyte> swfbody(compressed ? chain(rawswf) | zlib() : chain(rawswf));

Now you can just continue reading from swfbody. The compression is transparent. (Aside from the loss of the seeking capability.)

Sebastian
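[Editor's sketch] The header checks in the example above can be tried without IOChain. The following is a standard-library-only approximation, not the library's API: swf_header and read_swf_header are hypothetical names, the exceptions are plain std::runtime_error rather than not_swf and bad_swf, the unsupported(version) check is left out, and the length field is assumed to be stored little-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical result type for this sketch; not an IOChain type.
struct swf_header {
    bool compressed;        // signature started with 'C' rather than 'F'
    std::uint8_t version;   // format version byte
    std::uint32_t length;   // length field exactly as stored in the header
};

// Reads the 8-byte header (3 signature bytes, 1 version byte, 4 length bytes)
// from any std::istream. Assumes the length field is little-endian.
swf_header read_swf_header(std::istream& in)
{
    unsigned char raw[8];
    in.read(reinterpret_cast<char*>(raw), sizeof raw);
    if (in.gcount() != static_cast<std::streamsize>(sizeof raw))
        throw std::runtime_error("truncated SWF header");
    if ((raw[0] != 'F' && raw[0] != 'C') || raw[1] != 'W' || raw[2] != 'S')
        throw std::runtime_error("not an SWF file");

    swf_header h;
    h.compressed = raw[0] == 'C';
    h.version = raw[3];
    // Same rule as in the example above: compression needs version 6 or later.
    if (h.version < 6 && h.compressed)
        throw std::runtime_error("compressed SWF with version below 6");
    // Assemble the 32-bit length from its bytes, least significant first.
    h.length = static_cast<std::uint32_t>(raw[4])
             | static_cast<std::uint32_t>(raw[5]) << 8
             | static_cast<std::uint32_t>(raw[6]) << 16
             | static_cast<std::uint32_t>(raw[7]) << 24;
    return h;
}
```

The compressed flag returned here is the value that would decide, at runtime, whether a decompression stage is placed in front of the rest of the stream.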

----- Original Message ----- From: "Sebastian Redl" <sebastian.redl@getdesigned.at> To: <boost@lists.boost.org> Sent: Monday, December 03, 2007 11:48 PM Subject: Re: [boost] [rfc] I/O Chain Library, Preview Release
* some common usage examples might be good? i.e. configuration and/or application-specific "document"
I'll see what I can do for the next preview release. Right now I was happy to get a number of essential components working.
Oh for sure!
* is it possible to have runtime-selectable links in the IO chain? e.g. user selection of the encoding to be used.
Yes. Some components directly support runtime selection of parameters (like primitive serialization rules in the assembler_filter, or external coding in the character coder), and you can always compose chains using the erasure, at the cost of a bit of overhead (of course - runtime choices always impose overhead). For example, the SWF format contains a flag in the header for whether the whole file is gzip-compressed or not. You could read that like this:
    file_source rawswf(filename);
    assembly<native_rules, file_source> header_reader(rawswf);
    octet sig[3], version;
    header_reader.read(to(sig));
    if((sig[0] != 'F' && sig[0] != 'C') || sig[1] != 'W' || sig[2] != 'S') {
        throw not_swf();
    }
    bool compressed = sig[0] == 'C';
    header_reader.read(to(version));
    if(unsupported(version)) {
        throw unsupported_swf();
    }
    if(version < 6 && compressed) {
        throw bad_swf();
    }
    uint32_t length;
    header_reader.read(to(length));
    source<iobyte> swfbody(compressed ? chain(rawswf) | zlib() : chain(rawswf));
Now you can just continue reading from swfbody. The compression is transparent. (Aside from the loss of the seeking capability.)
IIUC, the example above appears to be an example of a "header reader" that can be used at runtime to switch code paths inside the application. Cool, but not quite what I was asking.

Imagine you have an application that saves its configuration in a file using the I/O chain library. You decide that the encoding will be binary to discourage users from fiddling. You further decide that users aren't so bad and that an advanced tab will let them switch between binary encoding, XML and others. It might allow for easy exchange of configurations within a user community.

I have been working in this area and have stumbled over this issue a few times, and ended up with crude solutions using "switch".

Cheers

Scott Woods wrote:
IIUC, the example above appears to be an example of a "header reader" that can be used at runtime to switch code paths inside the application.
Yep, that's what it is.
Imagine you have an application that saves its configuration in a file using the I/O chain library. You decide that the encoding will be binary to discourage users from fiddling. You further decide that users aren't so bad and that an advanced tab will let them switch between binary encoding, XML and others. It might allow for easy exchange of configurations within a user community.
That sounds like completely changing the serialization format of the configuration. I don't think my streams are going to help you much there. A binary structured format and XML are completely different things. You can implement both on top of my streams, sure, but beyond that I think you're left with having to replace the entire serializer component. That means switch or virtual functions. Virtuals would be a better solution here, I think. Sebastian
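[Editor's sketch] The "virtual functions" route suggested above could look roughly like this. Everything here is hypothetical and for illustration only: the config type, the class names, the invented length-prefixed binary layout and the bare-bones XML layout (which does not escape special characters). It writes to std::ostream simply to stay self-contained; the same shape would apply on top of any stream library.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical configuration type for this sketch.
using config = std::map<std::string, std::string>;

// Abstract serializer: the application only ever talks to this interface.
class config_serializer {
public:
    virtual ~config_serializer() = default;
    virtual void write(std::ostream& out, const config& cfg) const = 0;
};

// Invented binary layout: each string is a 4-byte little-endian length
// followed by the raw bytes.
class binary_serializer : public config_serializer {
    static void put(std::ostream& out, const std::string& s) {
        const std::uint32_t n = static_cast<std::uint32_t>(s.size());
        for (int shift = 0; shift < 32; shift += 8)
            out.put(static_cast<char>((n >> shift) & 0xFF));
        out << s;
    }

public:
    void write(std::ostream& out, const config& cfg) const override {
        for (const auto& kv : cfg) { put(out, kv.first); put(out, kv.second); }
    }
};

// Minimal XML layout; special characters are not escaped in this sketch.
class xml_serializer : public config_serializer {
public:
    void write(std::ostream& out, const config& cfg) const override {
        out << "<config>";
        for (const auto& kv : cfg)
            out << "<entry key=\"" << kv.first << "\">" << kv.second << "</entry>";
        out << "</config>";
    }
};

// The runtime choice is made once, here; "xml" selects XML, anything else binary.
std::unique_ptr<config_serializer> make_serializer(const std::string& format) {
    if (format == "xml")
        return std::unique_ptr<config_serializer>(new xml_serializer);
    return std::unique_ptr<config_serializer>(new binary_serializer);
}
```

Compared with a "switch" at every write site, the selection is confined to make_serializer, and adding a further format means adding one more derived class.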

----- Original Message ----- From: "Sebastian Redl" <sebastian.redl@getdesigned.at> To: <boost@lists.boost.org> Sent: Thursday, December 06, 2007 10:15 AM Subject: Re: [boost] [rfc] I/O Chain Library, Preview Release
Imagine you have an application that saves its configuration in a file using the I/O chain library. You decide that the encoding will be binary to discourage users from fiddling. You further decide that users aren't so bad and that an advanced tab will let them switch between binary encoding, XML and others. It might allow for easy exchange of configurations within a user community.
That sounds like completely changing the serialization format of the configuration. I don't think my streams are going to help you much there. A binary structured format and XML are completely different things. You can implement both on top of my streams, sure, but beyond that I think you're left with having to replace the entire serializer component. That means switch or virtual functions. Virtuals would be a better solution here, I think.
Yes, that was the best I could do as well. Looking forward to your IO chain rationale.

Hello, 2007/12/1, Sebastian Redl <sebastian.redl@getdesigned.at>:
Hi,
Back in June I posted some thoughts about what I'd expect from a properly done I/O library.[1] Based on the ensuing discussion I then designed such a library. Over the last months, I've worked hard at implementing the thing.
I hope you still continue :). I am very interested in that library.
It is with quite a bit of pride, then, that I announce the IOChain preview release 1.
[...]
All discussion, suggestions, and criticisms are welcome.
1) What features do you plan to add before you can post it for review? Have you considered breaking it up in several parts? I mean do you intend to get a complete iostream replacement with the first version? This is obviously a good goal, but even without all iostream features the library seems to be a good addition to boost.

2) Where is the code? I only see:

    iochain/
    iochain/.gitignore
    iochain/Jamfile
    iochain/html/
    iochain/html/.gitignore
    iochain/html/boostbook.css
    iochain/iochain.qbk

kind regards
Andreas

Andreas Pokorny wrote:
Hello,
Hi. Sorry for not answering for such a long time. I got caught up in a deadline.
1) What features do you plan to add before you can post it for review? Have you considered breaking it up in several parts?
I will very likely break it up in the binary part and the text part. However, I need at least a proof-of-concept implementation of the text part, because I want some components to work in both parts, and the interface I use currently may prove unsuitable for that. The binary part doesn't need many more features, as it is.
I mean do you intend to get a complete iostream replacement with the first version?
Oh no. There's the whole formatting thing that I most definitely won't tackle in the first version.
2) Where is the code? I only see:

    iochain/
    iochain/.gitignore
    iochain/Jamfile
    iochain/html/
    iochain/html/.gitignore
    iochain/html/boostbook.css
    iochain/iochain.qbk
Hmm, that's the contents of the lib subdirectory. There should be a boost subdirectory as well. And there should be more in the lib subdirectory.

Oh no, the tarball really contains only this stuff. Hehe, I really should have looked at that sooner. :-(

Well, whatever. Preview release 2 coming up soon. I just need some testing under Win32.

Sebastian Redl
participants (5)
- Andreas Pokorny
- Hartmut Kaiser
- Mathias Gaunard
- Scott Woods
- Sebastian Redl