Interest in Base2/16/32/64 Encoding Library?

Dear Boost Developers,
I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
I am the author of such a library, hosted at
http://code.google.com/p/stlencoders/
and would be willing to contribute and make the effort of "Boostifying" it, if the Boost community is interested in such a thing.
There have been several suggestions for a Base64 encoding library on this mailing list in the past. Most of them have been dismissed, since Boost already provides a Base64 implementation with Boost.Serialization's "Dataflow Iterators". Though I appreciate the dataflow iterator's design, from personal experience I think that for the basic task of encoding/decoding, a higher-level interface, e.g. one that takes padding and intermittent whitespace (MIME) into account, would benefit many developers.
Currently, the stlencoders library
- implements the Base16, Base32 and Base64 encoding schemes as defined in RFC 4648. Base2, i.e. binary encoding, is also supported.
- implements encoding and decoding operations as generic algorithms that operate on STL-style iterators
- supports different encoding alphabets using custom traits classes
- supports different character types
- lets the user define if/which non-alphabet characters should be ignored via predicates
- provides reasonable performance that matches most "plain C" implementations
stlencoders is currently designed as a stand-alone library with no dependencies on Boost (or anything but C++03); it would therefore take some effort and guidance from the community to properly integrate it with the rest of Boost. I would therefore appreciate any feedback regarding your interest in this.
Kind Regards,
Thomas
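To make the "generic algorithm plus traits class" idea from the feature list concrete, here is a minimal sketch of a Base16 encoder written in the style of std::copy. It is not the stlencoders API: the names base16_upper_traits, encode_base16 and to_hex are hypothetical and chosen only for illustration.

```cpp
#include <iterator>
#include <string>

// Hypothetical traits class: supplies the upper-case Base16 alphabet (RFC 4648).
struct base16_upper_traits {
    static char to_char(unsigned value) {
        return "0123456789ABCDEF"[value & 0x0F];
    }
};

// Hypothetical generic algorithm: reads octets from [first, last) and writes
// two alphabet characters per octet to result, returning the advanced iterator.
template <class Traits, class InputIt, class OutputIt>
OutputIt encode_base16(InputIt first, InputIt last, OutputIt result) {
    for (; first != last; ++first) {
        const unsigned octet = static_cast<unsigned char>(*first);
        *result++ = Traits::to_char(octet >> 4);    // high nibble
        *result++ = Traits::to_char(octet & 0x0F);  // low nibble
    }
    return result;
}

// Convenience wrapper for the common std::string case.
inline std::string to_hex(const std::string& bytes) {
    std::string out;
    out.reserve(bytes.size() * 2);
    encode_base16<base16_upper_traits>(bytes.begin(), bytes.end(),
                                       std::back_inserter(out));
    return out;
}
```

Swapping in a different traits class (for example a lower-case alphabet) would change the output alphabet without touching the algorithm, which is the point of the traits-based design.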

On Jul 6, 2012, at 4:13 PM, Thomas Kemmer wrote:
Dear Boost Developers,
I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
Yes, this could be useful for my RPC library which currently has to implement its own b64 conversion. Should it be part of boost? I think that base 2, and hex encoding of arbitrary binary data has application in all kinds of programs and I have needed such tools at every job I have ever worked. I have often hand-spun them because we didn't want to add YET ANOTHER DEPENDENCY. So adding it to boost would be a good step. I haven't looked at your library in detail, but the idea in principle is sound.
I am the author of such a library, hosted at
http://code.google.com/p/stlencoders/
and would be willing to contribute and make the effort of "Boostifying" it, if the Boost community is interested in such a thing.
There have been several suggestions for a Base64 encoding library on this mailing list in the past. Most of them have been dismissed, since Boost already provides a Base64 implementation with Boost.Serialization's "Dataflow Iterators". Though I appreciate the dataflow iterator's design, from personal experience I think that for the basic task of encoding/decoding, a higher-level interface, e.g. one that takes padding and intermittent whitespace (MIME) into account, would benefit many developers.
Higher level interface is good.
Currently, the stlencoders library
- implements the Base16, Base32 and Base64 encoding schemes as defined in RFC 4648. Base2, i.e. binary encoding, is also supported.
- implements encoding and decoding operations as generic algorithms that operate on STL-style iterators
- supports different encoding alphabets using custom traits classes
- supports different character types
- lets the user define if/which non-alphabet characters should be ignored via predicates
- provides reasonable performance that matches most "plain C" implementations
stlencoders is currently designed as a stand-alone library with no dependencies on Boost (or anything but C++03); it would therefore take some effort and guidance from the community to properly integrate it with the rest of Boost. I would therefore appreciate any feedback regarding your interest in this.
Kind Regards,
Thomas

Daniel Larimer wrote:
On Jul 6, 2012, at 4:13 PM, Thomas Kemmer wrote:
There have been several suggestions for a Base64 encoding library on this mailing list in the past. Most of them have been dismissed, since Boost already provides a Base64 implementation with Boost.Serialization's "Dataflow Iterators". Though I appreciate the dataflow iterator's design, from personal experience I think that for the basic task of encoding/decoding, a higher-level interface, e.g. one that takes padding and intermittent whitespace (MIME) into account, would benefit many developers.
The "Dataflow Iterators" as I conceived them for the serialization library are effectively implemented in Boost.Range. Boost.Range wasn't available when I made the serialization library. No doubt a higher-level interface would be attractive to lots of people and hence would be a good thing.
The real appeal of "Dataflow Iterators" / Boost.Range is as an implementation technique. It shows how simple iterators can be easily composed into more complex ones. So the usage for base64, as well as other uses, serves as an example of the technique. I believe this is a very effective and "Boost-like" way of going about things and would be disappointed not to see it used in your higher-level interface/library. This technique permits doing something like creating an iterator which renders base64 in kanji by composing the dataflow base64 iterator + a roman/kanji iterator adaptor without writing more than a trivial bit of new code. As I said, I would be sorry to see the iterator adaptor composition technique not used.
Having said that, I do recognise that the "iterator adaptor composition" in this case has some problems. The main one is that it's hard to resolve the issue whereby characters are added as padding.
I just checked the code and it's clear you've put a lot of effort into this. So good luck with this.
Robert Ramey
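The padding difficulty mentioned above comes down to simple arithmetic: the number of '=' characters depends on the total input length modulo 3, which a purely element-wise iterator adaptor only learns when it reaches the end of the input. A minimal sketch of that arithmetic follows; the function names are hypothetical and not taken from Boost or stlencoders.

```cpp
#include <cstddef>

// Length of padded Base64 output for n input octets:
// four characters per (possibly partial) group of three octets.
inline std::size_t base64_encoded_size(std::size_t n) {
    return ((n + 2) / 3) * 4;
}

// Number of '=' padding characters needed for n input octets (0, 1 or 2).
inline std::size_t base64_padding(std::size_t n) {
    return (3 - n % 3) % 3;
}
```

For example, one input octet yields four output characters of which two are padding, matching the RFC 4648 test vector "f" -> "Zg==".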

I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
I am the author of such a library, hosted at
http://code.google.com/p/stlencoders/
and would be willing to contribute and make the effort of "Boostifying" it, if the Boost community is interested in such a thing.
it would be great if that could be added as filters to boost.iostreams ... it is listed on this wiki page [1], which seems to be some kind of todo list.
cheers, tim
[1] https://svn.boost.org/trac/boost/wiki/IostreamsFiltersAndDevices

On Sun, Jul 08, 2012 at 10:48:48AM +0200, Tim Blechmann wrote:
I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
it would be great if that could be added as filters to boost.iostreams ... it is listed on this wiki page [1], which seems to be some kind of todo list.
Not only, I hope. I for one do not wish to depend on such a massive library full of things I really do not care about, just to get such fundamental transformations. I'd prefer a freestanding Boost library that Iostreams could be made to trivially depend on to implement its filters. -- Lars Viklund | zao@acc.umu.se

Lars Viklund wrote:
On Sun, Jul 08, 2012 at 10:48:48AM +0200, Tim Blechmann wrote:
I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
it would be great if that could be added as filters to boost.iostreams ... it is listed on this wiki page [1], which seems to be some kind of todo list.
Not only, I hope.
I for one do not wish to depend on such a massive library full of things I really do not care about, just to get such fundamental transformations.
I'd prefer a freestanding Boost library that Iostreams could be made to trivially depend on to implement its filters.
One could implement this as a standard-conforming code_convert facet which could then be attached to any stream. Actually there might be room in Boost for a code_convert facet library based on composed/nested iterator adaptors.
Robert Ramey

On Sun, Jul 08, 2012 at 07:53:56AM -0800, Robert Ramey wrote:
Lars Viklund wrote:
On Sun, Jul 08, 2012 at 10:48:48AM +0200, Tim Blechmann wrote:
I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
it would be great if that could be added as filters to boost.iostreams ... it is listed on this wiki page [1], which seems to be some kind of todo list.
Not only, I hope.
I for one do not wish to depend on such a massive library full of things I really do not care about, just to get such fundamental transformations.
I'd prefer a freestanding Boost library that Iostreams could be made to trivially depend on to implement its filters.
One could implement this as a standard-conforming code_convert facet which could then be attached to any stream. Actually there might be room in Boost for a code_convert facet library based on composed/nested iterator adaptors.
No, you misunderstand me. Why does it have to only be a stream filter or codecvt facet? What's wrong with some nice light low-level free functions or classes, on top of which you can build whatever fancy abstractions you want. Streams and codecvts are fine and all, but very often way too much fluff around a very simple algorithm that shouldn't need such plumbing. -- Lars Viklund | zao@acc.umu.se
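As an illustration of the kind of "nice light low-level free function" argued for above, here is a minimal sketch of a padded Base64 encoder using the RFC 4648 alphabet. It depends only on the standard library; the names octet and base64_encode are hypothetical and do not come from Boost or stlencoders.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: widen one char to an unsigned octet value.
inline unsigned long octet(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical free function: Base64-encode a byte string with '=' padding.
inline std::string base64_encode(const std::string& in) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((in.size() + 2) / 3) * 4);

    std::size_t i = 0;
    // Full groups: three octets become four characters.
    for (; i + 3 <= in.size(); i += 3) {
        const unsigned long v =
            (octet(in[i]) << 16) | (octet(in[i + 1]) << 8) | octet(in[i + 2]);
        out += alphabet[(v >> 18) & 0x3F];
        out += alphabet[(v >> 12) & 0x3F];
        out += alphabet[(v >> 6) & 0x3F];
        out += alphabet[v & 0x3F];
    }

    // Trailing partial group: one or two octets, padded to four characters.
    const std::size_t rest = in.size() - i;
    if (rest == 1) {
        const unsigned long v = octet(in[i]) << 16;
        out += alphabet[(v >> 18) & 0x3F];
        out += alphabet[(v >> 12) & 0x3F];
        out += "==";
    } else if (rest == 2) {
        const unsigned long v = (octet(in[i]) << 16) | (octet(in[i + 1]) << 8);
        out += alphabet[(v >> 18) & 0x3F];
        out += alphabet[(v >> 12) & 0x3F];
        out += alphabet[(v >> 6) & 0x3F];
        out += '=';
    }
    return out;
}
```

A stream filter or codecvt facet could be layered on top of a function like this, rather than the other way around.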

Lars Viklund wrote:
On Sun, Jul 08, 2012 at 07:53:56AM -0800, Robert Ramey wrote:
One could implement this as a standard-conforming code_convert facet which could then be attached to any stream. Actually there might be room in Boost for a code_convert facet library based on composed/nested iterator adaptors.
No, you misunderstand me.
Why does it have to only be a stream filter or codecvt facet? What's wrong with some nice light low-level free functions or classes, on top of which you can build whatever fancy abstractions you want.
Streams and codecvts are fine and all, but very often way too much fluff around a very simple algorithm that shouldn't need such plumbing.
I think we're in agreement here. The "dataflow iterators" of the serialization library are what I always envisioned for this. It's really a "filter construction kit". The filters used by the serialization library (and there are a number of them) are constructed by composing simpler ones to create a higher-level one. So you only need to include enough code to get what you want. I stopped polishing it when it was good enough to build all the filters that the serialization library required. Later Boost.Range came about, which is a more complete implementation of this idea. It always seemed to me that it would be interesting to use this approach to compose code_convert facets to taste. But code_convert facets aren't all that popular, so it's not clear that this would be a worthwhile project.
Robert Ramey

On Sun, Jul 8, 2012 at 10:05 PM, Robert Ramey <ramey@rrsd.com> wrote:
It always seemed to me that it would be interesting to use this approach to compose code_convert facets to taste. But code_convert facets aren't all that popular, so it's not clear that this would be a worthwhile project.
I think a problem with the codecvt approach might be that, although codecvt supports N:M conversion (e.g. 3:4 in the case of Base64), std::basic_filebuf can only use codecvt facets that define a 1:N conversion [1].
From what I read so far, the basic question seems to be whether encoding/decoding should be supported as a "first-class" algorithm (like copy, transform, rotate, etc.) or not. From my experience, the subtleties involved (padding, handling of non-alphabet characters) are easier to handle with the algorithm approach.
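To illustrate how the algorithm approach can absorb these subtleties, here is a minimal sketch of a Base64 decoder that takes a predicate deciding which non-alphabet characters to skip (for example MIME line breaks) and stops at the first padding character. The names are hypothetical rather than the stlencoders interface, the alphabet lookup assumes an ASCII-compatible character set, and the sketch deliberately does not validate that the padding is well-formed.

```cpp
#include <cctype>
#include <iterator>
#include <stdexcept>
#include <string>

// Maps a Base64 alphabet character to its 6-bit value, or -1 if it is not
// part of the alphabet. Assumes an ASCII-compatible character set.
inline int base64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Example predicate: treat whitespace as ignorable.
inline bool is_space(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical generic algorithm: decodes [first, last) to result, skipping
// non-alphabet characters for which skip(c) is true and throwing on others.
template <class InputIt, class OutputIt, class Predicate>
OutputIt decode_base64(InputIt first, InputIt last, OutputIt result,
                       Predicate skip) {
    unsigned long buffer = 0;  // holds bits not yet written out
    int bits = 0;              // number of valid bits in buffer
    for (; first != last; ++first) {
        const char c = *first;
        if (c == '=') break;   // padding marks the end of the data
        const int v = base64_value(c);
        if (v < 0) {
            if (skip(c)) continue;
            throw std::invalid_argument("invalid Base64 character");
        }
        buffer = ((buffer << 6) | static_cast<unsigned long>(v)) & 0xFFFF;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            *result++ = static_cast<char>((buffer >> bits) & 0xFF);
        }
    }
    return result;
}

// Convenience wrapper: decode MIME-style text, ignoring whitespace.
inline std::string from_base64_mime(const std::string& text) {
    std::string out;
    decode_base64(text.begin(), text.end(), std::back_inserter(out), is_space);
    return out;
}
```

Passing a predicate that always returns false would give strict decoding, so the same algorithm covers both the strict and the MIME case.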

Hi, On 07/06/2012 10:13 PM, Thomas Kemmer wrote:
I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
Very nice library.
- provides reasonable performance that matches most "plain C" implementations
I have some code whose performance is limited by base64 encoding, so this interests me particularly. Currently I use OpenSSL's implementation, which I found to be the fastest in my case. So I ran a small benchmark against your library and it performed really well, though a bit slower. I managed to get 566.074 Mb/s with OpenSSL versus 517.928 Mb/s with stlencoders. The benchmark simply consists of loading a large file (~200 Mb) in memory and encoding it into a pre-allocated buffer. I've attached the source. Please tell me if you see anything wrong with it. I use g++ 4.7.1 with -O3.
Best,
-- Maxime
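The attached benchmark source is not reproduced in this thread. As a rough idea of what such a measurement involves, here is a minimal timing sketch using std::chrono (C++11); it is not the attached program, and the helper names mib_per_second and time_once are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Throughput in MiB/s for `bytes` processed in `seconds` (must be positive).
inline double mib_per_second(std::size_t bytes, double seconds) {
    assert(seconds > 0.0);
    return (static_cast<double>(bytes) / (1024.0 * 1024.0)) / seconds;
}

// Calls work() once and returns the elapsed wall-clock time in seconds,
// measured with a monotonic clock.
template <class Work>
double time_once(Work work) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    work();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

In use, one would read the file into memory, pre-allocate an output buffer of ((n + 2) / 3) * 4 bytes for n input bytes, pass the encoder call to time_once as a lambda, and feed the result to mib_per_second. Repeating the run several times and discarding the first helps reduce cache and page-fault noise.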

On Mon, Jul 9, 2012 at 12:40 PM, Maxime van Noppen <maxime@altribe.org> wrote:
I have some code whose performance is limited by base64 encoding, so this interests me particularly. Currently I use OpenSSL's implementation, which I found to be the fastest in my case. So I ran a small benchmark against your library and it performed really well, though a bit slower. I managed to get 566.074 Mb/s with OpenSSL versus 517.928 Mb/s with stlencoders. The benchmark simply consists of loading a large file (~200 Mb) in memory and encoding it into a pre-allocated buffer. I've attached the source. Please tell me if you see anything wrong with it. I use g++ 4.7.1 with -O3.
Thanks for your interest. I don't see anything fundamentally wrong with your code. You may have noticed the src/stlbench.cpp program, which does some basic performance comparison between stlencoders and some other encoding libraries, but not OpenSSL so far. It would be great if you'd give it a try and add OpenSSL support to that! Please send me a patch if you find the time.
Regards,
Thomas

On Fri, Jul 6, 2012 at 9:13 PM, Thomas Kemmer <tkemmer@computer.org> wrote:
Dear Boost Developers,
I would like to know if there is interest in incorporating a generic Base2/16/32/64 encoding library into Boost.
I am the author of such a library, hosted at
http://code.google.com/p/stlencoders/
and would be willing to contribute and make the effort of "Boostifying" it, if the Boost community is interested in such a thing.
There have been several suggestions for a Base64 encoding library on this mailing list in the past. Most of them have been dismissed, since Boost already provides a Base64 implementation with Boost.Serialization's "Dataflow Iterators". Though I appreciate the dataflow iterator's design, from personal experience I think that for the basic task of encoding/decoding, a higher-level interface, e.g. one that takes padding and intermittent whitespace (MIME) into account, would benefit many developers.
Currently, the stlencoders library
- implements the Base16, Base32 and Base64 encoding schemes as defined in RFC 4648. Base2, i.e. binary encoding, is also supported.
- implements encoding and decoding operations as generic algorithms that operate on STL-style iterators
- supports different encoding alphabets using custom traits classes
- supports different character types
- lets the user define if/which non-alphabet characters should be ignored via predicates
- provides reasonable performance that matches most "plain C" implementations
stlencoders is currently designed as a stand-alone library with no dependencies on Boost (or anything but C++03); it would therefore take some effort and guidance from the community to properly integrate it with the rest of Boost. I would therefore appreciate any feedback regarding your interest in this.
Kind Regards,
Thomas
I know this was over a year ago, but I need to do some base64 encoding/decoding and wish Boost offered it at the outer level (as opposed to using internals of a library). Regards, Pete
Participants (7): Daniel Larimer, Lars Viklund, Maxime van Noppen, PB, Robert Ramey, Thomas Kemmer, Tim Blechmann