[serialization] bug in wide character strings

17 Apr 2008

We've discovered an issue in Boost when writing and reading wide character
strings (wchar_t* and std::wstring) through non-wide character file streams
(std::ofstream and std::ifstream).  The issue stems from the fact that
wide characters are written and read as a sequence of narrow characters,
i.e. raw bytes (in text_oarchive_impl.ipp and text_iarchive_impl.ipp,
respectively).  On Windows, a file stream opened in text mode treats the
EOF character (Ctrl-Z, value = 26 decimal, 0x1A) as the end of the file.
Some wide characters have that value as one of their bytes, so reading
that byte terminates the read early.  We have worked around the issue by
deriving our own input and output archives from
text_iarchive_impl<Archive> and text_oarchive_impl<Archive> and
overriding load_override() and save_override() for std::wstring and
wchar_t*.  Our implementation just steps through the wide characters and
writes them one by one as wchar_t to the archive.  This isn't very
elegant, and the result is even less readable in the file than with the
current implementation, but it does resolve the problem.

I've been working with 1.34.1.  I looked at both Boost 1.34.1 and 1.35
and didn't see a difference in the implementation here, so I'm assuming
1.35 still has the issue.  Is this a known issue?  Does 1.35 solve it in
some other subtle way?  Is there a better way that doesn't require us to
derive our own archives?  If not, is there a more elegant way of
implementing the reading and writing of wide characters ourselves?

Thanks in advance.

Jeff Faust
