[serialization] text archives, strings and embedded nulls

Hello,

Currently, output text archives implement string saving like this:

    template<class Archive>
    BOOST_ARCHIVE_DECL(void)
    text_oarchive_impl<Archive>::save(const std::string &s)
    {
        const std::size_t size = s.size();
        *this->This() << size;
        this->This()->newtoken();
        os << s;  // formatted output
    }

This looks fine, except that some stdlib implementations do not handle "os << s" correctly when s contains embedded nulls: output stops at the first embedded null. One such defective implementation is that of MSVC 6.5, and conversations with Linux users lead me to suspect some libstdc++ implementations as well. Besides, the standard requires operator<<(ostream, string) to do a lot more than simply dump the characters of the string (it is a formatted output function and, for instance, pads to the stream's field width), whereas text_iarchive_impl<Archive>::load(std::string &) implicitly assumes a plain dump (reading is done through istream::read).

The attached example shows the problem. It crashes in MSVC 6.5; my hunch is that it will fail in some other environments as well, though not in every one. Using intermediate buffers of type vector<char> fixes the problem.

My suggestion is that text_oarchive_impl<Archive>::save(const std::string &s) be changed to dump the string as follows, so as to avoid problems with embedded nulls:

    template<class Archive>
    BOOST_ARCHIVE_DECL(void)
    text_oarchive_impl<Archive>::save(const std::string &s)
    {
        const std::size_t size = s.size();
        *this->This() << size;
        this->This()->newtoken();
        os.write(s.data(), s.size());  // unformatted output of exactly size characters
    }

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo
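For readers without the attachment, the idea behind the proposal can be sketched outside the library. This is only an illustration, not the Boost.Serialization code: save_string, load_string and round_trip are hypothetical helper names, and the "size, one space, raw characters" layout is an assumption made for the sketch. The point is that ostream::write and istream::read transfer exactly the requested number of characters, so an embedded null is treated like any other character.

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Boost): writes "<size> <raw characters>".
// The payload goes through unformatted output, so embedded nulls are kept.
void save_string(std::ostream& os, const std::string& s)
{
    os << s.size() << ' ';
    os.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Hypothetical counterpart: reads the size, skips the single separator,
// then reads exactly that many characters with istream::read.
bool load_string(std::istream& is, std::string& s)
{
    std::size_t size = 0;
    if (!(is >> size)) return false;
    is.get();  // consume the one separator character written by save_string
    s.resize(size);
    if (size != 0) is.read(&s[0], static_cast<std::streamsize>(size));
    return !is.fail();
}

// Saves a string into an in-memory stream and loads it back.
std::string round_trip(const std::string& in)
{
    std::stringstream ss;
    save_string(ss, in);
    std::string out;
    load_string(ss, out);
    return out;
}
```

With these helpers, a string built as std::string("ab\0cd", 5) should come back from round_trip with all five characters, whatever the formatted operator<< of a given stdlib does with embedded nulls.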

Hmmm - a worthy suggestion. I suppose it would need some adjustment for wstring as well.

While you're at it, consider XML text, which doesn't permit nulls in it at all. Another pain to be addressed. But if one is concerned about nulls, why stop there - consider all control characters - then one has to think about codecvt facets, etc.

Robert Ramey
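To make the XML remark concrete: XML 1.0 allows only tab (0x09), line feed (0x0A) and carriage return (0x0D) among the characters below 0x20, so a null can never appear in an XML text archive as-is. A minimal sketch of a check follows. The function name first_xml_forbidden_byte is hypothetical and not part of any library, and the sketch looks at single bytes only, so it says nothing about multi-byte encodings or the codecvt issues mentioned above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first byte that XML 1.0
// does not allow as a character (a control character other than tab, LF
// or CR), or std::string::npos if there is none. Bytes >= 0x20 are not
// inspected, so this is not a full validity check for encoded text.
std::size_t first_xml_forbidden_byte(const std::string& s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        const bool allowed_control = (c == 0x09 || c == 0x0A || c == 0x0D);
        if (c < 0x20 && !allowed_control) return i;
    }
    return std::string::npos;
}
```

An archive could use a check of this kind to decide whether a string can be written verbatim or has to be escaped or rejected; which of those is appropriate is a design decision this sketch leaves open.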

Robert Ramey wrote:
> Hmmm - a worthy suggestion. I suppose it would need some adjustment for wstring as well. While you're at it, consider xml text which doesn't permit null in it at all. Another pain to be addressed. But if one is concerned about NULL's, why stop there - consider all control characters - then one has to think about codecvt facet. etc.
Yep, I acknowledge my suggestion deals with just a tiny bit of a much bigger problem, but at least it is a step in the right direction, IMHO.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo