
I'd like to go back to Chuck's original posting:
Inspection of the code reveals the following is happening (both in 1.34 and HEAD): text_iarchive_impl<Archive>::load(std::string &s) { std::size_t size; * this->This() >> size; // skip separating space is.get(); // borland de-allocator fixup #if BOOST_WORKAROUND(_RWSTD_VER, BOOST_TESTED_AT(20101)) if(NULL != s.data()) #endif s.resize(size); is.read(const_cast<char *>(s.data()), size); // <== This looks suspect }
Inspection of the g++ string implementation (latest 4.1.x) shows that resize() will *almost* always create a new rep and thus a new data() for the string, if it is shared. The only exception is when the requested size == s.size(). Chuck, was this the bug that hit you? Seems to me, since you are going to rewrite your string anyway, you might as well clear() it first, then resize(size). That'll take care of creating a new representation. The problem with that is that it will write over the string twice, first filling it with 0s then with the read data. But think about it: if the string is shared, you need to construct a new representation with the requested size anyway, and it will have to be filled with 0s. Still, suppose you wanted to avoid going over your string twice, what you'd need is to create a string with an unitialized buffer and start writing into it directly. There's only one way I can think of to achieve that: have an input iterator from your "is" stream, such as std::istream_iterator<char>(is), and use s.assign(is_begin, is_end), assuming you can define and end that will stop reading after size chars (don't even know how to do that with istream_iterator, and there is no assign(int, InputIterator) method in string, sadly). In any case, however, you then lose the benefit of the single is.read() call, because the iterator is going to do ++iter and read character by character. If your stream is buffered, there shouldn't much of an impact, but then you have a double copy from stream buffer to string buffer anyway, which is akin to going over your string twice. In short, I don't think you can avoid the cost of going over your string twice, so I can't see anything better than: text_iarchive_impl<Archive>::load(std::string &s) { std::size_t size; * this->This() >> size; // skip separating space is.get(); // borland de-allocator fixup #if BOOST_WORKAROUND(_RWSTD_VER, BOOST_TESTED_AT(20101)) if(NULL != s.data()) #endif + s.clear(); s.resize(size); is.read(const_cast<char *>(s.data()), size); // <== This looks suspect } -- Hervé Brönnimann hervebronnimann@mac.com